An established ingredient in the security evaluation of cryptographic devices is leakage detection, whereby physically observable characteristics such as the power consumption are measured during operation and statistically analysed in search of sensitive data dependencies. However, depending on its precise execution, this approach potentially suffers from several drawbacks: a risk of false positives, a difficulty in interpreting negative outcomes, and the infeasibility of covering every possible eventuality. Moreover, efforts to mitigate these drawbacks can be costly with respect to the data complexity of the testing procedures. In this work, we clarify the (varying) goals of leakage detection and assess how well geared current practice is towards meeting each of those goals. We introduce some innovations on existing methodologies and make recommendations for best practice. Ultimately, though, we find that many of the obstacles cannot be fully overcome with existing statistical procedures, so that it remains necessary to be highly cautious and to clearly state the limitations of the methods used when reporting outcomes.