To perform electrical discharge tests, I use a programmable load and an Arduino DAQ circuit that records temperature and voltage information.
I also tear down batteries before and after the failure test to assess their construction quality and get the 18650 cell manufacturing information when available.
Then I use this information to score the batteries relative to each other on honesty, durability, capacity, power, and safety.
Finally, I assign an S/A/B/C/F tier score based on their category ratings and my assessment.

Honesty:
A measurement of how much the seller is lying about their pack capacity. If the measured capacity is the same as or greater than their listing, the pack gets 5 stars. Otherwise the score is reduced in proportion to the lie.
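
As a rough sketch of the honesty rule, assuming "in proportion to the lie" means the score scales linearly with the ratio of measured to advertised capacity (the function name and the linear scaling are my illustration, not a fixed formula):

```python
def honesty_stars(measured_capacity, advertised_capacity, max_stars=5.0):
    """Hypothetical honesty score: full marks if the pack meets its listing,
    otherwise scaled by measured/advertised (an assumed linear penalty)."""
    if advertised_capacity <= 0:
        raise ValueError("advertised capacity must be positive")
    # Cap the ratio at 1.0 so exceeding the listing does not earn extra stars.
    ratio = min(measured_capacity / advertised_capacity, 1.0)
    return max_stars * ratio
```

Under this assumption, a pack that delivers half of its listed capacity would land at 2.5 stars. Both capacities must be in the same unit (Ah or Wh).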
Durability:
A measurement of physical build quality and electrical durability under strenuous use.
1/2 of the score comes from the teardown: is the electrical routing robust? Are the tool contacts flimsy and prone to failure? Are there red flags?
1/2 of the score comes from how many cycles of the Performance Test the battery survived, relative to the rest of the batteries tested.
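
The half-and-half split could be sketched like this. It assumes the teardown is hand-scored on a 0 to 5 star scale and that "relative to the rest" means dividing by the best cycle count among all tested batteries; both are illustrative assumptions, and the function name is hypothetical.

```python
def durability_stars(teardown_stars, cycles, all_cycles, max_stars=5.0):
    """Hypothetical durability score: half teardown, half cycle survival.

    teardown_stars: hand-assigned score (assumed 0..max_stars).
    cycles: Performance Test cycles this battery survived.
    all_cycles: cycle counts for every tested battery, including this one.
    """
    best = max(all_cycles)
    # Assumed normalization: the longest-lived battery gets full cycle marks.
    cycle_stars = max_stars * cycles / best if best > 0 else 0.0
    return 0.5 * teardown_stars + 0.5 * cycle_stars
```

For example, a pack with a 4-star teardown that survived 10 cycles when the best pack survived 20 would score 0.5 * 4 + 0.5 * 2.5 = 3.25 stars under these assumptions.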
Capacity:
The measured batteries are partitioned into 4 equal-width bins based on their capacity in Wh. Scores of 2, 3, 4, and 5 stars are assigned based on how they compare to the other batteries.
No batteries with abysmal / unusable capacity have been tested yet.
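
A minimal sketch of the equal-width binning, assuming the bins span the range from the smallest to the largest measured capacity and that the lowest bin maps to 2 stars and the highest to 5 (the function name and the edge handling are my assumptions):

```python
def capacity_stars(capacity_wh, all_capacities_wh):
    """Hypothetical capacity score from 4 equal-width bins over the tested range."""
    lo, hi = min(all_capacities_wh), max(all_capacities_wh)
    if hi == lo:
        # Only one distinct capacity measured; arbitrary choice for this sketch.
        return 5
    width = (hi - lo) / 4
    index = int((capacity_wh - lo) / width)
    # Clamp so the maximum capacity falls in the top bin rather than a 5th one.
    index = max(0, min(index, 3))
    return 2 + index
```

With measured capacities of 40, 60, 80, and 120 Wh, the bins are 20 Wh wide and those packs would score 2, 3, 4, and 5 stars respectively.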
Power:
A score that reflects whether the battery can be used to run circular saws, chainsaws, etc. It is based on the peak sustained power reported from the failure test and the number of cycles survived in the performance test. Batteries that cannot achieve 800W output in the failure test, or that survive 5 or fewer cycles in the performance test, are given 0.5 stars: 800W output is required for high-power tools.
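
The 0.5-star floor can be expressed as a simple check. This sketch only covers the floor: how the score is computed above the floor is not spelled out here, so the hypothetical function below returns `None` in that case rather than guessing.

```python
MIN_PEAK_POWER_W = 800      # required for high-power tools
MIN_CYCLES_EXCLUSIVE = 5    # 5 or fewer performance-test cycles fails

def power_floor_stars(peak_sustained_w, performance_cycles):
    """Return 0.5 if the battery fails either power requirement, else None.

    None means "passes the floor"; the actual score above the floor is
    deliberately left out of this sketch.
    """
    if peak_sustained_w < MIN_PEAK_POWER_W:
        return 0.5
    if performance_cycles <= MIN_CYCLES_EXCLUSIVE:
        return 0.5
    return None
```

Failing either condition is enough: a pack that reaches 900W but survives only 5 cycles still gets 0.5 stars.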
Safety:
Hand-scored. I look for a few details in determining pack safety:
- Sketchy 18650 cells: unlabeled, aftermarket, or recycled cells get bad scores in safety
- Smoke during the cycling test: under normal use, packs should shut off (based on their thermistor values) before they start smoking. If a pack starts melting or smoking during the cycling test, or a cell vents, that's a red flag
- Bad electrical construction: if I deem the pack is at risk of shorting, melting something internally, or behaving unexpectedly, it scores lower
- Presence of engineered safety features: Waitley has a reliable high-current fuse, which is a nice touch not present in other batteries, and is what distinguishes a 5-star rating from a 4.5-star rating
