In binary decision diagram (BDD)-based fault tree analysis, the size of the BDD that encodes a fault tree depends heavily on the chosen variable ordering. Heuristics are often used to obtain good orderings. The most important heuristics are depth-first leftmost (DFLM) and its variants, weighting DFLM (WDFLM) and repeated-event-priority DFLM (RDFLM). Although these heuristics are widely used, their performance is still only vaguely understood, and little formal work has been done on them. This article first identifies some basic requirements for a reliable benchmark and gives a benchmark generation method. Then, using the generated benchmark, it studies the performance of DFLM and its variants, presenting both the experimental results and the findings for the research questions posed. This article also presents a new weighting DFLM (NWDFLM) heuristic together with its underlying ideas, and reports the experimental results and conclusions of the performance comparison. As a final synthesis of all previous results, the practical suggestion for the order of heuristic selection when processing a large fault tree is NWDFLM < WDFLM < RDFLM. Copyright © 2012 John Wiley & Sons, Ltd.
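For readers unfamiliar with DFLM, the following minimal sketch illustrates the basic idea as it is commonly described: the fault tree is traversed depth-first from the top event, visiting the inputs of each gate from left to right, and each basic event is placed in the ordering the first time it is encountered. The `Gate` class and the `dflm_order` function are hypothetical names introduced here for illustration and are not taken from the article; the WDFLM, RDFLM, and NWDFLM variants are not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:
    """Illustrative fault tree gate (an assumption, not the article's data model)."""
    name: str
    op: str  # e.g. "AND" or "OR"; the operator does not affect the DFLM ordering
    children: list = field(default_factory=list)  # Gate objects or basic-event names (str)


def dflm_order(top):
    """Return basic events in depth-first, leftmost-first order of first visit."""
    order = []             # resulting variable ordering
    seen_events = set()    # basic events that already have a position
    visited_gates = set()  # shared gates are expanded only once

    def visit(node):
        if isinstance(node, str):  # basic event
            if node not in seen_events:
                seen_events.add(node)
                order.append(node)
            return
        if id(node) in visited_gates:
            return
        visited_gates.add(id(node))
        for child in node.children:  # left to right
            visit(child)

    visit(top)
    return order


# Small example: TOP = (a AND b) OR (c AND a) OR d, where "a" is a repeated event.
g1 = Gate("G1", "AND", ["a", "b"])
g2 = Gate("G2", "AND", ["c", "a"])
top = Gate("TOP", "OR", [g1, g2, "d"])
order = dflm_order(top)
print(order)  # ['a', 'b', 'c', 'd']
```

In this example the repeated event `a` keeps the position of its first occurrence, so the second occurrence under `G2` does not change the ordering.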