Because DSM-IV cocaine dependence (CD) is heterogeneous, it is not an optimal phenotype for identifying genetic variation that contributes to risk for cocaine use and related behaviors (CRBs). We used a cluster analytic method to differentiate homogeneous, highly heritable subtypes of CRBs and to compare their utility with that of the DSM-IV CD diagnosis as traits for genetic association analysis. Clinical features of CRBs and co-occurring disorders were obtained via a poly-diagnostic interview administered to 9,965 participants in genetic studies of substance dependence. A subsample of subjects (N = 3,443) was genotyped for 1,350 single nucleotide polymorphisms (SNPs) selected from 130 candidate genes related to addiction. Cluster analysis of clinical features of the sample yielded five subgroups, two of which were characterized by heavy cocaine use and high heritability: a heavy cocaine use, infrequent intravenous injection group and an early-onset, heavy cocaine use, high comorbidity group. The utility of these traits was compared with that of the CD diagnosis through association testing of 2,320 affected subjects and 480 cocaine-exposed controls. Analyses examined both single SNP (main) and SNP–SNP interaction (epistatic) effects, separately for African-Americans and European-Americans. The two derived subtypes showed more significant P values than the CD diagnosis for 6 of 8 main effects and 7 of 8 epistatic effects. Variants in the CLOCK gene were significantly associated with the heavy cocaine use, infrequent intravenous injection group, but not with the DSM-IV diagnosis of CD. These results support the utility of subtypes based on CRBs to detect risk variants for cocaine addiction. © 2013 Wiley Periodicals, Inc.