We have examined basal and luminal cell cytokeratin expression in 1944 cases of invasive breast carcinoma, using tissue microarray (TMA) technology, to determine the frequency of expression of each cytokeratin subtype, the relationships between them and their prognostic relevance, if any. Expression was determined by immunocytochemical staining using antibodies to the luminal cytokeratins (CKs) 7/8, 18 and 19 and the basal markers CK 5/6 and CK 14. Additionally, α-smooth muscle actin (SMA) expression and oestrogen receptor (ER) status were assessed. The vast majority of cases showed positivity for CK 7/8, 18 and 19, indicating a differentiated glandular phenotype, a finding associated with good prognosis, ER positivity and older patient age. In contrast, basal marker expression was significantly related to poor prognosis, ER negativity and younger patient age. Multivariate analysis showed that CK 5/6 was an independent indicator of relapse-free interval. We were able to subgroup the cases into four distinct phenotype categories (pure luminal, mixed luminal/basal, pure basal and null), which differed significantly in their biological features and in the clinical course of the disease. Tumours classified as expressing a basal phenotype (the mixed luminal/basal and pure basal categories) formed a poor prognostic subgroup and were typically ER negative. These findings provide further evidence that breast cancer has distinct differentiation subclasses that have both biological and clinical relevance. Copyright © 2004 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.