Canonical Kaiso target genes define a functional signature that associates with breast cancer survival and the invasive lobular carcinoma histological type

Invasive lobular carcinoma (ILC) is a low‐ to intermediate‐grade histological breast cancer type caused by mutational inactivation of E‐cadherin function, resulting in the acquisition of anchorage independence (anoikis resistance). Most ILC cases express estrogen receptors, but options are limited in relapsed endocrine‐refractory disease as ILC tends to be less responsive to standard chemotherapy. Moreover, ILC can relapse after >15 years, an event that currently cannot be predicted. E‐cadherin inactivation leads to p120‐catenin‐dependent relief of the transcriptional repressor Kaiso (ZBTB33) and activation of canonical Kaiso target genes. Here, we examined whether an anchorage‐independent and ILC‐specific transcriptional program correlated with clinical parameters in breast cancer. Based on the presence of a canonical Kaiso‐binding consensus sequence (cKBS) in the promoters of genes that are upregulated under anchorage‐independent conditions, we defined an ILC‐specific anoikis resistance transcriptome (ART). Converting the ART genes into human orthologs and adding published Kaiso target genes resulted in the Kaiso‐specific ART (KART) 33‐gene signature, used subsequently to study correlations with histological and clinical variables in primary breast cancer. Using publicly available data for ER‐positive/HER2‐negative breast cancer, we found that expression of KART was positively associated with the histological ILC breast cancer type (p < 2.7E‐07). KART expression associated with younger patients in all invasive breast cancers and smaller tumors in invasive ductal carcinoma of no special type (IDC‐NST) (<2 cm, p < 6.3E‐10). We observed associations with favorable long‐term prognosis in both ILC (hazard ratio [HR] = 0.51, 95% CI = 0.29–0.91, p < 3.4E‐02) and IDC‐NST (HR = 0.79, 95% CI = 0.66–0.93, p < 1.2E‐04). Our analysis thus defines a new mRNA expression signature for human breast cancer based on canonical Kaiso target genes that are upregulated in E‐cadherin deficient ILC. 
The KART signature may enable a deeper understanding of ILC biology and etiology. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.


Introduction
Mutational inactivation of E-cadherin and a subsequent dismantling of the cellular adhesive structure called the adherens junction (AJ) causes the development and progression of invasive lobular carcinoma (ILC) (reviewed in [1]). Due to the functional inactivation of E-cadherin-based cell-cell contacts, ILC cells invade diffusely as noncohesive cells in trabecular structures or in a single-file pattern [2]. Recent evidence has established that E-cadherin inactivation leads to increased and concomitant activation of PI3K/AKT signals independently of activating mutations [3–5], which provides a mechanism for the previously observed AKT phosphorylation in lobular breast cancers [6]. In addition, E-cadherin loss leads to direct translocation of p120-catenin (p120) from the cell membrane to the cytoplasm and nucleus [7–9], a process that is further promoted by loss of cell-matrix contacts [10]. In healthy cells, the main functions of p120 include providing stability for all classical cadherin family members, including E-cadherin [11,12], and spatial regulation of RhoA-dependent actomyosin contraction at the cleavage furrow during cytokinesis [13]. Importantly, p120 can also bind Kaiso (ZBTB33), a bimodal regulator of transcription, and relieve its transcriptional repression [14]. Kaiso can repress mRNA transcription through binding to the canonical Kaiso-binding sequence (cKBS) TCCTGCNA or modulate a noncanonical function through the CGCG-containing consensus KBS TCTCGCGAGA [15,16].
The ability of p120 to relieve canonical Kaiso-dependent transcriptional repression relies on nuclear translocation of p120 through a conserved nuclear localization sequence (NLS) [17,18]. Canonical p120-dependent KBS targets such as Siamois [19], Wnt11 [20], and Cyclin D1 [19,21] have been associated with tissue homeostasis in frogs, flies, and/or mammalian cells. Although the exact mechanism remains unclear, nuclear influx of p120 increases upon transfer of cells to anchorage-independent culture conditions [10]. Under these conditions, the cKBS target Wnt11 is transcribed as a direct consequence of E-cadherin loss and Kaiso de-repression by p120, to subsequently promote autocrine RhoA-dependent anoikis resistance in lobular breast cancer cells [10].
Although the histological subtypes of ILC and invasive ductal carcinoma of no special type (IDC-NST) display distinct molecular landscapes [6,22,23], this has not yet evolved into specific classifiers that can contribute to the development of a tailored clinical intervention for ILC. Here, we present a functional and experimentally derived transcriptional signature based on 33 canonical Kaiso target genes (KART) that associates with the lobular histological subtype. We show that high KART signatures associate with a favorable prognosis in breast cancer, irrespective of the histological type.

ERBB3 promoter analysis
Promoter analysis of human ERBB3 was performed using the Eukaryotic Promoter Database (EPD) [24], with evaluation of the 1,000 upstream base pairs of the gene and exploration of the ZBTB33 (Kaiso)-binding sites (full KBS sequence TCCTGCNA and core KBS sequence CTGCNA), and plotted using Adobe Illustrator (Adobe Inc, San Jose, CA, USA). Promoter analysis of the mouse Erbb3 gene was performed using the NCBI Genome Data Viewer (NCBI, Bethesda, MD, USA) [25].
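The motif search described above can be illustrated with a minimal Python sketch. It is not the EPD tool used in the study; it only assumes that the full (TCCTGCNA) and core (CTGCNA) sequences are matched literally, with N standing for any base, on both strands of a sequence lying immediately upstream of the transcription start site. The function names and the toy sequence are hypothetical.

```python
import re

# Motifs as given in the text; N matches any base.
FULL_CKBS = "TCCTGCNA"
CORE_CKBS = "CTGCNA"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Turn a motif with N wildcards into a regex that reports overlapping hits."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")


def find_sites(upstream, motif):
    """Scan both strands of an upstream region for a motif.

    `upstream` is written 5'->3' and ends directly before the TSS, so its
    last base is position -1. Returns sorted (position, strand, match)
    tuples, where position is the leftmost forward-strand base of the hit.
    """
    upstream = upstream.upper()
    n = len(upstream)
    pattern = motif_to_regex(motif)
    hits = []
    for m in pattern.finditer(upstream):
        hits.append((m.start() - n, "+", m.group(1)))
    for m in pattern.finditer(reverse_complement(upstream)):
        # Map the reverse-strand start back to forward-strand coordinates.
        forward_start = n - m.start() - len(motif)
        hits.append((forward_start - n, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    toy = "AAAATCCTGCGAGGGG"  # made-up sequence, not the ERBB3 promoter
    print(find_sites(toy, FULL_CKBS))
    print(find_sites(toy, CORE_CKBS))
```

For a real promoter, `upstream` would be the 1,000 bp retrieved from a promoter database, and each reported position could be compared with the plotted sites.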

Cell culture
MCF7 cells were obtained from the American Type Culture Collection (ATCC, Manassas, VA, USA), STR-type verified by PCR, and cultured as described previously [9]. Generation of E-cadherin knockout MCF7::ΔCDH1 cells using CRISPR-Cas9 editing has been described previously [5]. Designed primer pairs are shown in supplementary material, Table S1.

CRISPR/Cas9-mediated knockout of Kaiso (ZBTB33)
Baculoviruses expressing Cas9 were produced as described previously [28]. Two guide (g)RNAs were tested in MCF7 cells, and after tracking of indels by decomposition analysis [29] the cells with the highest gRNA efficiency were clonally expanded and screened for Kaiso expression using western blotting.

RT-qPCR
Total RNA was extracted from cell or organoid pellets using Trizol (15596026, Thermo Fisher Scientific, Bleiswijk, The Netherlands). Poly-T primers and a cDNA reverse transcription kit (iScript Synthesis kit, 1708890, BioRad, Lunteren, The Netherlands) were used to generate cDNA. Two ERBB3-specific primer sets (supplementary material, Table S1) were used to evaluate expression values for human ERBB3 using PCR. Primer efficiency was assessed by serial dilution. Expression values were generated using ΔΔCt values normalized to GAPDH for control and ERBB3. Experiments were performed in triplicate over three independent biological and technical settings using the BioRad CFX96 Real-Time System (Bio-Rad Laboratories, Hercules, CA, USA) and BioRad CFX manager software (Bio-Rad Laboratories). For each comparison, unpaired two-tailed Student's t-tests were used to determine statistical significance.
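As an illustration of the ΔΔCt normalization mentioned above, the following Python sketch computes a relative expression value as 2^(−ΔΔCt). It assumes roughly 100% primer efficiency and a single reference gene (GAPDH in this study); the function names and Ct values are hypothetical and are not data from the paper.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Mean target Ct minus mean reference-gene Ct for one condition."""
    return mean(ct_target) - mean(ct_reference)


def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of the test condition versus control, 2^(-ddCt).

    Assumes the amount of product doubles every cycle (about 100% primer
    efficiency); an efficiency-corrected model would replace the base 2.
    """
    ddct = delta_ct(ct_target_test, ct_ref_test) - delta_ct(ct_target_ctrl, ct_ref_ctrl)
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Made-up technical triplicates: target gene and reference gene Ct values.
    knockout_target, knockout_ref = [24.0, 24.0, 24.0], [18.0, 18.0, 18.0]
    control_target, control_ref = [26.0, 26.0, 26.0], [18.0, 18.0, 18.0]
    print(fold_change(knockout_target, knockout_ref, control_target, control_ref))
```

A lower Ct in the test condition at an unchanged reference Ct therefore translates into a fold change above 1.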

Bioinformatics analysis of patient data
The METABRIC [30] dataset, including clinical data and normalized gene expression, and the clinical data of The Cancer Genome Atlas (TCGA) dataset [31] were retrieved through cBioPortal [32] in August 2022. The TCGA normalized expression data were retrieved from the GDC data portal (https://gdc.cancer.gov/) in September 2021. Breast cancer types were assessed according to ER immunohistochemistry and ERBB2/HER2 immunohistochemistry or FISH status. Only patients with ER-positive/HER2-negative IDC-NST and ILC breast cancers were kept in the downstream analysis. This resulted in data from 227 IDC-NST and 98 ILC patients for TCGA and data from 979 IDC-NST and 118 ILC patients in METABRIC. Gene expression signatures were retrieved from the literature (ESR1_signature, AURKA, PLAU, STAT1 [33]; DCN [34]; gene expression grade index (GGI) [35]; SDPP [36]; Immune_Perez [37]; IRM [38]; immune cell signatures [39]; GENE21 [40]; LobSig [41]) and computed as described in [42]. Wilcoxon tests were performed to compare continuous to categorical variables. Correlations were assessed using Spearman coefficients. In the heat maps, only significant correlations are colored: red, anticorrelated; blue, correlated. Associations with disease-free survival (DFS) and overall survival (OS) were assessed using stratified log-rank test and Cox proportional hazard regression. Multivariable Cox models were adjusted for age (>50 versus ≤50), tumor size (≥2 versus <2), and nodal status (positive versus negative). The follow-up was curtailed at 8 and 20 years for the TCGA and METABRIC datasets, respectively, in consideration of the declining numbers of patients after that time point. P values were two-sided and statistical significance considered for p < 0.05. All analyses were performed using R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria) [43].
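To make the notion of a per-patient signature value and a median-based grouping concrete, here is a minimal Python sketch. The exact scoring procedure is the one described in [42] and is not reproduced here; the sketch simply assumes that a score is the mean of per-gene z-scores across the signature genes and that patients are split into "high" and "low" groups around the median. The study itself used R; the function names and toy values are hypothetical.

```python
from statistics import mean, median, pstdev


def zscores(values):
    """Standardize one gene's expression values across samples."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def signature_scores(expression, genes):
    """Mean z-score over the signature genes, one value per sample.

    `expression` maps gene symbol -> list of values (same sample order).
    Genes absent from the matrix are skipped. This averaging rule is an
    assumption for illustration, not the published method.
    """
    present = [g for g in genes if g in expression]
    if not present:
        raise ValueError("none of the signature genes are in the matrix")
    standardized = [zscores(expression[g]) for g in present]
    n_samples = len(standardized[0])
    return [mean(col[i] for col in standardized) for i in range(n_samples)]


def median_split(scores):
    """Label each sample 'high' or 'low' relative to the median score."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


if __name__ == "__main__":
    toy = {"ERBB3": [1.0, 2.0, 3.0, 4.0], "ID2": [2.0, 4.0, 6.0, 8.0]}
    scores = signature_scores(toy, ["ERBB3", "ID2", "MYC"])
    print(scores, median_split(scores))
```

The resulting group labels are the kind of categorical variable that could then enter a log-rank test or a Cox model.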

Defining an anchorage-independent transcriptome driven by E-cadherin loss and canonical Kaiso target genes
We have used mouse ILC cells to couple anchorage independence upon loss of E-cadherin to the activation of a gene set in anchorage-independent (suspension) conditions [10,44]. Because mutational E-cadherin inactivation leads to a p120-dependent activation of Kaiso target genes (Figure 1A), we had used this gene set to define a set of 27 candidate mouse Kaiso targets based on the presence of a cKBS within a region 1 kb upstream of the transcriptional start site (TSS) [10]. These mouse ILC candidate Kaiso targets [10] were converted to human orthologs, and the presence of cKBS sites in these promoters was confirmed using the EPD (SIB) [24] (Table 1).
Because ERBB3 activation recently was associated with ILC [23] and the ERBB3 promoter contains a cKBS at position -904 (Figure 1B), we next verified whether ERBB3 was a cKBS Kaiso target gene. For this we performed chromatin immunoprecipitations (ChIP) with monoclonal Kaiso antibodies, followed by an ERBB3 promoter-specific PCR, which demonstrated that ERBB3 was a bona fide canonical Kaiso target gene in E-cadherin-negative (MDA-MB-231) and E-cadherin-positive (MCF7) cells (Figure 1C). Next, we generated CRISPR/Cas9-mediated MCF7 Kaiso knockout cells (MCF7::ΔKaiso) using primers described in supplementary material, Table S1 (Figure 1D), and observed that Kaiso knockout induced a sixfold increase in ERBB3 mRNA expression compared to control cells (Figure 1E). Moreover, E-cadherin loss and Kaiso knockout increased ERBB3 protein expression by 2.1- and 5.0-fold, respectively (Figure 1F). In sum, we have identified ERBB3 as a canonical Kaiso target, which we added to the ART signature (Table 1).

Expression of KART signature associates with histological subtype ILC and clinicopathological parameters in breast cancer
We next analyzed expression of the KART signature in the publicly available ER-positive/HER2-negative mRNA expression datasets from TCGA (IDC-NST: n = 227 and ILC: n = 98) and METABRIC (IDC-NST: n = 979 and ILC: n = 118) to probe for possible associations with clinicopathological parameters. We found that the KART signature was positively associated with the ILC subtype in both the TCGA (p = 1.1E-11) and METABRIC (p = 2.7E-07) datasets (Figure 2A). Interestingly, KART was also associated with younger age at diagnosis (<50 years), both in patients with ILC (p = 8.3E-02 and p = 1.8E-02 in TCGA and METABRIC, respectively) and IDC-NST (p = 1.1E-02 and p = 5.5E-08 in TCGA and METABRIC, respectively) (Figure 2B,C), and with smaller tumors in IDC-NST only (p = 2.1E-02 and p = 6.3E-10 in TCGA and METABRIC, respectively) (Figure 2B,C). In METABRIC, KART positively associated with low-grade tumors in both the ILC and IDC-NST cohorts (p = 9.8E-03 and p = 6.7E-05) (Figure 2C). In short, we show that the KART signature associates with the histological subtype ILC, younger patients, lower-grade tumors in all breast cancers, and smaller tumor size in the histological group IDC-NST.
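The group comparisons above contrast a continuous score between two independent patient groups. As a hedged illustration, the sketch below implements a two-sided Wilcoxon rank-sum (Mann-Whitney) test with a normal approximation and tie correction in plain Python. Treating the comparison as a rank-sum test for independent groups is an assumption (the figure legends name the signed-rank test), no continuity correction is applied, and the study itself used R. Function names and toy values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic for x, two-sided p value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    n = n1 + n2
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Variance of U with the standard correction for tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    idc_scores = [0.1, 0.2, 0.3]  # made-up signature scores
    ilc_scores = [0.4, 0.5, 0.6]
    print(rank_sum_test(idc_scores, ilc_scores))
```

With cohorts of the size analyzed here the normal approximation is usually adequate; for very small groups an exact test would be preferable.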
A Kaiso signature for lobular breast cancer

Individual contributions of KART signature genes to clinical associations
To investigate how much individual KART genes contribute to breast cancer histological subtypes, we analyzed the expression of the separate genes associated with ductal or lobular diagnosis in the TCGA breast cancer dataset. From the nine most abundantly expressed KART genes, seven individual genes (TACSTD2, ERBB3, FOS, MYC, ID2, ALDH3B2, and CAMK2N1) are significantly associated with ILC (p ≤ 0.05), while expression levels of CCND1 and TMEM176B have no specific association with either IDC-NST or ILC (p ≥ 0.05) (Figure 3). In total, there are 22 upregulated genes within KART that contribute to the specific association with the histological subtype ILC (Figure 3 and supplementary material, Figure S2).

KART expression correlates with stromal and immune expression signatures in breast cancer
Next, we investigated whether the KART signature correlates with specific known transcriptional signatures related to estrogen signaling, proliferation, and immunity [33,35–39]. We defined correlations based on Spearman coefficients, where red squares represent anticorrelations and blue squares represent positive correlations, as previously described [48]. In TCGA, KART positively correlated with the stroma-derived prognostic predictor (SDPP) and DCN.up stromal signatures (0.51 and 0.41), the Perez immune signature (0.58), and the PLAU invasion signature (0.41), and anticorrelated with the ESR1 (−0.34) and AURKA proliferation signatures (−0.43) in IDC-NST (Figure 4 and supplementary material, Figure S3). Similar correlations were found in the METABRIC dataset, but with an additional correlation with the DCN stromal signature (0.49) (Figure 4). Anticorrelations with the ESR1 (−0.53) and AURKA (−0.49) modules were also noted for ILC in TCGA, while the SDPP stromal module (0.41) and the Perez immune module (0.53) positively correlated with KART for ILC (Figure 4 and supplementary material, Figure S3). In METABRIC, KART was anticorrelated with GGI grading (−0.5) while positively correlating with the DCN stroma signature for ILC (0.46) (Figure 4 and supplementary material, Figure S3). We then examined the correlations of the established LobSig signature [41] to the aforementioned signatures including KART. We found a uniform inverse correlation between KART and LobSig for both the TCGA (supplementary material, Figure S3A) and METABRIC datasets (supplementary material, Figure S3B). Overall, in both datasets, positive correlations of KART with invasion and immune signatures are linked to negative correlations for LobSig, whereas the negative correlations of KART with proliferation and estrogen signatures are positively linked to LobSig (supplementary material, Figure S3). In short, KART correlates, independently of the histopathological subtype, with stromal, immune, and invasion signatures, but anticorrelates with proliferation, grading, and the estrogen receptor signature.

T Sijnesael, F Richard, MAK Rätze et al
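The coefficients reported in this section are Spearman rank correlations between signature scores. As a minimal illustration in plain Python (the study used R), the sketch below ranks both variables with average ranks for ties and then computes the Pearson correlation of the ranks, which is the definition of Spearman's coefficient. The function names and toy values are hypothetical; no significance test is included.

```python
import math


def _ranks(values):
    """1-based ranks; tied values share the mean of their sorted positions."""
    first, count = {}, {}
    for position, v in enumerate(sorted(values), start=1):
        first.setdefault(v, position)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long vectors with at least 2 values")
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    signature_a = [0.2, 0.5, 0.9, 1.4]   # made-up per-patient scores
    signature_b = [3.0, 2.1, 1.7, 0.4]
    print(spearman(signature_a, signature_b))
```

Because only ranks enter the calculation, any monotonic relationship between two signatures yields a coefficient of magnitude 1, regardless of its shape.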

KART associates with a favorable DFS and OS in breast cancer
The association of the KART signature with the ILC histological type and the fact that ILC is a generally low-grade disease prompted us to assess whether KART was associated with favorable prognosis in breast cancer survival. We observed that KART was prognostic for improved DFS in the TCGA IDC-NST dataset in multivariable analyses (hazard ratio [HR] = 0.50, 95% CI = 0.23-1.09, p = 7.1E-03) (Figure 5A and supplementary material, Table S2) when considering KART expression around the median. The direction of this association was conserved for OS; however, significance was not reached (HR = 0.72, 95% CI = 0.28-1.81, p = 1.5E-01) (Figure 5B and supplementary material, Table S2). For ILC, higher KART expression was significantly associated with better prognosis in multivariable analyses for DFS in TCGA (HR = 0.18, 95% CI = 0.04-0.76, p = 2.5E-02) (Figure 5C and supplementary material, Table S2). This trend was also observed for OS in TCGA (HR = 0.10, 95% CI = 0.02-1.00, p = 7.6E-03) (Figure 5D and supplementary material, Table S2). In the METABRIC dataset, we found that higher KART expression associated with better OS in multivariable analyses for both the IDC-NST (HR = 0.79, 95% CI = 0.66-0.93, p = 1.2E-04) and the ILC (HR = 0.51, 95% CI = 0.29-0.91, p = 3.4E-02) cohorts (Figure 6A-C and supplementary material, Table S2). In short, we found that high KART expression predicted favorable prognosis for

Discussion
The transcriptional modifier Kaiso has been implicated in cancer due to its role as a transcriptional regulator of genes such as CCND1 [19,21], MYC [19], WNT11 [10,20], MMP7 [46,47], and ID2 [44]. In this study, we combined candidate and established Kaiso target genes that were either detected in the context of anchorage-independent mouse ILC cells or Kaiso targets that were functionally identified and verified. Interestingly, several canonical Kaiso targets (genes regulated through the cKBS consensus) from these studies play a role in the regulation of Wnt signaling. Both canonical and noncanonical Wnt targets are mostly associated with cellular differentiation, development, or specific functions of terminally differentiated cells. Examples are Rapsyn, a critical effector of acetylcholine receptor signals in the neuromuscular junction [49], and Wnt11, Cyclin D1, and Myc, which control a plethora of developmental and differentiation processes in breast and other tissues [50–53].
Although Cyclin D1 is essential for cell cycle progression, it is well known that high expression of Cyclin D1 is strongly associated with low-grade luminal-type breast cancers [54,55]. Given that Cyclin D1 is the second most abundantly expressed cKBS target in all breast cancers, its high expression likely strongly contributes to our finding that KART is positively associated with better OS and DFS in breast cancer regardless of the histological type. The relatively weak individual contribution of Cyclin D1 to histological differentiation between IDC-NST and ILC also corroborates this assumption, although a study by Tobin et al indicated that low Cyclin D1 protein expression in ILC associated with improved outcome [56]. In contrast, ID2 showed a lower level of expression (ranked 7/33) but strongly associated with the ILC phenotype. These findings are in line with reports that describe ID2 as an inhibitor of ILC cell proliferation that facilitates anoikis resistance, a hallmark of metastatic ILC [44,57].
Concomitant high ID2 and Cyclin D1 expression in ILC [54,56,58,59] might be explained by the finding that cytosolic ID2 functions as a CDK4/6 antagonist in ILC cells through binding to hypo-phosphorylated Rb and subsequent dampening of cell cycle progression [44]. In this context, ID2 functions as a CDK4/6 inhibitor, which potentially induces a compensatory upregulation of Cyclin D1. Because high Cyclin D1 expression levels mostly associate with favorable short-term prognosis in ILC [54] and CDK4/6 inhibitors induce an upregulation of Cyclin D1 in lobular-type breast cancer cell lines [60], we postulate that the Kaiso target ID2 may propel further compensatory upregulation of Cyclin D1 expression. This codependence may therefore partly underpin the observed associations of the KART signature with good prognosis and low proliferation in ILC.
Our work has identified ERBB3 as a cKBS Kaiso target gene. Interestingly, somatic oncogenic ERBB3 mutations have recently been linked to ILC by multiple studies [23,41,61,62]. It is well established that loss of E-cadherin leads to autocrine activation of growth factor receptor signals [3,4]. Since E-cadherin-dependent cell-cell adhesion inhibits activation of multiple receptor tyrosine kinases [3,4,63,64] and ILC is a slow proliferating disease, we hypothesize that ERBB3 mostly propels pro-survival cues in ILC through PI3K/AKT. Because ERBB3 signaling depends on heterodimerization and activation through other ERBB family members [65,66], it is tempting to speculate that the Kaiso target ERBB3 may conspire with low levels of ERBB2 to foster sustained AKT activation and subsequent anchorage independence in metastatic ILC.
Although ILC is unfortunately still mostly defined as a histomorphological breast cancer type, it has become evident that lobular carcinoma is in fact a unique entity within the breast cancer spectrum. Apart from the obvious phenotypical differences, multiple studies have shown that ILC presents a specific genomic and transcriptional landscape [23,67–69] that drives a distinct biochemistry [3,4,6,70,71]. As a result, the identification of targetable oncological cues has instigated ILC-specific trials that require reproducible inclusion criteria for ILC. Interestingly, a recent multicenter concordance study showed that interobserver agreement for histological differential breast cancer diagnosis for ILC is moderate if no additional immunohistochemistry for E-cadherin status is provided [72]. Given the essential roles of E-cadherin loss within ILC etiology, we think that the KART signature represents a functional supportive tool that, on a biological basis and through unbiased selection and clinical associations, results in a better understanding of ILC biology and, potentially, future predictions regarding treatment responses.
To conclude, we present the cKBS KART, a novel 33-gene signature based on a combined set of functional and experimentally established, as well as candidate canonical Kaiso target genes. High KART expression associates with the ILC breast cancer type and is associated with improved long-term OS and DFS in IDC-NST and ILC.

Figure 1. ERBB3 is a direct Kaiso transcriptional target gene in breast cancer cells. (A) Control over p120-dependent Kaiso transcriptional repression in the context of E-cadherin expression. Cartoon depicting E-cadherin-positive ductal breast cancer (left panel), where p120-catenin (p120) is retained at the membrane in the adherens junction (AJ) complex. In this setting, Kaiso represses expression of its target genes by KBS-dependent binding. In E-cadherin mutant lobular breast cancer (right panel), p120 is translocated to the cytosol and nucleus. Under anchorage-independent (suspension) conditions, nuclear influx of p120 is increased approximately twofold, leading to a relief of Kaiso transcriptional repression. (B) Shown is the ERBB3 promoter region (up to −1,000 base pairs) that was analyzed for the presence of canonical Kaiso binding sites (cKBS; CTGCNA), plotted in base pairs upstream of the transcription start site (TSS). (C) Kaiso-specific chromatin immunoprecipitation (ChIP) and subsequent PCR analyses on lysates from MDA-MB-231 (MM231) and MCF7 reveal that Kaiso binds the ERBB3 promoter. (D) Western blot analysis depicting loss of Kaiso expression upon knockout in MCF7 cells (MCF7::ΔKaiso). AKT was used as loading control. (E) Bar graph showing ERBB3 mRNA expression in parental MCF7 and MCF7::ΔKaiso. **** = p < 0.0001. (F) Kaiso controls ERBB3 protein levels. Shown are western blot analyses of ERBB3 expression levels in parental MCF7 cells, MCF7::ΔKaiso cells, and MCF7::ΔCDH1 cells grown under adherent conditions. The fold increase in ERBB3 expression relative to the loading control (AKT) is shown below the ERBB3 blot in bold typeface.

Figure 2. The KART signature associates with the histological subtype ILC, with young patients, small tumors, and low- to intermediate-grade tumors. (A) Shown are the associations of normalized KART expression with IDC-NST and ILC in the TCGA (left) and METABRIC (right) datasets. All points are confirmed ER-positive and HER2-negative breast cancers. TCGA: IDC-NST n = 227; ILC n = 98. (B and C) Associations of normalized KART expression with clinicopathological features in breast cancer. Shown are violin plots for age, tumor size, and grading associations for IDC-NST (green bullets) and ILC (pink bullets) in the TCGA (B) and METABRIC (C) cohorts. All statistics were performed using the Wilcoxon signed-rank test.

Figure 3. Individual gene contributions to KART signature and their associations to ILC histological phenotype. Violin plots showing nine highest expressed genes within KART in TCGA database in IDC-NST (green bullets) and ILC (pink bullets), ranked on overall mRNA expression levels. The dashed blue lines represent the 97th percentile of the sequenced mRNA. Statistics were performed using the Wilcoxon signed-rank test.

Figure 4. The KART signature positively correlates with stroma, invasion, and immune expression signatures and negatively with proliferation expression profiles. Condensed heat maps showing significant gene expression signature correlations for IDC-NST and ILC in the TCGA and METABRIC databases. Numbers inside squares represent Pearson coefficients. Red indicates negative correlations, blue represents positive correlations, white represents no significant association.

Figure 5. KART expression correlates with DFS and OS for IDC-NST and ILC in TCGA dataset. (A-D) Kaplan-Meier graphs depicting DFS (A and C) and OS (B and D) for IDC-NST (A and B) or ILC patients (C and D), respectively, regarding KART expression categorized around median with corresponding forest plots. Significant p values are depicted in bold. IDC-NST: n = 227 and ILC: n = 98.

Figure 6. KART expression associates with long-term OS for IDC-NST and ILC in METABRIC dataset. (A and B) OS Kaplan-Meier curves for IDC-NST (A) and ILC (B) cohorts in METABRIC dataset regarding KART expression categorized around median with corresponding forest plots. Significant p values are depicted in bold. (C) Table depicting longitudinal patient numbers of the data shown in (A) and (B). IDC-NST: n = 979 and ILC: n = 118.
*Validated by ChIP and/or functional analyses.