Standardizing analysis of intra‐tumoral heterogeneity with computational pathology

Many malignant cancers like glioblastoma are highly adaptive diseases that dynamically change their regional biology to survive and thrive under diverse microenvironmental and therapeutic pressures. While the concept of intra‐tumoral heterogeneity has become a major paradigm in cancer research and care, systematic approaches to assess and document bio‐variation in cancer are still in their infancy. Here we discuss existing approaches and challenges to documenting intra‐tumoral heterogeneity and emerging computational approaches that leverage artificial intelligence to begin to overcome these limitations. We propose how these emerging techniques can be coupled with a diversity of molecular tools to address intra‐tumoral heterogeneity more systematically in research and in practice, especially across larger specimens and longitudinal analyses. Systematic documentation and characterization of heterogeneity across entire tumor specimens and their longitudinal evolution has the potential to improve our understanding and treatment of cancer.


1 | INTRODUCTION
Consensus in the field of cancer research and care emphasizes a need for personalized and precision-based medicine approaches, where specific therapies are chosen based on each patient tumor's molecular makeup, to overcome resistance. 1,2 Despite the fervent efforts of the medical and research community chasing this ideal, improvements in outcomes have largely fallen short of the high expectations. 3 It is now becoming clear that deeper layers of tumoral heterogeneity, largely within individual tumors, potentially pose an even more significant barrier to personalized medicine approaches in cancer. [2][3][4][5] Cancer progression is shaped by dysregulation of normal cellular processes through the acquisition of stochastic mutations 4 and regional microenvironmental pressures that can vary from patient-to-patient (inter-tumoral heterogeneity), within individual tumors (intra-tumoral heterogeneity), and following metastasis, treatment, and recurrence (longitudinal heterogeneity) 6,7 ( Figure 1). The aggregate of these multiple spatiotemporal pressures ultimately drives expression of distinct cellular programs within different tumor subpopulations. 6 These diverse biological subclones may respond heterogeneously to both traditional and modern anti-cancer therapies, which may further drive their evolution and emergence of resistant cell populations. 4,8 For this discussion, we will focus on the most common and aggressive form of adult brain cancer, glioblastoma, IDH-wildtype to illustrate these principles.
Glioblastoma has an expected median survival of 15-18 months, a statistic that has remained largely unchanged for decades due to rapid local recurrence following spirited multimodal intervention. 9 The relative confinement of recurrences within the brain emphasizes how difficult it may be to contain even localized levels of intra-tumoral heterogeneity. 10 However, as in many other cancer types, most clinical and research-based molecular analyses of glioblastoma have so far relied primarily on single-region and bulk-based tissue analysis, limiting insights into heterogeneity. [11][12][13][14] It is potentially the inferences and assumptions made from these analyses that, when applied to the entire cancer including recurrence/metastasis, make current precision medicine approaches brittle. 5 As our understanding of the spatial and longitudinal evolutionary pressures of cancer has advanced significantly in recent years, we must now begin to consider how they can be incorporated within our existing profiling and treatment paradigms. [4][5][6][7] Multiregion sampling of tumors and their recurrences appears to be a necessary adaptation, but where and what to sample may become a subjective exercise that needs to be standardized.
Excitingly, managing this bio-variation may be a tangible and cost-effective exercise for existing computational tools including artificial intelligence (AI). [15][16][17][18] Here we provide an overview of glioblastoma heterogeneity and discuss how modern deep learning pipelines show promise in resolving bio-variation in a cost-effective and objective manner. Finally, we discuss the need to complement supervised classifiers with unsupervised methods to help further personalize analysis of heterogeneity and extend it to facilitate resolution of the longitudinal evolution to therapy. 16

2 | THE MULTILAYERED ORGANIZATION OF GLIOBLASTOMA HETEROGENEITY

2.1 | Inter-tumoral heterogeneity
While once viewed as a largely homogeneous disease, it is now understood that specific cancer types can show heterogeneity across cellular, genetic, epigenetic, transcriptional and proteomic dimensions. 5,6 Large-scale sequencing efforts over the past two decades by initiatives such as The Cancer Genome Atlas (TCGA) helped establish important patient-to-patient molecular differences of cancers, like glioblastoma, that were traditionally grouped together based on shared histologic features (inter-tumoral heterogeneity). 19 While every cancer is expected to show some degree of heterogeneity, it is recurring patterns that have reliable prognostic (differences in clinical outcomes) and predictive (differences in response to a specific therapy) value that dominate modern classification systems. 20 For example, a small subset of glioblastoma (5%) harboring mutations in the isocitrate dehydrogenase 1 or 2 (IDH1/2) genes shows a prolonged survival (31 months), as compared to the more common IDH-wildtype glioblastoma (15 months). [21][22][23] This association is so robust that even IDH-wildtype adult-type diffuse gliomas that do not have hallmark histological features of malignancy (e.g., microvascular proliferation and/or necrosis) tend to behave more aggressively (and similar to IDH-wildtype glioblastomas) than their IDH-mutated counterparts. 24,25 Interestingly, these IDH-wildtype histologically lower grade gliomas often also display the classical genetic events seen in IDH-wildtype glioblastoma including amplifications and mutations of the epidermal growth factor receptor (EGFR) gene found on chromosome 7, deletions and mutations in the genetic regions harboring the tumor suppressor PTEN (chromosome 10) and mutations in the telomerase gene (TERT) promoter. 26 The biology of these inter-patient differences in glioblastoma is now understood to be so distinct that the former IDH-mutated glioblastoma is now referred to as astrocytoma, IDH-mutant (WHO Grade 4). 20 Unfortunately, this biological insight has yet to produce any "IDH-specific" therapies and the organization of high-grade gliomas into IDH-wildtype and IDH-mutated serves exclusively as a prognostic biomarker of the disease. 23
FIGURE 1 Major patterns of heterogeneity in cancer. (A) Inter-tumoral heterogeneity describes molecular variation in tumors between individuals with the same tumor type (e.g., glioblastoma). (B) Multiregion sequencing studies reveal spatial intra-tumoral heterogeneity, characterized by molecular and/or phenotypic differences observed among distinct neoplastic cell populations within a single tumor. (C) Longitudinal heterogeneity, as a result of metastasis/recurrence, describes bio-variation between a primary tumor and a recurrence often following therapy.
Another clinically relevant form of inter-patient heterogeneity in glioblastoma is the epigenetic methylation status of the O6-methylguanine-DNA methyltransferase (MGMT) gene promoter. [27][28][29] MGMT is a DNA repair enzyme that antagonizes alkylating agents and inhibits their function, including temozolomide (TMZ), the primary chemotherapy used in the treatment of glioblastoma. [27][28][29][30] Interestingly, a subset of patients show a repressive hypermethylation pattern in the MGMT promoter region resulting in an improved response to chemotherapy of 3 months. 31 This event is found in 50% of patients and its ability to signal differential response to a specific therapy makes it a critical predictive biomarker in glioma care. 28,32
Lastly, IDH-wildtype glioblastomas were also shown to exhibit inter-patient transcriptional variation. Comparison of RNA profiles from large cohorts of glioblastomas revealed three well-accepted transcriptional subgroups coined proneural, classical, and mesenchymal. 14,21 These glioblastoma subgroups show some association with specific genetic events including IDH/platelet-derived growth factor receptor A (PDGFRA) mutations, EGFR amplifications, and neurofibromin 1 (NF1) pathway mutations, respectively. 21 Their inability to predict clinically meaningful outcomes has, however, confined them, for the present period, largely to research settings.
As discussed below, assignment of a single transcriptional subtype to each individual tumor may perhaps be too simplistic and it is likely that these patterns are also influenced by spatial patterns of intra-tumoral heterogeneity.
2.2 | Intra-tumoral heterogeneity

2.2.1 | Hierarchical and single cell-level heterogeneity
While traditional bulk and single-sample profiling approaches have revolutionized modern cancer classification frameworks, more recent stem cell, single-cell and multiregional-based profiling strategies now reveal significant intra-patient levels of heterogeneity. 33,34 In the now established cancer stem cell models, tumors contain rare populations of cancer cells that exhibit progenitor molecular programs and are involved in both cancer-propagating self-renewal and production of more mature terminally differentiated tumor subpopulations 35 (Figure 2A). While there has been compelling experimental evidence that shows unequal tumor-forming ability of the primitive versus more mature tumor cell compartment, these concepts have also yet to translate to stem-cell specific therapies, partly due to the presumed heightened drug resistance mechanisms of cancer stem-like cells to traditional therapies. 34,36 Perhaps the most important concept provided by early studies of heterogeneity is that not all cancer cells are equal and bulk-based profiling approaches may be insufficient to capture the true biology of specific cancer subpopulations that disproportionately contribute to recurrence.
Other recent single-cell profiling studies support an even more plastic nature of different cancer cell programs. Neoplastic cells within glioblastoma appear to dynamically transition between at least four different cellular states: oligodendrocyte precursor-like (OPC), neural precursor-like (NPC), astrocytic-like (AC), and mesenchymal-like (MES) 37,38 (Figure 2B). Some of these cell states appear to drive proliferation and expansion (e.g., OPC, NPC) while others may display more differentiated programs to promote invasion and provide resilience to hypoxia (MES) 37-39 (Figure 2C).
While these models show parallels to the cancer stem cell models, these studies suggest that cell states may be more dynamic and plastic than originally thought. Both these types of models perhaps suggest that therapies will need to consider and address more forms of heterogeneity that may be common and unique to patients to effectively manage intra-tumoral variation. 34,37 One missing component to these single-cell studies is spatial and microenvironmental influence. Potential support of niche interactions and dependencies was evident in these pioneering studies with the MES-state showing signatures that are known to be a response to hypoxic environments. 37 This suggested that these distinct cell states may not be undergoing completely stochastic transitions but may be in fact influenced by their spatial coordinates within the tumor.

2.2.2 | Spatial and microenvironmental heterogeneity
Interestingly, complementary initiatives have carried out multiple spatially-separated biopsies from individual tumors and show that distinct genetic events and transcriptional subtypes of glioblastoma may exist within the same tumor and be regionally confined. 38,40 These results were recently confirmed by multibiopsy scRNAseq experiments which also suggested that even single cell profiles of tumors were influenced by sampling the tumor periphery versus core. 41 The results of these studies present a major challenge to traditional approaches of tissue profiling as they support that even with advanced single cell profiling techniques, spatial variation in molecular programs may preclude representative surveys of the overall tumor biology from a single region.
Thankfully, more precise anatomically-driven profiling studies provide evidence that many of these molecular programs are predictably influenced by microenvironmental niches. By using laser-capture microdissection to spatially recover glioblastoma tissue in areas of high tumor cellularity, infiltration, and/or tumor cells palisading around necrosis, the Ivy Glioblastoma Atlas Project (Ivy GAP) defined niche-specific transcriptomes. 38 We carried out a similar analysis using mass spectrometry (MS)-based proteomics and uncovered that there is spatial separation of tumor regions that are programmed for proliferation, migration, and hypoxic response. 8,42 Using these expression-based profiles as reference sets, niche deconvolution of the tumor microenvironment of independent datasets also supports that specific transcriptional subtypes and single cell phenotypes of glioblastoma may be driven by microenvironmental factors. 43,44 For example, the mesenchymal cell state and RNA signature are enriched in tumor regions showing higher levels of microvascular proliferation and hypoxia. 43,44 Conversely, the proneural RNA signature shows higher expression in tumor samples with a mixed tumor-brain composition and therefore may support a predominant infiltrative phenotype. 44 Interestingly, given that these transcriptional subgroups of glioblastoma are associated with specific genetic alterations, it supports the idea that molecular features of a tumor (e.g., alterations in NF1, IDH1, PDGFRA) may strongly influence the surrounding microenvironment (e.g., microvascular proliferation/inflammation, interaction with brain tissue). 44 These associations may provide relevant histomorphologic correlates that can guide targeted biomarker assessment for specific molecular therapies in clinical contexts where widespread molecular profiling may not yet be possible or available.
The implications of the tumor microenvironment on glioblastoma progression also extend to immune cell interactions, which have been established to play a significant role on phenotype and prognosis.
Tumor-associated macrophages, specifically monocytes, make up the largest non-neoplastic cell population in the glioblastoma tumor microenvironment, making their activity a natural point of interest. 45 They are known to promote tumor growth by suppressing effector T-cell function, and stimulating tumor proliferation through secretion of pro-angiogenic and pro-mitogenic factors, such as transforming growth factor-beta (TGF-β). 45,46 Interestingly, a recent study found that macrophages have a causal role in inducing the MES-like state transition in glioblastoma cells. 47 To highlight this, the team used single-cell data, in vitro mouse models, and gliomaspheres to reveal ligand-receptor interactions in which macrophage-derived oncostatin M (OSM) binds its receptor (OSMR) and triggers subsequent STAT3 signaling in glioblastoma cells to induce mesenchymal transition. 47 This tumorigenic activity is supported by MES-like signatures harbored by macrophages and their preferential localization to MES-like glioblastoma cells. 47 Such a finding substantiates a negative prognostic value imposed by macrophages; however, immune-tumor dynamics are highly variable. A subset of macrophages which are positive for myeloperoxidase (MPO), a neutrophil marker, have been associated with long-term survival of glioblastoma, challenging the generalization of tumor infiltrating macrophages to poor outcomes. 48 Despite these associations, the mechanisms by which microenvironmental interactions affect specific tumor subtypes remain largely clouded in complexity. Overall, the alignment of these expression programs with anatomical spatial coordinates provides exciting support that at least some of the major patterns of tumor variation can be predicted using traditional histomorphologic guideposts.
By utilizing more systematic approaches to define these dimensions of heterogeneity, we may be better able to decipher the clinical relevance and actionability of variations in the tumor microenvironment.

2.3 | Longitudinal heterogeneity
Even with multimodal treatment that includes debulking surgery and adjuvant chemoradiation therapy, glioblastoma recurs within 6-8 months. 30,49 More recent efforts, such as the Glioma Longitudinal AnalySiS (GLASS) consortium, have thus begun the task of understanding how gliomas molecularly evolve in response to standard therapies 11,12,22,50 (Figure 2D). By collecting hundreds of patient-matched primary and recurrent glioblastomas, they recently compared genetic and transcriptional changes that occur following treatment and recurrence. Overall, while most mutations following therapy were largely stochastic and non-recurrent, 16% of IDH-wildtype glioblastomas exhibited a hypermutation phenotype with >10 mutations per megabase of DNA following TMZ therapy. 11 Notably, these changes were comparatively less frequent than in IDH-mutant gliomas where this phenomenon occurs in 25%-47% of cases. 11 This hypermutated phenotype, however, did not appear to serve as a robust biomarker for survival, although it did correlate with slightly shorter survival intervals. 11 Radiotherapy was associated with characteristic increases in short genomic deletions (2-20 base pairs) in about 17% of cases (35% in IDH-mutated gliomas), which appeared to harbor a more aggressive phenotype in the IDH-mutated gliomas. The biological explanation for the lower frequency of these changes in IDH-wildtypes is still unclear, but may be related to the short intervals to recurrence that may limit accumulation of a high number of genetic alterations and any survival differences. 11 In addition to these genetic analyses, follow-up studies by GLASS have now looked at the transcriptional landscape in recurrent glioblastomas. By using single cell signatures to deconvolute their bulk profiling data, they found that recurrent tumors show a transcriptional pattern that is supportive of a potentially higher fraction of oligodendrocytes. 12 They interpreted this as a potential sign of higher infiltrating biology specific to recurrences and a speculated increase in synaptic transmission through ectopic expression of traditionally neuronal synaptic proteins (e.g., SNAP25). 12 While they did show by immunohistochemistry that some glioma cells expressed SNAP25, perhaps some caution should be exercised in interpreting deconvoluted data from complex heterogeneous tissue elements.
It is likely that such comparative analyses of the microenvironment with more careful annotation of different tissue compositions could allow for more controlled comparisons between primary and recurrent tumors.
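The hypermutation criterion quoted above (>10 mutations per megabase) amounts to a simple normalization of mutation counts by the amount of sequenced DNA. The following is a minimal sketch of that arithmetic only; the function names, the notion of "callable bases", and the example counts are illustrative assumptions and are not taken from the GLASS analyses.

```python
# Illustrative sketch: mutation burden and a hypermutation flag.
# All names and example values are hypothetical.

HYPERMUTATION_THRESHOLD = 10.0  # mutations per megabase, the cut-off quoted in the text


def mutations_per_megabase(n_mutations: int, callable_bases: int) -> float:
    """Normalize a somatic mutation count by the number of sequenced bases (in Mb)."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_mutations / (callable_bases / 1_000_000)


def is_hypermutated(n_mutations: int, callable_bases: int,
                    threshold: float = HYPERMUTATION_THRESHOLD) -> bool:
    """True when the burden strictly exceeds the threshold (>10 per Mb)."""
    return mutations_per_megabase(n_mutations, callable_bases) > threshold


def fraction_hypermutated(cases) -> float:
    """Fraction of (n_mutations, callable_bases) pairs flagged as hypermutated."""
    flags = [is_hypermutated(n, bases) for n, bases in cases]
    return sum(flags) / len(flags)


# Hypothetical recurrent tumors profiled over a 30 Mb callable region.
toy_recurrences = [(450, 30_000_000), (60, 30_000_000),
                   (90, 30_000_000), (600, 30_000_000)]
```

In practice the denominator depends on the sequencing assay and variant-calling filters, so burdens from different platforms may not be directly comparable.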
Altogether, these extensive analyses in glioblastoma highlight that there are not only patient-to-patient molecular differences that remain unresolved, but also potential regional variation that may influence both clinical management and exploratory research into pathophysiological mechanisms of cancer biology and progression. While it may not be feasible to carry out extensive multiregion sampling on every tumor, tools to standardize and guide profiling of high-risk and high-value areas are clearly needed.

3 | Current tools and limitations to studying heterogeneity in cancer
There are several tools to study tumor heterogeneity, each accompanied by their own strengths and limitations. While many studies have focused on potential discordances across different molecular layers of biology, in systems with rapid cell turnover, like cancer, phenotype-level heterogeneity is ultimately reflected by variation at the genomic level that can be robustly propagated to transcriptomic, proteomic and metabolomic levels of expression ( Figure 3). As a result, it is likely that the most relevant biology will result in subpopulational changes in cytoarchitectural structure (phenotype) and can be effectively probed by a variety of existing and emerging methods (e.g., sequencing, staining, MS). As such, even in our molecular era, morphology can be a powerful "phenotypic" readout to prioritize molecular profiling to uncover the most relevant "morpho-molecular" correlates. Biological heterogeneity in glioblastoma is so immediately apparent on microscopic examination of H&E-stained tissue sections, that it was originally coined "glioblastoma multiforme". 51  As previously mentioned, we find specific transcriptional subtypes may also be associated with histomorphologic patterns. This includes the association of mesenchymal and proneural patterns with areas enriched for microvascular proliferations and brain infiltration. 43,44 Together these observations suggest that traditional histological examinations provide a powerful way to understand both inter-and intra-tumoral patterns of tumor heterogeneity.
Despite these relatively robust correlates, many other spatial transitions in patterns are harder to objectively define across observers and/or do not reliably predict a specific biological program. 63 Indeed, many studies find pathologist-to-pathologist discordances with one study showing inter-observer variability in scoring of 25% of prostate cancer specimens. 64 Similarly, even some less subjective microscopic patterns may not always reliably predict molecular changes. In a recent study we carried out exploring glioblastoma heterogeneity, a spatial change in tumor pattern was accompanied by a BRAF V600E mutation, a relatively rare and poorly characterized change. 65 Modern single-cell and regional molecular profiling techniques now provide more objective and discovery-based approaches to study intra-tumoral heterogeneity. At the most basic level, they involve multiregional sampling of tumor tissue. Many studies, some of which are discussed above, have shown that different genetic and transcriptional events may be spatially confined to specific coordinates and microenvironments within tumors. These findings support the need to move away from using a single "representative" tumor region and toward multiregional profiling to generate a more complete picture of the overall biology.
In support of this, a study by Liu et al. compared single- and multiregion sampling workflows. 66 Overall, they found that multisample predictions of intra-tumoral heterogeneity significantly outperformed single-sample approaches, detecting more than double the cancer cell subpopulations. [66][67][68] Further studies have found that distinct subclonal cell populations may be associated with differing drug sensitivities and degrees of chemotherapy resistance, underscoring the need to comprehensively define this variation. 4,69 While the benefits of multiregional sampling are compelling, sampling approaches are still not well-defined and many lack informed regional guidance. More sophisticated approaches rely on radiological characteristics or microscopic features coupled with laser capture microdissection for profiling. While these have proven effective at resolving patterns of bio-variation, they are likely difficult to carry out at a scale needed for larger tumor cohorts or deploy in routine clinical workflows without some inter-observer subjectivity.
FIGURE 3 Cytoarchitectural phenotypes as readouts of molecular differences. Despite the explosion of molecular readouts (e.g., genomics, transcriptomics, proteomics), interdependencies across the different layers in the central dogma of biology, especially in systems with rapid cell turnover (e.g., cancer), support that the majority of relevant molecular alterations can be appreciated at multiple levels of analysis. By relying on histomorphology as a convenient final phenotypic readout of the collective propagated molecular changes, there is a scalable path to explore heterogeneity across large swaths of tissue.
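The intuition behind the advantage of multiregion over single-region sampling can be illustrated with a toy calculation: if subclones are regionally confined, the number detected grows with the number of regions sampled. The sketch below is a simplified illustration under assumed data and is not the workflow used in the cited studies; the region names, subclone labels, and helper functions are all hypothetical.

```python
# Toy illustration: how many subclones are detected when sampling k regions.
from itertools import combinations

# Hypothetical tumor: clone "A" is truncal, the others are regionally confined.
toy_tumor = {
    "core": {"A", "B"},
    "periphery": {"A", "C"},
    "perinecrotic": {"A", "D"},
    "infiltrating_edge": {"A", "C", "E"},
}


def detected_subclones(region_to_clones, sampled_regions):
    """Union of subclones present in the sampled regions."""
    found = set()
    for region in sampled_regions:
        found |= region_to_clones[region]
    return found


def mean_detected(region_to_clones, k):
    """Average number of subclones detected over every choice of k regions."""
    regions = sorted(region_to_clones)
    counts = [len(detected_subclones(region_to_clones, combo))
              for combo in combinations(regions, k)]
    return sum(counts) / len(counts)


single_region = mean_detected(toy_tumor, 1)  # one "representative" biopsy
all_regions = mean_detected(toy_tumor, 4)    # exhaustive multiregion sampling
```

In this toy tumor a single biopsy recovers 2.25 of the 5 subclones on average, which is the kind of under-sampling that guided regional profiling aims to reduce.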
Over the past decade, there has also been an explosion of single cell and spatial profiling technologies that allow comprehensive evaluation of heterogeneity at the individual cell level. These include single cell profiling of dissociated cells, spatial transcriptomics/genomics, and imaging mass cytometry. 70  there is a need for tools to objectively define and guide regional profiling. As we elaborate in the next section, there is an exciting opportunity to integrate and leverage the relative benefits of both traditional (histomorphology) and modern (molecular analysis) tissue profiling tools using the objectivity of modern computational image analysis.
Together, this could provide a routine and systematic way to explore intra-tumoral variation in cancer.

4 | Supervised computational pathology approaches to resolving tumor heterogeneity
Across multiple facets of research and clinical decision making, AI-based computer vision techniques have emerged as powerful tools to objectify traditionally subjective pattern recognition tasks. 16,17,71 There has been significant enthusiasm for applying these tools to pathology where histomorphologic pattern variation can exceed the objective recognition capabilities of human observers. 72 Indeed, recent computational tools that rely on modern deep learning techniques have been shown to perform on par with, and in some cases even outperform, manual analysis. [73][74][75] Most modern deep learning pipelines to date deploy supervised approaches, where convolutional neural networks (CNNs) are trained to learn patterns of interest by providing large datasets of human-annotated images as ground truth. 76 These labeled training images allow CNNs to associate non-overlapping spatial patterns of pixels with specific labeled annotations through an iterative trial-and-error process. Once trained, these algorithms are able to quantify the learned set of data-driven deep learning features in new images and use the signatures of activation to assign predictions.
FIGURE 5 Overview of supervised machine learning workflow for histomorphologic pattern labelling. Supervised machine learning approaches using convolutional neural networks (CNNs) can provide objective image analysis of tumor whole slide images (WSIs) including the prediction of relevant molecular programs such as proliferation. Typically, a training dataset of WSIs is annotated by expert reviewers (pathologists) or using accompanying objective molecular features (e.g., immunohistochemical staining or transcriptional evidence of elevated proliferation, Ki-67). Clinical annotations (e.g., survival) can also be used for labeling. These annotated WSIs can then be categorized into discrete bins. For example, cases can be grouped based on exhibiting "high proliferation" (75th-100th percentile of dataset) and "low proliferation" (0th-25th percentile of dataset) indices. Intermediate cases are often excluded from these approaches. Once assembled, these images can be used for CNN training. An independent dataset is used to test the performance of the trained CNN by inputting unfamiliar WSIs. The computational output is compared to ground truth and the performance success of the CNN is benchmarked using common metrics (e.g., receiver operating characteristic curves [ROCs]).
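Two steps of the supervised workflow described for Figure 5 lend themselves to a short illustration: binning cases into "high" and "low" groups by percentile (dropping intermediate cases), and benchmarking predictions on held-out cases with the area under the ROC curve. The sketch below is one possible way to express these steps, assuming a single continuous score per case (e.g., a Ki-67 index); the function names and toy values are hypothetical, and the CNN training itself is deliberately omitted.

```python
# Sketch of label binning and ROC benchmarking; toy data, hypothetical names.
from statistics import quantiles


def bin_by_quartile(scores):
    """Map case id -> 1 (>= 75th percentile) or 0 (<= 25th percentile).

    Intermediate cases are excluded, mirroring the binning described above.
    """
    q1, _, q3 = quantiles(scores.values(), n=4, method="inclusive")
    labels = {}
    for case_id, score in scores.items():
        if score >= q3:
            labels[case_id] = 1
        elif score <= q1:
            labels[case_id] = 0
    return labels


def roc_auc(labels, predictions):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [p for y, p in zip(labels, predictions) if y == 1]
    neg = [p for y, p in zip(labels, predictions) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical proliferation indices (%) for nine training cases.
toy_ki67 = {"c1": 2, "c2": 5, "c3": 10, "c4": 20, "c5": 30,
            "c6": 40, "c7": 60, "c8": 80, "c9": 90}
training_labels = bin_by_quartile(toy_ki67)

# Hypothetical held-out ground truth and classifier outputs.
test_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise AUC used here is quadratic in the number of cases, which is fine for a sketch but would normally be replaced by a rank-based implementation for large cohorts.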
In addition to learning diagnostic labels, supervised H&E-based training workflows can be adapted to learn molecular (e.g., genomic, transcriptomic, proteomic) features that accompany many modern cancer cohorts. For these approaches, CNNs are trained using hematoxylin and eosin (H&E)-stained whole slide images (WSI) with genetically-defined ground truth labels generated from bulk tumor tissue (e.g., TCGA). [77][78][79] Once trained, these algorithms can be used to screen for variations of these molecular features both across multiple cases and within larger tissue regions of individual cases ( Figure 5).
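Screening for regional variation within a single case typically means applying a trained classifier tile by tile and then looking for spatially coherent areas of high predicted probability. The sketch below assumes such tile-level probabilities already exist (the classifier is not shown) and flags "hotspot" tiles using a 3x3 neighborhood average; the grid layout, threshold, and function names are illustrative assumptions rather than a published method.

```python
# Sketch: turn tile-level probabilities from a (hypothetical) trained
# classifier into a list of spatial hotspots on a whole slide image grid.


def build_grid(tile_probs, n_rows, n_cols):
    """Place probabilities on a grid; tiles without tissue stay None."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for (row, col), prob in tile_probs.items():
        grid[row][col] = prob
    return grid


def neighborhood_mean(grid, row, col):
    """Mean probability over the 3x3 neighborhood, ignoring empty tiles."""
    values = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            r, c = row + d_row, col + d_col
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                if grid[r][c] is not None:
                    values.append(grid[r][c])
    return sum(values) / len(values)


def hotspots(tile_probs, n_rows, n_cols, threshold=0.6):
    """Tiles whose smoothed probability reaches the threshold, sorted by position."""
    grid = build_grid(tile_probs, n_rows, n_cols)
    return sorted(pos for pos in tile_probs
                  if neighborhood_mean(grid, *pos) >= threshold)


# Hypothetical 3x3 slide: high predicted proliferation in the top-left corner.
toy_tiles = {
    (0, 0): 0.9, (0, 1): 0.9, (0, 2): 0.1,
    (1, 0): 0.9, (1, 1): 0.9, (1, 2): 0.1,
    (2, 0): 0.1, (2, 1): 0.1, (2, 2): 0.1,
}
```

Smoothing over neighbors is a design choice that trades spatial resolution for robustness to isolated misclassified tiles.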
One example of this approach is a study by Levy-Jurgensen et al., in which spatial variation in proliferation was estimated. 78 Specifically, the team trained a classifier using H&E-stained sections derived from TCGA breast and lung cancer samples. Sections were associated with high and low levels of marker of proliferation Ki-67 (MKI67) expression, which marks cells entering the mitotic cycle and consequently, cell proliferation. 78 Indeed, this workflow would allow one to survey for potential proliferative hotspots on available H&E samples without the need for comprehensive immunohistochemical assessment. 78 Similar approaches have been used to predict the presence of immune infiltrates and a small handful of specific mutations (e.g., TP53) in specific contexts. A similar approach was used by Schmauch et al., where the team trained a weakly supervised algorithm, HE2RNA with WSIs and the corresponding RNA-seq profiles of tumors from 28 different cancer types as labels. 79 Their trained model was able to carry out spatially-resolved inferences of gene expression relying only on computational analysis of phenotype. 79 These are only a selection of many recent studies that clearly demonstrate the successful application of deep learning to improve our understanding of heterogeneity in cancer. 74,[77][78][79][80][81][82][83][84][85][86][87][88][89] While impressive when these robust morpho-molecular associations are established with deep learning, it is important to note that in addition to their stochasticity, they require prior knowledge and validation around specific mutations (e.g., FGFR3 mutations in bladder cancer) and may therefore not generalize well outside of their narrow training context and thus be less suited for exploring changes that emerge downstream of these common initiating genetic events and following therapy. 90,91 These approaches will likely therefore fall short for discovery-based research applications. 
Similarly, while there may be positive trends at the population level, their performance in highly heterogeneous tumors may not be uniform and may depend on the presence of other mutations and niche-specific pattern changes.
Similarly, as is common in recurrent glioblastoma, accumulation of stochastic mutations following therapy may also interfere with the overall morphology and compromise the performance of these approaches in longitudinal analysis of heterogeneity. 7,13 To overcome this form of patient-to-patient heterogeneity, one elegant study utilized deep learning to define immune and stroma-rich niches and use these features as spatial guideposts to define immune "cold" and "hot" regions of lung carcinomas. 86 Sequencing (exome and RNA-sequencing) of CNN-defined regions showed they could reliably resolve divergent mutational hotspots within these tumors. 86 Such molecularly-agnostic mapping approaches of heterogeneity are critical as they may provide more personalized and precise insights into the emergence of treatment-resistant subclones and aggressive clinical phenotypes. 67,68 Furthermore, these approaches do not rely on morpho-genomic dependencies, which may not always be shared across individual cancer.
Overall, while the ability to digitally profile histological sections for specific biological programs is attractive, they come with significant challenges when applied to practical clinical settings. Even if some molecular events were fairly robust (e.g., severe nuclear atypia with mutations in TP53/mismatch repair proteins), the prospect that such approaches can be generalized to most clinically-relevant programs is perhaps idealized. Below, we discuss a more practical marriage between histomic and molecular readouts, where the underlying premise of morphological differences predicting biological differences is maintained, but with fewer assumptions made on the specific molecular events causing their histomorphologic changes.

| Unsupervised computational pathology approaches to resolving tumor heterogeneity
While case-to-case heterogeneity may reduce the sensitivity and specificity of morpho-molecular correlates, regional changes in microscopic cellular patterns still provide reliable guideposts for biological transitions within individual tumors. Computational approaches that can define these regional morphologic switches to guide molecular profiling may serve as a cost-effective mechanism to manage individualized heterogeneity mapping. 16,18,71,92 We recently highlighted the feasibility of this approach to define morphological transitions using image feature clustering that is not reliant on a priori knowledge or hypotheses of specific morphologic patterns of interest. 93 In these workflows, CNNs optimized for diverse histologic classification tasks are used as feature extractors to compute histologic "signatures" for regional image patches derived from larger WSIs. 93,94 While sets of hundreds of features embedded within deep learning feature vectors (DLFVs) are what CNNs use to ultimately classify images into classes, the individual activation of each feature can also collectively serve as a "histomic" signature. Similar to gene and other molecular profiles, DLFVs of individual image patches can be utilized to carry out unsupervised clustering and objectively map and group phenotypic heterogeneity into regional partitions using histology 95 (Figure 6).
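The tile-clustering step described above can be illustrated with a minimal Python sketch using NumPy. It assumes the DLFVs have already been extracted by a pre-trained CNN; here that step is replaced by synthetic vectors. The source does not specify a clustering algorithm, so plain k-means with a deterministic farthest-point initialization is used as a stand-in, and the names `kmeans`, `partition_map`, and `demo` are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=50):
    """Cluster tile feature vectors with Lloyd's k-means.

    features: (n_tiles, n_features) array of stand-in DLFVs.
    Initialisation is deterministic: the first tile, then repeatedly the
    tile farthest from every centroid chosen so far.
    """
    chosen = [0]
    min_d = ((features - features[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        chosen.append(int(min_d.argmax()))
        newest = features[chosen[-1]]
        min_d = np.minimum(min_d, ((features - newest) ** 2).sum(axis=1))
    centroids = features[chosen].copy()

    labels = None
    for _ in range(n_iter):
        # Squared Euclidean distance from every tile to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels, centroids


def partition_map(labels, coords, grid_shape):
    """Place each tile's cluster label at its (row, col) position on the slide grid."""
    grid = np.full(grid_shape, -1, dtype=int)  # -1 marks positions without a tile
    grid[coords[:, 0], coords[:, 1]] = labels
    return grid


def demo():
    """Synthetic 4 x 6 tile grid whose left and right halves differ in pattern."""
    rng = np.random.default_rng(0)
    rows, cols, dim = 4, 6, 8
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)])
    # Assumption: these random vectors stand in for CNN-derived DLFVs.
    base = np.where(coords[:, 1:2] < cols // 2, 0.0, 5.0)
    features = base + 0.1 * rng.standard_normal((len(coords), dim))
    labels, _ = kmeans(features, k=2)
    return partition_map(labels, coords, (rows, cols))


if __name__ == "__main__":
    print(demo())
```

In this toy setting the label grid separates the two halves of the synthetic slide. On real WSIs the feature extractor, tile size, and clustering method would all need to be chosen and validated for the tissue at hand.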
We recently validated this unsupervised approach in high grade gliomas by microdissecting and profiling unsupervised AI-defined partitions within large tumor samples. 65 Indeed, this approach highlighted that histomorphologic transitions could predict changes in local proliferation indices, hypoxia, and differentiation states.
Interestingly, while strong morpho-molecular correlates of these programs were not observed within this study, many of these local processes were found in multiple tumors. Importantly, this image clustering approach could be generalized to map bio-variation in other cancers from both clinical and experimental settings, including mapping of genetically-distinct metastatic subclones, divergent differentiation states (adeno- vs. squamous cell carcinoma), and mixed/collision tumors in skin, lung, and liver samples. Moreover, this tool could be extended across multiple tumor blocks, providing a generalizable approach to heterogeneity mapping, independent of tumor size (Figure 7). It would be interesting to see if this approach could also be further extended longitudinally to allow exploration into how cancer programs evolve across time and following therapy. In such analyses, evolutionarily-conserved clusters may reveal the most treatment-resistant regions for dedicated molecular profiling and precise targeting.
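One way to compare partitions across multiple slides or tumor blocks is a pairwise Pearson correlation matrix of partition-level signatures. The sketch below is a hedged illustration in Python with NumPy. It assumes each partition is summarized by the mean DLFV of its tiles, a choice the source does not specify, and it uses synthetic prototypes in place of real features. The function names are hypothetical.

```python
import numpy as np


def partition_signatures(features, labels):
    """Mean feature vector of each partition on one slide.

    Returns (partition_ids, signatures), one signature row per partition.
    """
    ids = np.unique(labels)
    sigs = np.stack([features[labels == i].mean(axis=0) for i in ids])
    return ids, sigs


def cross_slide_correlation(slides):
    """Pairwise Pearson correlation between all partitions of all slides.

    slides: list of (features, labels) pairs, one per WSI.
    Returns (names, matrix); names[i] is a (slide_index, partition_id) tuple.
    """
    names, rows = [], []
    for s, (features, labels) in enumerate(slides):
        ids, sigs = partition_signatures(features, labels)
        names.extend((s, int(i)) for i in ids)
        rows.append(sigs)
    # np.corrcoef treats each row as one variable, so entry [i, j] is the
    # Pearson correlation between partition signatures i and j.
    return names, np.corrcoef(np.vstack(rows))


def make_demo_slides(n_slides=3, tiles_per_pattern=20, seed=0):
    """Synthetic slides that each contain the same two made-up patterns."""
    rng = np.random.default_rng(seed)
    # Assumption: two invented prototypes stand in for distinct morphologies.
    prototypes = np.array([[1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
                           [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]])
    slides = []
    for _ in range(n_slides):
        labels = np.repeat([0, 1], tiles_per_pattern)
        noise = 0.1 * rng.standard_normal((len(labels), prototypes.shape[1]))
        slides.append((prototypes[labels] + noise, labels))
    return slides


if __name__ == "__main__":
    names, corr = cross_slide_correlation(make_demo_slides())
    print(names)
    print(np.round(corr, 2))
```

Because cluster labels are assigned independently on each slide, a matrix like this is also a simple way to match partitions between slides: in real data, label 0 on one WSI need not correspond to label 0 on another.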
While this approach provides a dynamic solution that integrates the strengths of both histomic and molecular profiling, it does come with some limitations. Because it does not rely on specific supervised features, cluster patterns can change based on image patch size (e.g., clustering on low vs. high power patterns) and on the method used to estimate the optimal number of clusters (e.g., Silhouette vs. Calinski-Harabasz approaches). These effects can be easily monitored on an individual slide level but become significantly more complex in multislide analyses, especially as the number of unlabeled clusters on each slide grows. We believe some of these limitations can ultimately be addressed using hybrid workflows that use unsupervised clustering to define regions of interest and then apply supervised techniques to catalog the most pertinent regions for further analysis.
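To make the dependence on the cluster-number criterion concrete, the following Python sketch (NumPy only) scores several candidate cluster counts with both the silhouette coefficient and the Calinski-Harabasz index. It is a minimal illustration under stated assumptions: the features are synthetic blobs rather than real DLFVs, k-means with farthest-point initialization is a stand-in clustering method, every cluster is assumed to be non-empty, and all function names are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    chosen = [0]
    min_d = ((features - features[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        chosen.append(int(min_d.argmax()))
        newest = features[chosen[-1]]
        min_d = np.minimum(min_d, ((features - newest) ** 2).sum(axis=1))
    centroids = features[chosen].copy()
    labels = None
    for _ in range(n_iter):
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = features[labels == j].mean(axis=0)
    return labels


def silhouette(features, labels):
    """Mean silhouette coefficient (range -1 to 1, higher is better)."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    ids = np.unique(labels)
    scores = np.zeros(len(features))
    for i in range(len(features)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton clusters score 0 by convention
        a = dist[i, own].sum() / (own.sum() - 1)  # self-distance is zero
        b = min(dist[i, labels == j].mean() for j in ids if j != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


def calinski_harabasz(features, labels):
    """Between- over within-cluster dispersion (higher is better)."""
    n, ids = len(features), np.unique(labels)
    overall = features.mean(axis=0)
    between = within = 0.0
    for j in ids:
        members = features[labels == j]
        centre = members.mean(axis=0)
        between += len(members) * ((centre - overall) ** 2).sum()
        within += ((members - centre) ** 2).sum()
    return (between / (len(ids) - 1)) / (within / (n - len(ids)))


def score_cluster_counts(features, candidate_ks):
    """Return {k: (silhouette, calinski_harabasz)} for each candidate k >= 2."""
    out = {}
    for k in candidate_ks:
        labels = kmeans(features, k)
        out[k] = (silhouette(features, labels), calinski_harabasz(features, labels))
    return out


def demo_features(seed=0):
    """Three tight synthetic blobs standing in for three tissue patterns."""
    rng = np.random.default_rng(seed)
    centres = np.array([[0.0, 0.0, 0.0, 0.0],
                        [10.0, 0.0, 0.0, 0.0],
                        [0.0, 10.0, 0.0, 0.0]])
    return np.vstack([c + 0.1 * rng.standard_normal((20, 4)) for c in centres])


if __name__ == "__main__":
    for k, (sil, ch) in score_cluster_counts(demo_features(), range(2, 6)).items():
        print(k, round(float(sil), 3), round(float(ch), 1))
```

On clean synthetic data such as this, both criteria favor the same cluster count. On real slides they may disagree, which is the sensitivity noted above. Libraries such as scikit-learn provide equivalent metrics (`sklearn.metrics.silhouette_score` and `sklearn.metrics.calinski_harabasz_score`) that would normally be preferred over hand-written versions.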
FIGURE 6 A cartoon depicting an unsupervised deep learning workflow for partitioning spatial patterns of intra-tumoral heterogeneity. The whole slide images (WSIs) of interest are inputted and partitioned into tiles before passing through a pre-trained CNN. The CNN in this unsupervised workflow serves as a morphologic feature extractor, generating a deep learning feature vector (DLFV) for each tile. These "histomic signatures" are then used to cluster tiles while tracking their spatial coordinates, ultimately identifying distinct histologic regions. These regions can be mapped back onto the input WSI to visualize spatial heterogeneity.

(ii) Corresponding MIB-1 (proliferation) staining, illustrating regional variation with high proliferation rates among the cellular regions and consistently lower levels in the microcystic regions. (iii) Despite this unusual morphology, unsupervised image clustering approaches can map heterogeneous tumor regions without context-specific training. (C) Pairwise Pearson correlation matrix of partitions for these three adjacent whole slide images (WSIs) allows mapping of heterogeneity across large tumor regions. Regions of microcystic tumor histology strongly correlate among the three input WSIs. Strong correlation is also observed among the highly nuclei-dense cellular tumor regions of the WSIs. Such matrices can be further applied in a multitumor context, illustrating histomorphologic correlation between tumor regions in primary and recurrent tumor pairs.

| CONCLUSION
Intra-tumoral heterogeneity is an exceedingly difficult challenge to manage given the multiple layers of complexity that each biological variable contributes to the ultimate phenotype. Not only is it intrinsically driven by genetic diversity among tumor subclones, but it is now clear that it can also be heavily influenced spatially within a tumor by local environmental signals and pressures. In our modern era where cancer patients may also experience multiple metastases and recurrences through the course of the disease, integrating a tumor's heterogeneity profile with prior and future molecular profiles further complicates analyses that rely on a single-sampled region. As our understanding of the interaction of these variables evolves, we will likely need to adopt workflows built around multiregional sampling and profiling. By coupling the scalability and biological significance of transitional morphology with the objectivity and flexibility of unsupervised computer vision approaches, we believe systematic workflows can be developed to provide routine comprehensive mapping of tumor heterogeneity. While it is possible that regional variation may have some prognostic significance in patients with glioblastoma, we believe it is more likely that characterizing the regions of heterogeneity could be critical to minimizing residual disease and recurrences. By systematically evaluating heterogeneity, it is possible that the divergent biological programs that may exist in glioblastoma specimens can be comprehensively sampled and guide the design of cocktail therapies aimed at targeting difficult-to-treat tumor subpopulations. While this may be challenging to carry out with every clinical presentation, it is likely that many tumor subpopulations overlap across patients (similar to the four transcriptional cell states 37).
Therefore, we envision that a subset of subpopulations can be validated by routine immunohistochemical stains, and dedicated molecular sequencing can be reserved for only the more infrequently observed patterns. Once this vocabulary of heterogeneity is populated, the final treatment may ultimately be a standard cocktail for all glioblastoma patients, with or without minor variations to optimize for additional subtypes or to reduce toxicity in more homogeneous tumors.
Widespread adoption of such practical approaches could offer important new insights into cancer biology and a means to better characterize and target a larger fraction of each patient's individual tumor.

AUTHOR CONTRIBUTIONS
The authors Ameesha Paliwal, Kevin Faust, Azhar Alshoumer, and Phedias Diamandis contributed to the manuscript.