A High‐Resolution Prediction Network for Predicting Intratumoral Distribution of Nanoprobes by Tumor Vascular and Nuclear Feature

In this study, the critical need for precise and accurate prediction of intra-tumor heterogeneity related to the enhanced permeability and retention effect and spatial distribution of nanoprobes is addressed for the development of effective nanodrug delivery strategies. Current predictive models are limited in terms of resolution and accuracy, prompting the construction of a high-resolution prediction network (HRPN) that estimates the microdistribution of quantum dots, factoring in tumor vascular and nuclear features. The HRPN algorithm is trained using 27 780 patches and validated on 4920 patches derived from 4T1 breast cancer whole-slide images, demonstrating its reliability. The HRPN model exhibits minimal error (mean square error = 1.434, root mean square error = 1.198), satisfactory goodness of fit (R² = 0.891), and superior image quality (peak signal-to-noise ratio = 44.548) when compared to a generative-adversarial-network-structured model. Furthermore, the HRPN model offers improved prediction accuracy, broader prediction intervals, and reduced computational resource requirements. Consequently, the proposed model yields high-resolution predictions that more closely resemble actual tumor microdistributions, potentially serving as a powerful analytical tool for investigating the spatial relationship between the tumor microenvironment and nanoprobes.


Introduction
[3] Despite the development of thousands of nanomedicine formulations, only a limited number have been approved for clinical use. [4,5] This disparity between preclinical success and clinical translation can be largely attributed to the heterogeneity of the enhanced permeability and retention (EPR) effect among patients. [6,7][10] Consequently, predicting the EPR effect is crucial for facilitating clinical translation and identifying patients who would benefit most from nanomedicine treatments.
EPR heterogeneity is characterized by differences in nanomedicine accumulation between tumors and organs (macrodistribution) as well as within intratumoral subregions (microdistribution). [6,11] Intratumoral nanomedicine distribution plays a critical role in determining therapeutic efficacy. [13][14] Tumor microenvironment imaging techniques, such as contrast-enhanced ultrasound, [15,16] positron emission tomography, [17,18] fluorescence imaging, [19] and mass spectrometry imaging, [20] have demonstrated potential for predicting intratumoral nanomedicine distribution. For example, Stapleton's group showed a significantly positive relationship between the tumor perfusion imaging map and intra-tumoral retention of liposomes. [21] Despite these advances, accurately predicting intratumoral nanomedicine distribution remains challenging due to the complex spatial relationship between the tumor microenvironment and nanomedicine accumulation.
[24][25] Our group has previously demonstrated the feasibility of using deep learning models, specifically a generative adversarial network for distribution analysis (GANDA), to predict intratumoral nanomedicine distribution with high accuracy. [26]29] High-resolution net (HRNet) is a recently proposed network architecture that has shown promising results in various vision tasks. [30,31] This architecture connects high- and low-resolution sub-networks in parallel, as opposed to the traditional series connection, yielding benefits in terms of universal network architecture and training efficiency. In light of this, we present a high-resolution prediction network (HRPN) based on the HRNet model architecture to conditionally generate pixel-level distributions of intratumoral quantum dots (QDs), constrained by given tumor vascular and nucleus information. Compared to the previously developed GANDA platform, HRPN demonstrates superior performance and accuracy, providing a high-throughput prediction model capable of cellular-resolution analysis with more precise spatial localization.

Results
To investigate the spatial relationship between the tumor microenvironment and nanoprobe distribution using deep learning models, two prerequisites must be satisfied: i) images encompassing the spatial information of both the tumor microenvironment and nanoprobes and ii) a sufficient volume of data to train the model. As illustrated in Figure 1A, we selected QDs as the nanoprobe model and documented their intratumoral distribution using whole-slide fluorescence imaging (WSFI). Tumor vessels and cell nuclei were stained with fluorescent dyes and recorded in other WSFI channels. WSFI images of tumors no. 0-4 were decomposed into 27 780 patches to ensure adequate data for model training. The HRPN model architecture is depicted in Figure 1B. Once the HRPN was successfully trained, it was tested by predicting the QDs-channel image of tumor no. 5 based on its vessel and cell nucleus information.
Subsequently, we analyzed the heterogeneity of tumor vessels, cell nuclei, and QDs across patches. The representative whole-slide image of tumor no. 5 is shown in Figure 2A, with the background set to white to emphasize the fluorescence signal. Vascularity and cell density are known to influence intra-tumor blood flow and fluid diffusion, which are essential for nanoprobe delivery and retention. As demonstrated in Figure 2B and S3, Supporting Information, the abundance of tumor vessels, cell nuclei, and QDs varied significantly among patches, revealing the complex interaction between the tumor microenvironment and intratumoral QDs distribution. To quantify the abundance of cell nuclei, tumor vessels, and QDs, the total pixel intensity of the DAPI, Alexa-Fluor-488, and QD channels for each patch was calculated. The distribution of total intensity for cell nuclei, tumor vessels, and QDs exhibited considerable heterogeneity (Figure 2C).
To evaluate prediction accuracy at the whole-slide image level, the generated patches of HRPN and GANDA were recomposed and compared with the actual QDs-channel images. We first conducted a fivefold cross-validation, and four metrics, including mean square error (MSE), root mean square error (RMSE), goodness of fit (R²), and peak signal-to-noise ratio (PSNR), were calculated to compare model performance (Figure 3A,B). The loss of HRPN during the training process is recorded in Figure S4, Supporting Information. The predictions of the HRPN model exhibited smaller MSE and RMSE values than those of the GANDA model, indicating that the HRPN prediction accuracy was higher at the whole-slide image level. Consistently, the HRPN predictions demonstrated higher R² and PSNR values, suggesting a greater similarity between HRPN predictions and the real QDs distribution. To further investigate the generalization ability of the HRPN model, we evaluated its performance on tumor no. 5, which was withheld from the entire training process. The MSE, RMSE, R², and PSNR values of the HRPN model in the test set were similar to those in cross-validation, indicating that the model's performance was not due to overfitting. Once again, the HRPN model outperformed the GANDA model in the test set (lower MSE and RMSE, higher R² and PSNR). The actual and predicted whole-slide QDs-channel images of tumor no. 5 are shown in Figure 3C.
To compare the fine-grained local accuracy of the different models, three regions of interest (ROIs) (960 × 864 pixels) were randomly selected from the QDs-channel image of the test tumor. Fluorescence colocalization analysis of real and predicted QDs signals was performed on these ROIs using Fiji. The actual QDs signals were colored red, the predicted signals green, and yellow indicated the overlapping area (Figure S5, Supporting Information). A linear regression analysis was carried out pixel-wise on each ROI to measure prediction accuracy. The average correlation coefficient between the actual and HRPN-predicted QDs signals was 0.92 ± 0.06, higher than the correlation between the actual and GANDA-predicted QDs signals (0.75 ± 0.18). These results demonstrated that the HRPN model could more accurately predict the local details of QDs-channel images (Figure 4).
Mode collapse is a common issue with GAN models, which can limit output variety and reduce the model's performance. Therefore, we compared the output intensity distribution of HRPN and GANDA. As shown in Figure 5A and S6, Supporting Information, the predicted intensity distribution of HRPN was closer to the real QDs intensity distribution of tumor no. 5 in both low- and high-intensity intervals. The results of the Anderson-Darling test also showed no statistically significant difference between the distribution of HRPN-predicted images and the real images (P = 0.25) (Figure S7, Supporting Information). In contrast, GANDA produced a narrower and lower distribution of QDs intensity, making it more prone to underestimating QDs intensity when the real intensity is high. The images were then segmented using different intensity percentiles of real images as thresholds, and the similarity of the segmentation was evaluated using the mean intersection over union (MIoU) between the real and predicted QDs images. The results showed that as the threshold intensity increased, the MIoU between the real and GANDA-predicted images decreased rapidly, while the MIoU between the real and HRPN-predicted images remained higher than 0.6. This observation was consistent with the Q-Q plot, indicating that HRPN produces a wider distribution of QDs intensity than GANDA (Figure 5B,C).
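The percentile-threshold comparison described above can be illustrated with a minimal sketch. It assumes flattened intensity lists, a nearest-rank percentile, and a single-class intersection over union computed on the above-threshold mask. The function names `percentile` and `iou_at` are hypothetical and are not taken from the authors' code.

```python
import math

def percentile(values, q):
    """Nearest-rank percentile (q in 0..100); an assumed convention."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def iou_at(real, pred, threshold):
    """IoU of the masks obtained by keeping pixels >= threshold."""
    tp = fn = fp = 0
    for r, p in zip(real, pred):
        r_on, p_on = r >= threshold, p >= threshold
        if r_on and p_on:
            tp += 1          # white in both masks
        elif r_on:
            fn += 1          # white only in the real mask
        elif p_on:
            fp += 1          # white only in the predicted mask
    union = tp + fn + fp
    return tp / union if union else 1.0

# Toy data: the prediction saturates at 6, mimicking a compressed
# intensity range that underestimates bright pixels.
real = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
pred = [0, 1, 2, 3, 4, 5, 6, 6, 6, 6]

# Thresholds are percentiles of the REAL image, as in the text.
iou_50 = iou_at(real, pred, percentile(real, 50))
iou_80 = iou_at(real, pred, percentile(real, 80))
```

In this toy case the overlap is perfect at the 50th percentile and drops to zero at the 80th, which is the qualitative behaviour expected from a model whose output intensity range is too narrow.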
In addition, the HRPN prediction network is a lightweight network that requires fewer computing resources to train. The total number of parameters of HRPN was about half that of the GANDA network (Table 1). Since the HRPN does not need to find the point of equilibrium between two competing networks, the time required for training was significantly reduced. On an Nvidia RTX 3080 Ti GPU, training the HRPN model took only 5 min per epoch, compared to the 25 min required by the GANDA network for the same input data.
To investigate the impact of input channels and resolution branches on HRPN prediction accuracy, the performance of HRPN variants was compared. Figure 6A shows that the HRPN trained on DAPI + AF488 channel patches exhibited the highest prediction accuracy, indicating that both tumor vessels and cell density information are critical for successful prediction. The results of cross-validation are provided in Figure S8, Supporting Information. Notably, the HRPN trained on the AF488 channel demonstrated better performance than that trained on the DAPI channel, suggesting that vessels may dominate the intratumoral distribution of QDs. Figure 6B indicates that ablating the low-resolution branches of HRPN decreases prediction accuracy, suggesting that fusing multi-resolution information representations is beneficial for reliable QDs distribution prediction.

Discussion and Conclusion
In this study, we developed a deep learning model based on the HRNet backbone to predict the intratumoral distribution of nanoprobes with high accuracy from tumor vessel and cell nucleus features. Compared to the model based on the GAN architecture, the proposed HRPN achieved less residual error, higher accuracy, and wider prediction intervals using fewer computational resources. Through ablation and comparative studies, the effectiveness of multi-resolution branch fusion and two-channel input was verified.
The promising predictive performance of HRPN suggests potential clinical applications. First, this work indicates that it is feasible to predict the intratumoral distribution of nanodrugs from histopathology slices. If trained on human tumor samples, the model could predict the tumor-targeting capability of nanodrugs based on preoperative biopsy specimens, providing evidence for personalized nanodrug selection in clinics. Second, with established models predicting distributions of various nanodrugs, in silico screening could be performed to reduce the cost of drug development.
In summary, we developed a deep learning model based on HRNet to accurately predict the intratumoral distribution of nanoprobes. We envision that the HRNet architecture will be a powerful tool for analyzing the spatial relationship between the tumor microenvironment and nanoprobes, with great potential to facilitate personalized nanomedicine and accelerate new drug discovery in the future.

Experimental Section
Data Source and Preprocessing: The data were obtained from our previous work and preprocessed in the same way. In brief, six tumors were collected from BALB/c mice inoculated with 4T1 cells. Mice were injected with 100 μL of 20 nm PEGylated CdSe/ZnS QDs through the tail vein and euthanized 24 h later, 10 times the blood clearance time. [26] The tumors were freeze-embedded, sliced, and stained with 4′,6-diamidino-2-phenylindole (DAPI) and an Alexa Fluor 488 (AF488)-tagged antibody against CD31. The whole-slide fluorescence images of tumor sections were obtained using a Nikon Digital Eclipse C1 microscope system, and the DAPI, AF488, and QD channels were processed and measured by QuPath (version 0.3.2). [32] The pixel resolution of the whole-slide images was 0.2744 × 0.2744 μm, with additional information detailed in Table S1, Supporting Information. The images of each channel were decomposed into patches (512 × 512 pixels), and the position index of each patch was recorded using Python (version 3.8).
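The patch decomposition with a recorded position index can be sketched as follows. This is a minimal illustration, not the authors' preprocessing script: it assumes images stored as lists of rows, non-overlapping tiles, and that partial tiles at the image border are discarded (the text does not say how borders were handled). The names `patch_grid` and `crop` are hypothetical.

```python
PIXEL_UM = 0.2744  # pixel size reported in the text, in micrometres

def patch_grid(height, width, patch=512):
    """Return (row, col, y0, x0) for every full patch that fits."""
    index = []
    for r in range(height // patch):
        for c in range(width // patch):
            index.append((r, c, r * patch, c * patch))
    return index

def crop(image, y0, x0, patch=512):
    """Cut one square patch out of an image stored as a list of rows."""
    return [row[x0:x0 + patch] for row in image[y0:y0 + patch]]

# Toy example: a 4 x 6 "image" cut into 2 x 2 patches.
image = [[y * 6 + x for x in range(6)] for y in range(4)]
grid = patch_grid(4, 6, patch=2)

# Keying each patch by its (row, col) index is what later allows
# predicted patches to be placed back at the right position.
patches = {(r, c): crop(image, y0, x0, patch=2) for r, c, y0, x0 in grid}

# Physical side length of one 512-pixel patch.
patch_side_um = 512 * PIXEL_UM
```

With the reported pixel size, one 512 × 512 patch covers roughly 140 μm on each side.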
Training Set, Cross-Validation, and Test Set: The six tumors were indexed from no. 0 to 5. The patches from five tumors (no. 0-4) were first used for model training and cross-validation. In detail, the models were trained using DAPI- and AF488-channel patches of four tumors as data and the corresponding QD-channel patches as targets. The patches from the remaining tumor were used as the validation set. This process was repeated five times for cross-validation to tune the hyperparameters. Then, the models were trained on the patches from all five tumors (no. 0-4, training set). The performance of the final model was estimated by predicting the QD-channel patches of tumor no. 5 from its DAPI- and AF488-channel patches. The test set (tumor no. 5) was hidden from the model tuning process to ensure an unbiased estimation of the models.
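The splitting scheme above amounts to leave-one-tumor-out cross-validation over tumors no. 0-4 with tumor no. 5 held out entirely. A minimal sketch follows; the function name `leave_one_tumor_out` is hypothetical, and splitting is done at the tumor level so that patches from one tumor never appear in both the training and validation sets.

```python
def leave_one_tumor_out(tumor_ids):
    """Build one (training tumors, validation tumor) pair per tumor."""
    folds = []
    for held_out in tumor_ids:
        train = [t for t in tumor_ids if t != held_out]
        folds.append((train, held_out))
    return folds

cv_tumors = [0, 1, 2, 3, 4]   # used for training and cross-validation
test_tumor = 5                # never seen during tuning

folds = leave_one_tumor_out(cv_tumors)
for train, val in folds:
    print(f"train on tumors {train}, validate on tumor {val}")
```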
HRPN: HRPN is a supervised learning method built on the HRNet architecture. During preprocessing, instead of applying a global intensity normalization method, HRPN was trained on raw image data. The model used four resolution branches to extract features in parallel during the training process. It performed feature fusion among different scales after each residual block to achieve complete semantic information and precise location. One or more strided convolutions (3 × 3) were used in the conversion from high to low resolution, while one or more transposed convolutions (3 × 3) were used in the conversion from low to high resolution. Two types of residual modules were used during model training: bottleneck ResBlock and basic ResBlock. [33] The residual connections allowed more convolutions and nonlinear transformations to be applied in each part of the model while preventing vanishing gradients and model degradation during training. The bottleneck ResBlock used 1 × 1 convolutions to reduce and restore the channel dimension, which reduced the number of network parameters and allowed a deeper network, making training easier. Strided convolution was used to perform downsampling. The details of the blocks are depicted in Figure S1, Supporting Information.
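To make the resolution conversions concrete, the sketch below computes feature-map sizes using the standard output-size formulas for strided and transposed convolutions. The 3 × 3 kernel comes from the text. The stride of 2, the padding of 1, the output padding of 1, and the 512-pixel size of the highest-resolution branch are assumptions for illustration only, since the text does not report them.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output size of a strided convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def tconv_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output size of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

def branch_sizes(size, branches=4):
    """Sizes of parallel branches, each derived from the previous one
    by a single strided convolution (assumed stride 2)."""
    sizes = [size]
    for _ in range(branches - 1):
        sizes.append(conv_out(sizes[-1]))
    return sizes

sizes = branch_sizes(512, branches=4)

# Fusing a low-resolution branch into the next higher one requires
# upsampling back to the matching size.
restored = tconv_out(sizes[3])
```

Under these assumptions each strided convolution halves the spatial size and each transposed convolution doubles it, so repeated application moves a feature map between any two branches.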
During training, the mini-batch size was set to 32, and the root mean square propagation (RMSprop) optimizer was employed with an initial learning rate of 1e−5, momentum of 0.9, and weight decay of 1e−8. The training process was terminated within 20 epochs. Mean square error (MSE) loss ensured the pixel-wise similarity between the generated and the real QD-channel patches

$$\mathrm{loss}_{\mathrm{patch}} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2$$

where $x_i$ and $y_i$ refer to the model-generated and target pixel values of the patches, and $n$ is the number of pixels.
GANDA: GANDA used the same architecture as in our previous report. [26] In brief, the model consisted of a generator and a discriminator. The discriminator and generator optimized their strategies to compete against each other alternately and repeatedly. The fully convolutional network (FCN)-based generator learned from the spatial information of the DAPI and AF488 channels and synthesized patches of the QD channel. The discriminator network was trained to identify whether the generated QD patches were true or false.
For GANDA, the batch size was set to 20, the number of epochs to 50, and the initial learning rate to 2e−4 using the Adam optimizer. In the loss function, $l_{\mathrm{adv}}(G)$ combines pixel loss and generator loss to ensure similarity, and $l_{\mathrm{adv}}(D)$ refers to discriminator loss. The architecture of GANDA is depicted in Figure S2, Supporting Information.
GANDA's input patches were normalized as described previously, [26] using max(I), the maximum value of the original image I, where p denotes the patches decomposed from this image. When merging generated patches p_g into a single image, the corresponding anti-normalization was applied.
Recomposition of Generated QD Patches: The trained models were used to synthesize patches of the QD channel. These patches were recomposed into a whole-slide image using the recorded indexes.
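Recomposition from recorded indexes can be sketched as below. The sketch assumes that each patch is keyed by a (row, col) grid index and that patches are square and non-overlapping. The function name `recompose` and the dictionary layout are hypothetical and are not taken from the authors' code.

```python
def recompose(patches, n_rows, n_cols, patch):
    """Place square patches back onto a blank canvas by grid index.

    patches maps (row, col) -> patch stored as a list of pixel rows.
    """
    canvas = [[0] * (n_cols * patch) for _ in range(n_rows * patch)]
    for (r, c), tile in patches.items():
        for dy, tile_row in enumerate(tile):
            y = r * patch + dy
            # Overwrite the matching horizontal span of this canvas row.
            canvas[y][c * patch:(c + 1) * patch] = tile_row
    return canvas

# Two predicted 2 x 2 patches that sit side by side in the slide.
predicted = {
    (0, 0): [[1, 2], [3, 4]],
    (0, 1): [[5, 6], [7, 8]],
}
slide = recompose(predicted, n_rows=1, n_cols=2, patch=2)
```

Because placement depends only on the stored index, patches can be predicted in any order or in batches and still be assembled correctly.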
Prediction Accuracy Metrics at the Whole-Slide Image Level: Two metrics, MSE and root mean square error (RMSE), were used to measure the pixel-wise similarity between synthesized and real QD channels of each patch. The formulas are as follows

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}}$$

where $Y_i$ and $\hat{Y}_i$ are the model-generated and target pixel values of the image. The goodness of fit of the models was measured by the coefficient of determination (R²). The image quality of the synthesized images was measured by the peak signal-to-noise ratio (PSNR)

$$\mathrm{PSNR} = 10\log_{10}\left(\frac{\mathrm{MAX}_I^2}{\mathrm{MSE}}\right)$$

where $\mathrm{MAX}_I$ is the maximum possible pixel value of the image.
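The four whole-slide metrics can be computed as in the following sketch, which operates on flattened pixel lists. Two points are assumptions rather than statements from the text: R² is taken as the usual 1 − SS_res/SS_tot with the real image as reference, and the default peak value of 255 corresponds to 8-bit images.

```python
import math

def mse(pred, real):
    """Mean square error between predicted and real pixel values."""
    return sum((p - r) ** 2 for p, r in zip(pred, real)) / len(real)

def rmse(pred, real):
    """Root mean square error."""
    return math.sqrt(mse(pred, real))

def r_squared(pred, real):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean_real = sum(real) / len(real)
    ss_res = sum((r - p) ** 2 for p, r in zip(pred, real))
    ss_tot = sum((r - mean_real) ** 2 for r in real)
    return 1.0 - ss_res / ss_tot

def psnr(pred, real, max_i=255.0):
    """Peak signal-to-noise ratio in dB; max_i=255 assumes 8-bit data."""
    return 10.0 * math.log10(max_i ** 2 / mse(pred, real))

# Toy pixel values: every prediction is off by exactly one grey level.
real = [0.0, 10.0, 20.0, 30.0]
pred = [1.0, 9.0, 21.0, 29.0]

scores = {
    "MSE": mse(pred, real),
    "RMSE": rmse(pred, real),
    "R2": r_squared(pred, real),
    "PSNR": psnr(pred, real),
}
```

Lower MSE and RMSE and higher R² and PSNR indicate a closer match, which is the direction of comparison used in the Results.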
Prediction Accuracy Metrics at the Local-Region Level: Three ROIs (960 × 864 pixels) were randomly selected from the real QD-channel images of tumor no. 5 (test set). The corresponding ROIs from the predicted QD-channel images from HRPN and GANDA were also selected. The predicted images of ROIs were merged with the real QD-channel images by Fiji to observe the colocalization. The correlation coefficient (Rtotal) between the predicted and real intensity of pixels was calculated. If the distribution prediction was ideal, Rtotal should be equal to 1.
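The pixel-wise correlation can be illustrated with a short sketch. It assumes that Rtotal corresponds to Pearson's correlation coefficient over all pixels of an ROI. It is not a reimplementation of the Fiji plugin, and the function name `pearson_r` is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long pixel lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

real_roi = [1.0, 2.0, 3.0, 4.0]

# A prediction that is a scaled copy of the real signal correlates
# perfectly, even though the absolute intensities differ.
scaled_pred = [2.0, 4.0, 6.0, 8.0]
r_scaled = pearson_r(real_roi, scaled_pred)

# A prediction with the spatial pattern reversed anti-correlates.
reversed_pred = [4.0, 3.0, 2.0, 1.0]
r_reversed = pearson_r(real_roi, reversed_pred)
```

Correlation is insensitive to a global rescaling of intensity, which is why it complements rather than replaces error-based metrics such as MSE.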
Diversity of the Model Prediction: The histograms of the predicted and real QD intensity of tumor no. 5 (test set) were calculated and compared using a Q-Q plot. A threshold was applied to the synthesized and real QD channels to create masks, where pixels with intensity higher than or equal to the threshold were set to white and the others were set to black. Different models were compared at different percentiles of the true distribution. The intersection of masks between synthesized and real QD-channel images of tumor no. 5 was measured by mean intersection over union (MIoU)

$$\mathrm{MIoU} = \frac{TP}{TP + FN + FP}$$

where TP is the number of pixels that are white in the masks of both synthesized and real images, FN is the number of pixels that are white only in the mask of real images, and FP is the number of pixels that are white only in the mask of synthesized images.
Ablation Studies: Source Channel: Ablated HRPN models were trained on patches of the DAPI channel or the AF488 channel only (tumors no. 0-4). The performance of these models was compared with the full HRNet model (trained on patches of both DAPI and AF488 channels).
Resolution Branches: The predictive value of different resolution branches was also verified by ablation experiments. HRPN models containing one and two resolution branches (1× and 2× HRNet) were constructed and compared with the full HRNet model (4×). Model training and optimization were performed on an Nvidia RTX 3080 Ti GPU.

Figure 1. Using the high-resolution prediction network (HRPN) to predict the intratumoral distribution of nanoprobes. A) The workflow of the training and prediction process. The distributions of tumor cell nuclei, vessels, and quantum dots (QDs) are represented by blue, green, and red fluorescence in whole-slide fluorescence imaging. Mean square error (MSE) was used to measure the similarity between the predicted and real QD distribution. B) The architecture of HRPN.

Figure 3. Comparison of model prediction accuracy. A) Test and cross-validation schemes. B) Performance of different models on validation and test sets (MSE, root mean square error [RMSE], R², peak signal-to-noise ratio [PSNR]). Predicted and real images were compared on the whole-slide view. Each point in the box plot indicates the result of one fold in the fivefold cross-validation. C) Model-predicted and real QDs distribution images of the test tumor.

Figure 4. Colocalization analysis of fluorescence images. The size of ROIs was 960 × 864 pixels. The scale bar represents 30 μm. On the left is the overlay of real and predicted images. Red is the real distribution of QDs, green is the predicted distribution, and yellow is the accurately predicted area. The scatter plot on the right shows the correlation between the real and predicted QDs distribution images. The X axis denotes real image intensity, while the Y axis shows generated image intensity. Rtotal was calculated by Fiji.

Figure 5. Comparison of model prediction diversity. A) Histograms of the intensity distribution of real and model-predicted images. B) Q-Q plot fitted based on the intensity percentiles of real and predicted images. C) Mean intersection over union curves under different thresholds. The threshold values were the intensity percentiles of the real image.

Figure 6. Ablation studies of input channels and resolution branches. A) Results of varying the input channels of HRPN. DAPI indicates only DAPI-channel patches, AF488 indicates only AF488-channel patches, and DAPI + AF488 indicates that patches of both channels were input to the model. B) Results of varying the resolution branches of HRPN fusion. 1, 2, and 4 correspond to models fusing different numbers of resolution branches. The provided results were observed on the test tumor.

Table 1. Parameters of different models.