HRMAn 2.0: Next‐generation artificial intelligence–driven analysis for broad host–pathogen interactions

To study the dynamics of infection processes, it is common to manually enumerate imaging‐based infection assays. However, manual counting of events from imaging data is biased, error‐prone and laborious. We recently presented HRMAn (Host Response to Microbe Analysis), an automated image analysis program using state‐of‐the‐art machine learning and artificial intelligence algorithms to analyse pathogen growth and host defence behaviour. With HRMAn, we can quantify intracellular infection by pathogens such as Toxoplasma gondii and Salmonella in a variety of cell types in an unbiased and highly reproducible manner, measuring multiple parameters including pathogen growth, pathogen killing and activation of host cell defences. Since HRMAn is based on the KNIME Analytics platform, it can easily be adapted to work with other pathogens and produce more readouts from quantitative imaging data. Here we showcase the improvements to HRMAn that resulted in the release of HRMAn 2.0, present new applications of HRMAn 2.0 for the analysis of host–pathogen interactions using the established pathogen T. gondii, and further extend it for use with the bacterial pathogen Chlamydia trachomatis and the fungal pathogen Cryptococcus neoformans.


| INTRODUCTION
Pathogen infection of cells can be analysed by fluorescence microscopy and relies on accurate quantification of observed phenotypes to reveal magnitudes of host and pathogen parameters and the kinetics of their interaction. Manual scoring of infection processes from microscopy images is laborious, biased and prone to errors. Often, it restricts the number of samples and replicates that are included in an experiment (Meijering, Carpenter, Peng, Hamprecht, & Olivo-Marin, 2016).
High-throughput image acquisition with automated high-content imaging platforms opens the possibility of studying host-pathogen interactions on a large scale (Pegoraro & Misteli, 2017), for instance in combination with genome-wide depletion screens (Brodin & Christophe, 2011; Usaj et al., 2016). However, our ability to produce huge imaging datasets is curtailed by our ability to analyse them efficiently and accurately (Meijering et al., 2016).
Innovation in automated image analysis has relied on either open-source (Carpenter et al., 2006; Smith et al., 2018; Stöter et al., 2013) or proprietary (e.g., Perkin Elmer Harmony) software for analysis. Typically, images are analysed by classical fixed-parameter image segmentation algorithms (Kühbacher et al., 2015; Matula et al., 2009; Osaka et al., 2012). However, data generated by these classical approaches are usually restricted to quantifying pathogen growth at the single-cell level. Extracting information beyond this uppermost layer of host-pathogen interactions, for example, analysing the redistribution of proteins upon infection, is difficult due to the inherent heterogeneity of imaging datasets and cellular responses. Furthermore, classic image segmentation approaches are dataset-specific and require manually altering the segmentation parameters (i.e., updating the code/parameters of the program) to produce reliable data if, for instance, the cell type or the nature and/or intensity of stainings change between experiments.

(Daniel Fisch and Robert Evans contributed equally to this work.)
To overcome these limitations and enable infection researchers to quantify their imaging data without the need for coding, we created HRMAn (Host Response to Microbe Analysis) (Fisch et al., 2019).
HRMAn is a high-throughput, high-content, single-cell image analysis pipeline that incorporates machine learning (ML) and an ensemble of deep convolutional neural networks (CNNs) for infection analysis (www.hrman.org). To assure its broad usability and future software support, HRMAn is based on the data handling environment KNIME Analytics platform (Berthold et al., 2008). The analysis relies on training of ML algorithms and deep CNNs that can be tailored to individual researchers' needs and experimental questions. The trained CNNs contained within the analysis pipeline are used for image classification, phenotype quantification and for instance segmentation, a hybrid of semantic segmentation and object detection. CNNs work with the image itself and make use of complex patterns (e.g., shapes) within the dataset to learn phenotypes, which they derive in a supervised fashion from expert-labelled data (Krizhevsky, Sutskever, & Hinton, 2012). Deep CNNs consist of several layers, mimicking the cortex of a brain. These can comprise convolution, normalisation, pooling and fully connected layers (Nielsen, 2015), which convolve features, normalise for local contrast enhancement (Krizhevsky et al., 2012) or downsample feature maps to increase the sensitivity of the network. Combining several of these layers, each looking at the output maps of the previous layer, creates a deep CNN that step by step reduces the complexity of the input and the size of the tensor while extracting key features and patterns. In deep CNNs for classification, the final layers are usually fully connected layers, which produce the output (for an excellent overview, please refer to LeCun, Bengio, & Hinton, 2015). The use of AI for image analysis allows for increased flexibility and versatility of HRMAn 2.0 without requiring the user to update the code and analysis parameters for every dataset (Godinez, Hossain, Lazic, Davies, & Zhang, 2017; Kraus et al., 2017; Kraus, Ba, & Frey, 2016).
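The layer mechanics described above can be illustrated with a minimal pure-Python sketch (a toy example for intuition, not HRMAn code): a small convolution kernel extracts a feature (here, a vertical edge), and a 2×2 max-pooling step then downsamples the resulting feature map.

```python
def conv2d(img, kernel):
    """'Valid' 2D convolution (cross-correlation) of an image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(img[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(len(img[0]) - kw + 1)]
            for i in range(len(img) - kh + 1)]

def maxpool2x2(fmap):
    """Downsample a feature map by keeping the maximum of each 2x2 block."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# A 4x4 "image" whose right half is bright; a [-1, 1] kernel responds to the edge.
image = [[0, 0, 1, 1] for _ in range(4)]
feature_map = conv2d(image, [[-1, 1]])  # strongest response at the edge column
pooled = maxpool2x2(feature_map)        # smaller map, edge response preserved
```

Stacking many such convolution/pooling stages, followed by fully connected layers, yields the kind of deep classifier described in the text.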
HRMAn was designed for quantification of high-content imaging experiments and has direct compatibility with datasets from 96-well or 384-well cell culture plates. Prior to analysis, stained specimens (infected host cells) are imaged on a fluorescence microscope. Ideally, the use of automated high-throughput imaging platforms allows for rapid acquisition of images from multi-well plates, but standard fluorescence imagers with a programmable stage can also be used for image acquisition (reviewed in Fisch, Yakimovich, Clough, Mercer, & Frickel, 2020).
Depending on the type of experiment, HRMAn allows the user to choose from a range of analysis methods (Fisch et al., 2019). Simple infection analysis only assesses host cell and pathogen numbers as well as replication of the pathogens. This fast analysis provides the same quantification as would classically be obtained by manual counting, but for thousands of images in a matter of minutes rather than hours or days. Further insight into host-pathogen interactions can be gained by studying the changing spatial distribution of host and pathogen proteins, but quantifying this manually or by using classical image analysis approaches is close to impossible. HRMAn therefore relies on ML and deep CNNs to classify and quantify localization of proteins on a single-cell level. Readouts from this second stage of analysis represent one of the more advanced analysis methods offered by HRMAn (Fisch et al., 2019).
Two years ago, we presented the original HRMAn (Fisch et al., 2019). Continued development now allows us to release HRMAn 2.0, an even more powerful ensemble of ML and artificial intelligence algorithms for image analysis/quantification. In this work, we present the major improvements and additions to the original HRMAn and illustrate how HRMAn 2.0 can be used in new ways to dissect the interaction between host cells and intracellular pathogens. We also present methods of ongoing data collection which involve crowdsourcing classifications from non-expert volunteers.
Volunteer consensus data from the Zooniverse platform (https://www.zooniverse.org) will be used in conjunction with pooled training data generated by experts to create unbiased CNN training datasets for the continued development of HRMAn.

| Improved input/output
HRMAn was originally designed to reliably and automatically quantify host-pathogen interactions on a large scale. Now, HRMAn 2.0 has a more streamlined user interface that guides users through the setup and additionally performs quality control on input images before analysis commences (Figure 1a). The execution process (order, timings, memory management) of HRMAn 2.0 has been improved, leading to overall shorter analysis times and greater stability of the program.

TAKE AWAY

• HRMAn 2.0 allows host-pathogen interaction analysis from imaging experiments
• HRMAn 2.0 extends the analysis into 3D space
• HRMAn 2.0 can be adapted for analysis of any intracellular pathogen
• HRMAn 2.0 uses AI for focus detection, segmentation and phenotype quantification
Lastly, many parameter-extraction and parameter-adjustment processes have been automated, and user input/output has been simplified and made more graphical, for example, by including a user interface for defining the assay layout (Figure 1b). Users therefore now only need to direct the program to their image input directory and select their analysis method and pathogen (Figure 1c); the rest will be managed automatically. For an overview of all changes and improvements, see Table 1.
In summary, the updated input/output system of HRMAn 2.0 makes it easier for the user to follow the analysis and assess whether the program is working accurately. The updates also improve performance and facilitate ease-of-use while maintaining accuracy and unbiased analysis capabilities. Since image analysis is computationally expensive, please refer to Table 2 for minimal and recommended system requirements to run HRMAn 2.0 with and without GPU acceleration and for an overview of expected analysis times for datasets of different sizes.

| New image pre-processing
Image data pre-processing is important for any kind of large-scale imaging-based experiment (Bray, Fraser, Hasaka, & Carpenter, 2012).
The original HRMAn already performed single-channel illumination corrections, and this step has not been changed as it was performing well (Fisch et al., 2019). Briefly, HRMAn 2.0 performs channel-wise, retrospective illumination correction by creating a smooth, object-free background image using a low-pass filter with a large (Gaussian) kernel and subtracting this as background. Additionally, HRMAn 2.0 now pre-screens images for contamination/imaging artefacts and for out-of-focus images. Such images need to be removed from the analysis since, in the worst case, they can affect the quality of an entire experiment/screen.
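The retrospective correction described above can be sketched in pure Python (a simplified, deliberately slow illustration, not HRMAn's KNIME implementation): a large-kernel separable Gaussian blur estimates the smooth illumination background, which is then subtracted per channel.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel with radius ~3 sigma."""
    radius = int(3 * sigma)
    k = [math.exp(-x * x / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k], radius

def blur_rows(img, kernel, radius):
    """Convolve each row with the kernel, clamping indices at the borders."""
    w = len(img[0])
    return [[sum(wgt * row[min(max(j + t - radius, 0), w - 1)]
                 for t, wgt in enumerate(kernel)) for j in range(w)]
            for row in img]

def illumination_correct(img, sigma=50.0):
    """Estimate an object-free background with a large Gaussian low-pass filter
    (separable: blur rows, then columns) and subtract it from the image."""
    kernel, radius = gaussian_kernel(sigma)
    bg = blur_rows(img, kernel, radius)                       # blur along x
    bg = list(map(list, zip(*blur_rows(list(map(list, zip(*bg))), kernel, radius))))  # along y
    return [[max(p - b, 0.0) for p, b in zip(pr, br)] for pr, br in zip(img, bg)]
```

On a perfectly even image the estimated background equals the image and the corrected result is (numerically) zero everywhere; on real data, objects much smaller than the kernel survive the subtraction while slow illumination gradients are removed.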
To exclude such images, we implemented the percent-maximal metric suggested by Bray et al. (2012) into HRMAn 2.0 to flag images with saturation artefacts. Furthermore, we added a two-tier focus quality detection strategy to HRMAn 2.0. In tier one, HRMAn 2.0 uses an artificial intelligence approach, as proposed and spearheaded by Yang et al. (2018), to judge image quality. For tier-one analysis, we trained a deep neural network that works on 300×300 px tiles of the input image of the cell nuclei and bins them into classes between 0 (in-focus) and 10 (out-of-focus) (Figure 2a). Finally, for each individual image, the overall focus class is calculated as the mean over the respective image tiles (Figure 2a). We trained the neural network with more than 500,000 images that were either in focus or artificially defocused as described by Yang et al. (2018) and furthermore injected with Poisson noise to allow for training of a more generalised model (Figure 2b). The final model obtained after training was >95% accurate in classifying previously unseen images (Figure 2b). The CNN-based quality assessment is complemented in tier two by calculation of the power log-log slope (PLLS), which measures the slope of the power spectrum density of intensities within an image (Bray et al., 2012). The combination of two independent focus detection strategies now allows HRMAn to precisely pre-filter images prior to analysis (Bray et al., 2012; Groen, Young, & Ligthart, 1985; Sun, Duthaler, & Nelson, 2004; Yang et al., 2018). To do so, we have pre-configured HRMAn 2.0 to select images with a focus that is deemed acceptable to produce reliable results in the downstream analysis steps. These thresholds can, however, be changed by the user if more or less stringent filtering is required.
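The two pre-screening steps can be sketched as follows (pure-Python illustration; `classify_tile` stands in for the trained focus CNN and is supplied by the caller, and any cut-offs a user applies to these values are illustrative, not HRMAn's defaults).

```python
def percent_maximal(img, max_val=255):
    """Percentage of pixels at the detector maximum, flagging saturation
    artefacts (the percent-maximal metric of Bray et al., 2012)."""
    pixels = [p for row in img for p in row]
    return 100.0 * sum(p >= max_val for p in pixels) / len(pixels)

def image_focus_class(img, classify_tile, tile=300):
    """Tile the image, classify each tile from 0 (in focus) to 10 (out of
    focus) with the supplied model, and return the mean over all full tiles."""
    classes = [classify_tile([row[j:j + tile] for row in img[i:i + tile]])
               for i in range(0, len(img) - tile + 1, tile)
               for j in range(0, len(img[0]) - tile + 1, tile)]
    return sum(classes) / len(classes)
```

An image would be rejected when `percent_maximal` exceeds a saturation threshold or the mean focus class falls outside the acceptable range.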
Indeed, combining the two tiers into one focus-quality assessment strategy allowed us to accurately filter images in a larger experiment that used images from 360 positions, each with 16 different focus positions (Figure 2c). Since HRMAn 2.0 was also set up to perform 3D analysis, the two-tiered focus determination method allows for selection of the most in-focus planes in a series of z-stacks (Figure 2d). We illustrated this in an experiment in which a multi-well plate was deliberately mounted in a high-content imager at a slight tilt. Here, HRMAn 2.0, depending on the analysis type, was able to reject individual fields, as would be the case in a 2D experiment/analysis, or correctly pick the most in-focus images in the z-stack series when run in 3D analysis mode (Figure 2e). We designed HRMAn 2.0 to automatically connect to a remote CNN model repository to obtain the trained focus-detection CNN without the user having to provide it manually. Furthermore, HRMAn 2.0 reports which fields have not passed the quality control (and have therefore been excluded) and appends image-quality notes to the results file, so that users can inspect the subsets of their dataset that were flagged and excluded from the analysis.
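The plane-selection logic for z-stacks can be summarised in a few lines (a sketch; the rejection threshold is illustrative, not HRMAn's pre-configured value).

```python
def best_focus_plane(plane_focus_classes, reject_above=5.0):
    """Pick the index of the sharpest plane in a z-stack (lowest mean focus
    class, 0 = in focus .. 10 = out of focus). Return None if even the best
    plane fails the threshold, i.e. the whole field should be rejected, as
    would happen in a 2D analysis of an out-of-focus field."""
    best = min(range(len(plane_focus_classes)), key=plane_focus_classes.__getitem__)
    return best if plane_focus_classes[best] <= reject_above else None
```

In 3D mode the returned index selects the plane(s) to keep; in 2D mode a `None` result flags the field for exclusion.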

| Improved object detection by instance segmentation using trained CNNs
The original version of HRMAn relied on classical, thresholding-based segmentation for object detection (e.g., cells, nuclei and pathogens) before performing quantification of the images and the host-pathogen interaction, as well as host protein-to-pathogen recruitment analysis using a CNN. While this was reliable for many different imaging datasets, we added options to perform object detection using artificial intelligence in HRMAn 2.0, greatly increasing the program's versatility (Caicedo et al., 2019). To do so, we have implemented an adapted version of StarDist for nuclei segmentation (Schmidt, Weigert, Broaddus, & Myers, 2018) and a full version of Cellpose for (label-free) cell segmentation (Figure S1) (Stringer, Wang, Michaelos, & Pachitariu, 2021).

FIGURE 1 The HRMAn 2.0 graphical user interface and analysis capabilities. The graphical user interface of HRMAn 2.0 (a) has been streamlined to better guide the user through the analysis steps. Furthermore, entering the parameters of an analysis has been simplified by the addition of interactive menus, including list selections of analysis types and a graphical representation of multi-well plates to define the assay layout (b). HRMAn 2.0 has different analysis methods, and the table in (c) provides an overview of settings, calculated readouts and pathogens for which the analysis has already been validated.
In brief, StarDist combines a CNN with non-maximum suppression (NMS) to segment nuclei from fluorescence images (Schmidt et al., 2018). Unfortunately, we were not able to perform NMS within KNIME while maintaining ease of use. However, we set up HRMAn 2.0 to use the probability maps created by the StarDist CNN to enhance nuclei detection (Figure S1a). With this approach, we could greatly improve segmentation by preventing over-segmentation (as is sometimes the case with watershed segmentation), achieve better separation of overlapping or touching nuclei, suppress staining artefacts and correct for uneven fluorescence. In very difficult-to-segment images we can still observe segmentation artefacts, but in these cases we would recommend optimising the experimental and imaging conditions (e.g., cell densities). Again, HRMAn 2.0 can retrieve the trained CNN from a central repository autonomously, without the user having to manually load the file.
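A minimal sketch of how a probability map can enhance nuclei detection (simple thresholding plus connected-component labelling; this is a stand-in illustration, since, as noted above, StarDist's full NMS step could not be reproduced within KNIME):

```python
from collections import deque

def label_probability_map(prob, threshold=0.5):
    """Threshold a StarDist-style per-pixel probability map and label
    4-connected components; returns (label image, number of nuclei found)."""
    h, w = len(prob), len(prob[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for i in range(h):
        for j in range(w):
            if prob[i][j] >= threshold and labels[i][j] == 0:
                count += 1                       # new nucleus found
                queue = deque([(i, j)])
                labels[i][j] = count
                while queue:                     # flood-fill the component
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w and labels[ny][nx] == 0
                                and prob[ny][nx] >= threshold):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Because the CNN's probability map already suppresses staining artefacts and uneven background, even this simple downstream labelling separates nuclei far more reliably than thresholding the raw fluorescence image.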
Similar to StarDist, Cellpose is a generalised cell segmentation method relying on a trained CNN (Stringer et al., 2021). We fully implemented Cellpose within HRMAn 2.0, and it can be used to segment cells of any kind and/or fluorescence stain (Figure S1b). As a generalised model, Cellpose was also able to accurately detect label-free cells in brightfield images (Figure S1b). We found that this segmentation was versatile enough to segment brightfield images of the yeast Cryptococcus neoformans, not just the mammalian cells for which the algorithm is mainly used (Figure S1b). Impressively, by using Cellpose, HRMAn 2.0 accurately separated densely packed cells, like a confluent monolayer of human foreskin fibroblasts, a common model host cell line in the field of host-Toxoplasma interaction (Figure S1b).

TABLE 1 Improvements of HRMAn 2.0 as compared to the previous version (HRMAn 1.0 → upgrade in HRMAn 2.0)

Manual input of image information to set up analysis parameters → Image information is detected automatically; information that cannot be detected (magnification, pixel size, number and order of channels, pathogen type, analysis type and segmentation methods) is entered by the user via pop-up dialogue boxes
Assay layout uploaded as a counterintuitive spreadsheet file → Clear graphical user interface that allows users to input assay layouts (96-well and 384-well plates) with a better overview; additionally, users can create customisable layouts for images from coverslips
No bulk input option for large-scale analysis → Bulk upload option from template to circumvent manual parameter input for faster, large-scale analysis
Simple inspection of segmentations as quality control → Streamlined layout of the analysis pipeline to encourage quality inspection by the user and better display of segmentations in an interactive view showing overlays of original data and detected objects
No memory management, causing instability on less powerful computer systems → Chunking of data and memory/temporary-file management for increased stability of the analysis pipeline
No updates on progress of the analysis → Visual messages in the KNIME console inform users of the progress of the analysis; acoustic signals when analysis steps are done
Manual upload of a reference dataset for ML prediction of pathogen replication → Automatic choice of reference dataset for ML prediction of pathogen replication
Images to be named in a plate format, required pre-formatting with a separate workflow → All-in-one pipeline that automatically arranges images and requires no pre-formatting of images
No support for 3D datasets → Automated z-projections if 2D analysis is to be performed; option for full 3D analysis
Manual removal of corrupted files prior to executing the analysis pipeline → Automated removal of corrupted images, replacement by empty fields of view using the same data structure
Unchanged input data folder → Automated archiving of input images for long-term storage of raw data following the analysis
Manual creation of several empty output files to save data following the analysis → Automated creation of a single spreadsheet output file which contains all calculated results
Storage of only the grouped final results → Storage of all calculated results (assay layout, analysis parameters, data quality, data for each individual cell and pathogen, grouped data for each field, well/coverslip and the sample groups overall)
No information on quality of ML/AI performance → Report of confidence values for prediction of pathogen replication and for protein recruitment analysis with CNN (allows judging of performance by the user)
No reporting of label IDs → Reporting of unique label IDs that allows tracing each label back to the raw data
No data inspection capability within HRMAn → Interactive data dashboard for fast inspection of key data and statistics
Manual upload of trained CNN models → Automated download of latest models from central repository
No news on new updates → Automated messages inform users about availability of newer versions
Simple CUDA GPU acceleration → Enhanced GPU acceleration for maximum performance and stability when using GPU for calculations
Since running Cellpose is computationally expensive, we recommend using GPU acceleration.

For both the StarDist-enhanced nuclei segmentation and the Cellpose-driven cell segmentation, we configured HRMAn 2.0 to let the user choose between the classical (faster) algorithms and these more sophisticated methods, depending on their requirements and the capabilities of their computer. In summary, HRMAn 2.0 now offers a full ensemble of state-of-the-art CNN-based instance segmentation methods. These can be used without the user having to write a single line of code.

| 3D analysis
Given the improved segmentation of cells using Cellpose and the option to run it in 3D for z-stacks (Stringer et al., 2021), we were now able to add 3D analysis capability to HRMAn 2.0 (Figure S2). This allows for analysis of imaging screens that use 3D z-stacks for each position. Cellpose is used to detect cells and their connecting labels in all three dimensions and, at the same time, pathogen segmentation has been updated to allow for detection of corresponding labels in a 3D stack (Figure S2a). For this type of analysis, users of HRMAn 2.0 need to ensure that their imaging setup, especially the z-step size, matches the capabilities of their imaging system and the specificities of the fluorescent stains. In this way, we could use HRMAn 2.0 to measure pathogen vacuole volumes instead of areas and thereby improve the sensitivity of pathogen replication and growth quantification. Similarly, we designed HRMAn 2.0 to classify protein recruitment to pathogens independently for each z-plane in which the pathogen vacuole was detected, which improved the specificity of the recruitment classification (Figure S2a). To illustrate this new capability of HRMAn 2.0, we used the program to segment a confluent epithelium of A549 cells, which yielded impressive results and shows how HRMAn 2.0 might be used in the future (Figure S2b).
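The 3D idea can be sketched with hypothetical helper code (an illustration of the principle, not HRMAn's KNIME implementation): per-plane 2D masks are linked into 3D objects by pixel overlap between adjacent planes, and volume is then estimated from the summed areas and the z-step.

```python
def link_planes(masks_per_plane):
    """Greedily link per-plane 2D masks (sets of (row, col) pixels) into 3D
    objects: a mask joins the object of any overlapping mask one plane below,
    otherwise it starts a new object."""
    objects = []                      # each object: list of (z, mask)
    previous = []                     # (object_index, mask) pairs from plane z - 1
    for z, plane in enumerate(masks_per_plane):
        current = []
        for mask in plane:
            idx = next((k for k, pm in previous if pm & mask), None)
            if idx is None:           # no overlap below: start a new 3D object
                objects.append([])
                idx = len(objects) - 1
            objects[idx].append((z, mask))
            current.append((idx, mask))
        previous = current
    return objects

def object_volume(obj, pixel_area_um2, z_step_um):
    """Volume estimate: summed per-plane pixel areas times the z-spacing."""
    return sum(len(mask) for _, mask in obj) * pixel_area_um2 * z_step_um
```

This is why the z-step size matters: too coarse a spacing under- or over-links objects between planes and degrades the volume estimate.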

| Improved artificial intelligence recruitment classification
The original HRMAn's greatest innovation was the use of deep CNNs for classification of protein recruitment to pathogen vacuoles (Fisch et al., 2019). The original program used the DL4J deep learning framework, which worked reliably but is now slightly outdated. We therefore replaced the DL4J framework with Keras (Chollet, 2015) using a TensorFlow backend (Abadi et al., 2016), which should future-proof the program. Since HRMAn 2.0 reports classification confidence to the user, analysis can be repeated with a different model, should the user choose to. We also managed to train a generalised recruitment classification model using four different pathogens (Toxoplasma gondii, Salmonella typhimurium, Chlamydia trachomatis, Cryptococcus neoformans), more than 10 different fluorescent stains and images from five different automated and non-automated confocal/widefield microscopes, which achieved an overall accuracy of 95.80% (Figure S4c). We furthermore added a second independent workflow to the HRMAn 2.0 analysis suite (Figure 3e). This workflow allows users to create their own annotated datasets and use them for training of custom CNNs for their fluorescent stains/proteins recruiting to pathogens inside any host cell (Figure 3e).

TABLE 2 Minimal and recommended system requirements for HRMAn 2.0

Minimal:
• MacOS 10.12.6/Windows 7 (or newer)
• Quad-core CPU >2.0 GHz (e.g., Intel® Core™ i7-7700 or AMD Ryzen™ 3 2200)
• 8 Gb RAM (e.g., 1600 MHz DDR3)
• 2 Gb of hard drive storage for KNIME + HRMAn 2.0 and additional storage of about 5 times the size of the image dataset (e.g., 50 Gb for 10 Gb of imaging raw data)

Recommended:
• MacOS 10.14.6/Windows 10
• Multicore CPU >4.0 GHz (e.g., Intel® Core™ i9-10900 or AMD Ryzen™ 9 5950X)
• 32 Gb RAM (e.g., 3200 MHz DDR5)
• 2 Gb of high-speed PCIe 4.0 SSD storage for KNIME + HRMAn 2.0 and additional storage of about 5 times the size of the image dataset (e.g., 50 Gb for 10 Gb of imaging raw data)
FIGURE 3 (legend fragment) This all-in-one workflow starts by providing regular fluorescence images, which will then be processed as in (…).

FIGURE 2 (legend fragment) (c) Graphs illustrating performance of the trained CNN and calculation of the power log-log slope (PLLS) of the pixel intensity spectrum to classify focus quality. n = 360 images for each focus position (27 positions ranging from −13 to +13 μm). (d) Graphical representation of HRMAn 2.0's focus determination process for images from a z-stack. Two example images, one in-focus (magenta) and one out-of-focus (yellow), are highlighted to illustrate the CNN-based focus classification and the determination of the PLLS used to reach the final judgement of focus quality. The AI-based focus-quality judgement (middle) shows the individual classes of image tiles as determined by the trained neural network. The graph on the right shows the calculation of the PLLS for the two example images from the z-stack. Scale bar: 50 μm. (e) Graphical illustration of focus determination for a whole multi-well plate. Arrow indicates well/field order during acquisition. HRMAn 2.0 judges focus quality for the whole well and for each image individually (here from z-stacks), as indicated using the colour scale on the plate and the heatmap. Depending on the following analysis type, images are either rejected from a 2D analysis, if they are out-of-focus, or the most in-focus planes are selected for a 3D analysis.

| New example analysis applications
HRMAn was able to accurately quantify host-pathogen interactions on a high-throughput scale and computed more than 15 comprehensive readouts. We have illustrated this before (Fisch et al., 2019), and HRMAn 2.0 produces the same readouts (Figure 1c), although faster and even more precisely. We therefore want to showcase more possible applications of HRMAn 2.0 and also demonstrate its usability for pathogens other than T. gondii or Salmonella.
HUVECs are known to kill Tg by directly acidifying the vacuoles (Clough et al., 2016). HRMAn 2.0 was therefore deployed to classify vacuoles based on LysoTracker signal (Figure 4a). Another pathogen commonly infecting macrophages is C. neoformans (Srikanta, Santiago-Tirado, & Doering, 2014). This fungus grows as a unicellular yeast and replicates within cells by budding (May, Stone, Wiesner, Bicanic, & Nielsen, 2016; Rudman, Evans, & Johnston, 2019). A number of GFP-tagged wildtype (Bielska et al., 2018; Voelz, Johnston, Rutherford, & May, 2010) and virulence-factor-knockout Cryptococcus strains are available, which makes high-content imaging possible. Alternatively, fungi can also readily be stained with the fluorescent dye calcofluor white. Retraining HRMAn to recognise intracellular Cryptococcus and using the decision tree ML algorithm to classify budding, and thus replicating, fungi (which appeared distinctly larger and less circular) revealed that IFNγ-treated human THP-1 cells were able to restrict the growth of this pathogen (Figure 4e).
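The size/circularity criterion described for budding Cryptococcus can be caricatured as a two-feature decision rule (cut-off values here are purely illustrative; HRMAn's decision tree is trained on data, not hand-set):

```python
import math

def circularity(area, perimeter):
    """4*pi*area / perimeter^2: exactly 1.0 for a perfect circle, lower for
    the elongated outline of a budding yeast cell."""
    return 4.0 * math.pi * area / (perimeter * perimeter)

def looks_budding(area, perimeter, min_area=60.0, max_circularity=0.8):
    """Budding (replicating) fungi appear distinctly larger and less circular
    than single yeast cells; both conditions must hold."""
    return area > min_area and circularity(area, perimeter) < max_circularity
```

A trained decision tree effectively learns such thresholds (and their combinations) from labelled examples rather than having them specified by the user.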
We thus demonstrated that HRMAn 2.0, with the exception of viruses, can work with any kind of intracellular pathogen, whether bacterial, protozoan or fungal. Importantly, re-training HRMAn took only little time, yet resulted in a robust analysis pipeline that can be used for many future experiments. All pre-set filters for the different pathogens are available to users of HRMAn 2.0.

| DISCUSSION
Advances in computational hardware and software developments have made deep CNNs a powerful image analysis tool (LeCun, Bottou, Bengio, & Haffner, 1998;Russakovsky et al., 2015). CNNs are able to generalise patterns independent of minor phenotypic differences and allow for a more robust classification of images or parts thereof (LeCun et al., 2015). Automated image analysis programs, some of which incorporate machine learning elements, have been developed and are successfully used for classical image segmentation (Osaka et al., 2012), but when presented with the problem of classifying host protein recruitment to a pathogen, inaccurate classical image segmentation could lead to erroneous results (Pärnamaa & Parts, 2017).
HRMAn 2.0 circumvents these problems and delivers user-defined automated and unbiased enumeration.
In recent years, many programs have been developed that make use of computer vision advances to drive scientific progress in basic research (Eulenberg et al., 2017; Pärnamaa & Parts, 2017) and in clinical applications (Cireşan, Giusti, Gambardella, & Schmidhuber, 2013; Esteva et al., 2017; Litjens et al., 2017; Roth et al., 2018). For microscopy image analysis, these are usually focused on one step, for example, image reconstruction from super-resolution imaging (Ouyang, Aristov, Lelek, Hao, & Zimmer, 2018; Weigert et al., 2018), segmentation of nuclei or cells (Ronneberger, Fischer, & Brox, 2015; Schmidt et al., 2018; Stringer et al., 2021) or classification of image parts (Falk et al., 2019). While we did not invent novel ways of analysing images with CNNs, HRMAn 2.0 delivers a unique ensemble of pre-trained networks, combining the power of these individual solutions. Following the initial publication of HRMAn and its application to host-Toxoplasma interaction, similar approaches have been developed for quantification of host-Plasmodium interaction (Davidson et al., 2021; Hung et al., 2020). With the release of HRMAn 2.0, we now deliver a program with broad applicability to host-pathogen interactions in general. We designed HRMAn 2.0 with a focus on intracellular pathogens and the interaction with their host cell. Given HRMAn's flexibility, extracellular pathogens can be analysed prior to entry into the host cell, too. Experimental setup and appropriate readouts can be tailored to pathogens outside host cells. For instance, staining of extracellular pathogens prior to permeabilisation of imaging specimens could be used to assess the invasion rate of the pathogen.
However, we need to point out that although HRMAn 2.0 is a versatile program, it is focused on host-pathogen interaction analysis and is therefore not the "swiss army knife" of general image analysis or of high-throughput imaging, where programs like ImageJ/FIJI (Schindelin et al., 2012) and CellProfiler (Carpenter et al., 2006) have their respective strengths.
The combination of automated image segmentation, decision tree ML and another deep CNN for quantification makes HRMAn 2.0 a powerful and user-friendly program for analysis of host-pathogen interaction at the single-cell level. HRMAn 2.0 is capable of detecting and quantifying multiple pathogen and host parameters, as illustrated with several pathogens of varying sizes and growth morphologies.
Designed for biologists, HRMAn 2.0 requires no coding or specialised computer science knowledge. The modular architecture and graphical representation of the analysis pipeline, provided by the use of KNIME Analytics platform (Berthold et al., 2008), allows users to tailor experimental outputs to their own datasets and questions. Thus, HRMAn 2.0 can be rapidly applied to many similar large-scale, imaging experiments. Similarly, HRMAn 2.0 can also be used to answer questions that do not directly derive from host-pathogen interactions, but from the pathogen's biology itself. As mentioned above, with elegant staining strategies and experiment design, HRMAn 2.0's analysis capabilities can enable users to assess invasion rates and extracellular behaviour of pathogens. Furthermore, using higher-resolution imaging, HRMAn 2.0 could be used to quantify the morphology of pathogens within vacuoles or for example chronic forms of pathogens such as the Toxoplasma bradyzoite cyst, which are planned for the next updates of HRMAn. As such, HRMAn 2.0 will allow a broad range of researchers to extend into the realm of high-throughput single-cell analysis of host-pathogen interaction.
HRMAn 2.0 includes performance improvements and provides users with even more precise image analysis tools. One major new improvement was the extension of HRMAn 2.0 from a simple 2D, or z-projected, analysis into three-dimensional space. Measuring volumes instead of areas is especially useful for quantification of pathogen growth and the prediction of replication using decision tree ML. High-throughput 3D analysis therefore promises to reveal even more subtle phenotypes that would have been missed by manual enumeration of microscopy slides. Similarly, HRMAn will also be updated to work with time-resolved image sequences from live-imaging experiments in the future. Tracking intracellular pathogens and the respective host cells over time would enable grouping cells and/or pathogens into subsets based on their fate, that is, growth, persistence or killing (Fazeli et al., 2020). Making use of the many excellent algorithms for tracking of motile and immotile cells, for example, TrackMate (Tinevez et al., 2017), will be useful for this. The culmination of these two analysis types would be time-resolved 3D image analysis, which at present is restricted mainly by computational limitations and dataset sizes.
Other improvements of HRMAn 2.0 were derived from the rapid development of deep learning software. HRMAn originally relied on the DeepLearning4J library, but newer libraries like Keras and TensorFlow (Abadi et al., 2016; Chollet, 2015) now deliver better deep CNNs that train faster and classify images with ever-increasing accuracy (Nichols, Herbert Chan, & Baker, 2019). HRMAn 2.0, at this time, is the only program for the analysis of host-pathogen interactions that makes use of powerful trained CNNs in every step of the analysis (image quality assessment, object detection, image classification and phenotype quantification). While CNN-based algorithms have been created as solutions to each of these problems individually (Krizhevsky et al., 2012; Pärnamaa & Parts, 2017; Schmidt et al., 2018; Stringer et al., 2021; Yang et al., 2018), HRMAn 2.0 combines them into a single turnkey analysis pipeline. Alongside classical image analysis stemming from signal theory, a custom CNN is provided for focus quality control. Another application of deep CNNs within HRMAn 2.0 is instance segmentation, a computer vision task that involves predicting object instances and their per-pixel masks.
Both StarDist and Cellpose now allow high-precision detection of nuclei and host cells, respectively (Schmidt et al., 2018; Stringer et al., 2021). Replacing the classic segmentation approach with deep CNNs made segmentation more reliable and robust. This helps HRMAn 2.0 cope with a wide array of sample preparations and imaging protocols without requiring additional user intervention.
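For contrast, the classic segmentation approach that these CNNs replace can be sketched in a few lines; the threshold rule and helper below are illustrative assumptions, not HRMAn's original implementation:

```python
import numpy as np
from scipy import ndimage

def segment_nuclei_classic(img):
    """Global threshold + connected components -- the pre-CNN baseline."""
    thresh = img.mean() + 2 * img.std()          # illustrative threshold rule
    mask = ndimage.binary_opening(img > thresh)  # drop single-pixel noise
    labels, n_nuclei = ndimage.label(mask)       # one label per nucleus
    return labels, n_nuclei

# synthetic frame with two bright "nuclei" on a dark background
img = np.zeros((64, 64))
img[10:20, 10:20] = 1.0
img[40:50, 40:52] = 1.0
labels, n = segment_nuclei_classic(img)  # n == 2
```

The weakness of this baseline is that touching or overlapping nuclei merge into a single connected component; instance segmentation networks like StarDist and Cellpose predict per-object masks and therefore separate such objects, which is what makes the CNN replacement more robust across sample preparations.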
Importantly, HRMAn 2.0, as a turnkey analysis pipeline, facilitates the use of these elegant methods by not requiring the user to adapt them to their specific dataset. Lastly, the previously implemented CNN for protein-recruitment analysis (Fisch et al., 2019) has been updated, and its classification accuracy greatly improved, using a newer CNN architecture and a fivefold expansion of the training datasets. We also established the Zooniverse project "Microbe Watch", with which we are gathering large numbers of consensus annotations to train CNNs for protein-recruitment prediction that are not biased by annotation from a single user. This annotation bias is a known problem for the training of CNNs (Pelt, 2020), and by gathering millions of annotations for thousands of images, the next CNNs for HRMAn 2.0 should exhibit little or no bias.

| C. trachomatis
LGV-L2 was propagated in Vero monkey kidney cells.
All cells were regularly tested for mycoplasma contamination, cultured without addition of antibiotics and grown at 37°C in a 5% CO₂ atmosphere. Cells were stimulated for 16 hr prior to infection in complete medium at 37°C with the addition of 50 IU/ml human IFNγ (285-IF, R&D Systems).

| Toxoplasma infection
Parasites were passaged the day before infection. Tachyzoites were harvested from HFFs by scraping and syringe lysis through a 25 G needle. The obtained suspension was cleared by centrifugation at 50g for 5 min and the parasites pelleted by subsequent centrifugation of the supernatant at 550g for 7 min. Tg-containing pellets were washed with complete medium once and finally re-suspended in fresh medium. Viable parasites were counted with trypan blue and used for infection at a multiplicity of infection (MOI) of 1. Infection was synchronised by centrifugation at 500g for 5 min. Two hours after infection, extracellular Tg were removed by washing with PBS three times.
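The inoculum arithmetic behind an MOI-based infection can be sketched as follows; the helper name and the counted concentration are hypothetical, not part of the published protocol:

```python
def inoculum_volume_ml(n_cells, moi, parasites_per_ml):
    """Volume of parasite suspension needed to infect n_cells at a given MOI."""
    parasites_needed = n_cells * moi          # MOI = parasites per host cell
    return parasites_needed / parasites_per_ml

# e.g. 50,000 cells per well at MOI 1, suspension counted at 1e6 tachyzoites/ml
vol = inoculum_volume_ml(50_000, 1, 1_000_000)  # 0.05 ml per well
```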

| High-throughput imaging
For simple infection analysis, 50,000 THP-1 cells were seeded per well of a 96-well imaging plate, differentiated and treated as described above. HFFs were harvested by washing a confluent monolayer with PBS and subsequent lifting of the cells with 0.05% trypsin-EDTA (15400054, Gibco). Cells were centrifuged at 250g for 5 min, resuspended in fresh medium and 20,000 HFFs per well were seeded the day before IFNγ treatment. Similarly, HUVECs were harvested, and 15,000 cells per well were seeded in complete medium the day before IFNγ treatment. A549s and HeLa cells were harvested in the same way, and 8,000 cells per well were seeded the morning before IFNγ treatment. All cells were seeded on 1% (w/v) porcine gelatin (G1890, Sigma) pre-coated black-wall, clear-bottom 96-well plates (Thermo Scientific). To grow A549 cells into an epithelium-like structure, the cells were allowed to grow fully confluent for 5 days. Cells were treated and Tg-infected as described above. Following fixation with 4% methanol-free formaldehyde (28906, Thermo Scientific), specimens were permeabilised with PermQuench buffer for 30 min at room temperature. Then PermQuench buffer containing 1 μg/ml Hoechst 33342 and 2 μg/ml CellMask™ Deep Red plasma membrane stain (C10046, Invitrogen) was added, and samples were incubated at room temperature for 1 hr. After staining, the specimens were washed with PBS five times and kept in 200 μl PBS per well for imaging.
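The per-well seeding densities above translate into a master suspension as in this hypothetical helper; the 10% pipetting overage is an illustrative assumption:

```python
def seeding_mix(cells_per_well, wells, volume_per_well_ml, overage=1.1):
    """Total cells and medium volume for a plate, with pipetting overage."""
    total_cells = cells_per_well * wells * overage
    total_volume_ml = volume_per_well_ml * wells * overage
    return total_cells, total_volume_ml

# e.g. 20,000 HFFs per well in 100 ul across a full 96-well plate
cells, vol_ml = seeding_mix(20_000, 96, 0.1)
```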
For recruitment analysis, the cells were prepared as described above, but they were seeded on 1% (

| Data handling and statistics
Data were plotted using Prism 9.0.0 (GraphPad Inc.) and presented as means of experiments as indicated (with usually three technical repeats within each experiment) with error bars as SEM, unless stated otherwise. Significance of results was determined by non-parametric one-way ANOVA or unpaired t test as indicated in the figure legends.
Benjamini, Krieger and Yekutieli false-discovery rate (Q = 5%) based correction for multiple comparisons as implemented in Prism was used when making more than three comparisons.
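For illustration, the single-stage Benjamini-Hochberg step-up procedure, on which the adaptive two-stage Benjamini-Krieger-Yekutieli correction used by Prism builds, can be sketched as follows; this simpler variant is shown only to make the FDR logic concrete:

```python
def bh_reject(pvals, q=0.05):
    """Benjamini-Hochberg step-up: reject hypotheses at false-discovery rate q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices sorted by p-value
    # largest rank k (1-based) with p_(k) <= (k / m) * q
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k_max = rank
    # reject every hypothesis whose rank is at or below k_max
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

reject = bh_reject([0.01, 0.02, 0.03, 0.5], q=0.05)  # first three pass FDR control
```

The two-stage procedure additionally estimates the number of true null hypotheses from a first BH pass and uses it to sharpen the threshold in a second pass, which gives it slightly more power at the same nominal Q.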
All open-source KNIME workflows used in this work can be found at: https://github.com/HRMAn-Org/HRMAn and on the homepage hrman.org under GPLv3 open-source software license. The trained