Integrating model simulation tools and cryo-electron microscopy

The power of computer simulations, including machine-learning, has become an inseparable part of scientific analysis of biological data. This has significantly impacted the field of cryogenic electron microscopy (cryo-EM), which has grown dramatically since the “ resolution-revolution. ” Many maps are now solved at 3 – 4 Å or better resolution, although a significant proportion of maps deposited in the Electron Microscopy Data Bank are still at lower resolution, where the positions of atoms cannot be determined unambiguously. Additionally, cryo-EM maps are often characterized by a varying local resolution, partly due to conformational heterogeneity of the imaged molecule. To address such problems, many computational methods have been developed for cryo-EM map reconstruction and atomistic model building. Here, we review the development in algorithms and tools for building models in cryo-EM maps at different resolutions. We describe methods for model building, including rigid and flexible fitting of known models, model validation, small-molecule fitting, and model visualization. We provide examples of how these methods have been used to elucidate the structure and function of dynamic macromolecular machines.

in near-native state. Overcoming the limitations of traditional techniques, such as X-ray crystallography and NMR spectroscopy, in addition to recent advances in direct electron detectors and improvements in image processing methods, have made it possible to routinely obtain resolutions better than 3.5 Å even for low purity samples. 1,2 In fact, in the last 8 years there has been an exponential increase in the number of deposited structures in the Electron Microscopy Data Bank (EMDB). 3 From 4 maps having a resolution better than 4 Å in 2011, 7556 maps have reached this level in 2021 ($40% of the database; Figure 1a).
The most widely-used cryo-EM method is single-particle cryo-EM; where thousands of noisy projection images containing up to 100 s of randomly oriented "particles" (imaged molecules) suspended in vitreous ice are collected in a transmission electron microscope at cryogenic temperatures. These imaged particles are computationally extracted from images, and their relative orientations are determined, using sophisticated software packages, before the particle images are back projected to generate a 3D reconstruction (cryo-EM map) of the original imaged molecule. Another important method is cryo electron tomography (cryo-ET); here a series of projection images are collected while tilting the sample stage. Again, the relative orientations of each image are determined and the images are back projected to generate a 3D reconstruction: a tomogram of the imaged sample. Cryo-ET can elucidate the structure of complex samples, entire cell components, tissues, or full viral particles in situ. [4][5][6] This means that in combination with subtomogram averaging, cryo-ET can be used to generate reconstructions of biological structures in their native, cellular environment. 7 However, the average resolution of reconstruction by cryo-ET from cells is currently 20 Å, although resolution as good as $3.5 Å has recently been achieved for the 70S ribosome bound to an antibiotic inside intact bacterial cells. 8  In almost all cases, after solving a cryo-EM map of a macromolecule, researchers will use it to build an atomistic model, to assist the interpretation of its structure and function. Depending on the resolution of the maps, different methods can be adopted to derive these models (Figure 1b). When the resolution is reaching atomic detail, methods derived from X-ray crystallography can be applied to refine the positions of the atoms within a structure. However, at lower resolutions, where such information is not present, different density segmentation and fitting algorithms have to be used to rigidly fit large portions of structure (e.g., secondary structure or whole chains). In those cases, a combination of multiple methods is often needed to achieve an interpretable model that agrees with the density map.
In cryo-EM/ET experiments, one obtains a snapshot of the conformational heterogeneity associated with a molecule's structural fluctuations and, therefore, the dynamic information of a molecule is intrinsically manifested in the data. However, extracting information from this ensemble of images can be complex, in large part due to the very low signal to noise ratio of single cryo-EM images. 9 Information is typically extracted from this ensemble in one of two ways: first, classification methods can be integrated into the reconstruction process, allowing the 3D reconstruction of multiple conformations from the same set of single-particle images. 10,11 Second, computational methods are designed to extract the dynamic information from the cryo-EM/ÀET density map by flexibly refining an atomic model in the map while fitting it.
Whether it is the determination of maps at multiple conformations or the building of atomistic models, to gain insights into the dynamics of the assembly structure from cryo-EM data, simulation tools are required. Many different computational methods and tools have been developed in the last two decades, and are often bundled together in software suites to perform structure classification, fitting and validation steps at the same time (e.g., CCP-EM, 12 PHENIX, 13 Rosetta, 14 and SCIPION). 15 Below, we describe computational methods that are designed to extract information directly from the map, then continue with methods that combine information from the map and the corresponding atomic model.

| RESOLUTION HETEROGENEITY AND DYNAMICS IN CRYO-EM/-ET
In addition to the flexibility of the imaged macromolecules, the quality and information content of cryo-EM maps can be degraded by artifacts arising due to microscope defects, particle alignment errors, preferred particle orientations and the high noise in collected images. 16 Therefore, quantifying the resolution of cryo-EM reconstructions (i.e., which features in a map arise due to signal generated by the imaged molecule, rather than noise) is necessary. The Fourier Shell Correlation (FSC) is the most commonly used method to do this. The FSC evaluates the similarity between two independently refined halves of a data set (half-maps) in Fourier space, by calculating the average correlation of Fourier components in each resolution (radial) shell. 17 A global resolution estimate can be derived from such FSC curves by finding the frequency shell that drops below a defined threshold, 18 typically 0.143 is used, 19 although alternative cutoff values or regimes have been proposed. [20][21][22] Once the resolution of a map is determined, one can boost the signal amplitude at higher resolutions, which are attenuated by microscope optics and image processing errors. 23 This increases the visibility of the more detailed features in the map, which, depending on the resolution, can include secondary structure elements and amino acid side chains. 24 Regardless of the criteria for the global resolution, there are a number of issues that cannot be fully represented by such measures, including conformational heterogeneity within the sample, as well as preferred orientations and other sources of unequal distribution of angles between particles that represent different views. In fact, the EMDB contains many maps within which the resolution distribution can be broad. Estimates of this heterogeneous "local" resolution can be used to aid the validation of the atomic models associated with the map, identify regions of uncertainty, identify dynamic versus stable regions and give insights into biological processes. A number of methods to determine local resolution estimation have been developed, 25 the first of which is blocres, 26 implemented in the Bsoft software. 27 The method uses a local form of the FSC by moving a window of a predetermined size along the two half-maps. ResMap, 27 which does not rely on FSC calculation to estimate the local resolution, uses local sinusoidal features to assign a resolution to every voxel. MonoRes estimates the local resolution by establishing a comparison (hypothesis test) at different frequencies between the energy of local signals and the energy of noise. 28,29 DeepRes is based on deep learning from filtered atomic models at different frequencies. 30 The map sharpening should reflect the local resolution, that is, which details should be boosted, and which suppressed. 31 There are multiple methods to achieve this, including using model based sharpening, 32 sharpening based on the local resolution, 33 and using deep learning models. 34 Heterogeneous resolution in cryo-EM maps presents a problem in interpretation and further processing, and can even lead to subunits in a complex being unresolved (Figure 2a). One cause of this can be the underlying conformational heterogeneity of the imaged molecule. Sophisticated computational methods have been developed to disentangle this mixed population of imaged molecules (particles) from a cryo-EM experiment into a series of reconstructions that each represent an individual state. 37,38 This can be achieved by grouping sets of particles into discrete individual classes, where, ideally, each class should represent a unique conformation of the imaged molecule ( Figure 2b). As both 2D and 3D classification methods have developed alongside improvements in cryo-EM sample preparation 39,40 and imaging methods, 41,42 subpopulations with increasingly fine structural deviations can be separated. 43 Classification into discrete groups is often done using maximum likelihood, as implemented in RELION and other software packages, but other F I G U R E 2 Local resolution heterogeneity and identifying structural sub-populations from a mixed population using cryo-EM. (a) Cryo-EM reconstruction of the E. coli ribosome in complex with SecA colored based on the local resolution. The global resolution of the map is 3.1 Å, but the SecA component is at considerably lower resolution. Figure adapted from Reference 35. (b) An example classification, and subsequent refinement, for ATP synthase using the cryoSPARC software package. 36 Here, one can see that the ab initio classification isolates three sub-populations of the ATP-synthase (left hand panel), each with a unique structure that is clearly visible after refinement (right hand panel). This figure is modified from Reference 36. statistical methods, such as multivariate statistical analysis (MSA) have been used. 36,[44][45][46][47][48][49][50][51][52] The output 3D classes are normally subjected to further refinement, to boost their resolution. This has been a widely applied method for identifying structural states, with some notable applications being the rotary movement of the ATP synthase, 53 large conformational changes associated with RNA splicing, 54 and the movements of the ribosome associated with translation. 38,55 With these discrete states in hand, one can use additional methods to infer the underlying dynamics that links them. For example, principal component analysis of reconstructed volumes has been used, in conjunction with massive (>300 classes) classifications, to determine the energy landscapes for small molecule binding to the proteasome 56 and for the conformational dynamics of the human spliceosome. 57 A similar approach was used to study glutamate dehydrogenase (GDH): here, the principle components of 28 representative structures derived from molecular dynamics (MD) simulations were analyzed to identify the main motions of the NAD domain. These principal components were used to generate a freeenergy landscape for the NAD domain motion, and the positions of each reconstructed class was mapped onto this landscape by determining the weighted similarity between each model-map combination. 58 A specialized software package, Fluctuating Finite Element Analysis (FFEA), has been developed to analyze large structural movements in potentially very large molecules, 59,60 using low resolution (>5 Å) cryo-EM/-ET maps. In FFEA, each component of a large complex is treated as a single "viscoelastic" unit, which is deformed by the thermodynamics of the solvent.
However, grouping particles into discrete classes requires, to some extent, that the imaged molecule inhabits discrete states itself. This is frequently not the case as many molecules exhibit a more continuous trajectory of movement, which often corresponds to lower local resolution density in a reconstruction. However, specialized methods can extract and correct such molecular motions. 43,[61][62][63][64][65][66] An early example of such an approach was used to visualize the Brownian motions of the yeast cytosolic ribosome fluctuating in solution ( Figure 3). 38 Here, structural subclasses of the ribosome were identified by mapping groups of particles with similar alignment angles (relative to a global reference) to an Ndimensional manifold. This manifold was analyzed in a multistep procedure, to generate a novel mapping of particles, which can be used to identify sub-classes of particles, and to derive the energy landscape sampled by the imaged system. Analyzing this mapping for the ribosome revealed it undergoes motions similar to those associated with elongation, despite the absence of any mRNA and most tRNAs. 38 The popular software RELION uses an alternate "Multi-Body Refinement" implementation, in which a multi-component complex can be manually segmented into multiple "bodies" (masked regions), which are then independently refined using a maximum likelihood approach. 63 The most significant molecular movements can be deduced and visualized after this refinement, by subjecting the relative alignment angles of each body to principal component analysis. CryoSPARC uses a 3D variability analysis algorithm to identify classes from a continuously variable population of particles 62 : here, probabilistic principal component analysis (where principle components are estimated using an expectation maximization algorithm) 67,68 is used to identify the main structural fluctuations within the imaged molecules. This approach was used to identify and visualize large structural movements BOX 1 Machine learning methods commonly used in cryo-EM Autoencoders (AEs): An autoencoder is composed of two parts: the first is an encoder, which transforms input data into a form that is typically compressed and lower dimensional. This is sometimes referred to as the code, or "latent" form. Typically, this latent form is expected to only contain information related to the most prominent identifying aspects of the input data. The second part is the decoder, which converts this latent representation into an output that is as similar to the input data as possible. A more powerful variation of Autoencoders is Variational Autoencoders.
CNNs: Convolutional neural networks (CNNs) are one of the most widely used artificial neural network architectures. These networks effectively scan the input data to extract features. CNNs excel in pattern recognition on image data. 70 These exceptional performances of CNNs are partially due to their ability to model complicated images into much smaller feature maps while keeping the complexity of the model relatively low. Furthermore, CNN kernels read adjacent pixels of an image at a given time, enabling them to learn the locality of images. This advantage is further reinforced by the hierarchical nature of CNN where each layer extracts progressively more abstract features.
U-nets: For pixel level annotation (semantic segmentation), U-nets are the most widely used architecture. CNNs are used as building blocks to build U-nets.
in the human motor protein Dynein, when it was bound to a regulator protein Lis1. These movements took place at the flexible dynein linker domain, and were speculated to include the beginning of the dynein power stroke. 69 Other methods utilize machine learning algorithms to determine the conformational landscape for heterogeneous sets of particles. CryoDRGN uses a variational autoencoder, image encoder-decoder architecture 71 (Box 1) to initially transform particle images and their alignment angles into a low-dimensional latent space, which can then be used to identify clusters of particles that belong to distinct conformations. 43 A similar encoder-decoder architecture is used in e2gmm: a decoder is used to transform the images and their alignment angles into latent space, and an encoder to transform this latent space information into a Gaussian mixture model representation, which can be used to identify structural sub-populations. 64 Another example of a machine learning based approach has been developed that can accurately infer protein dynamics from a cryo-EM map. 72 Here, a 3D convolutional neural network (CNN)(Box 1) was trained with cryo-EM maps (obtained from the EMDB 73 ) and all-atom MD simulations, derived from the fitted models (obtained from the PDB 74 ). The fully trained neural network was able to derive accurate root mean squared fluctuations for all atoms in a protein model, based only on the EM map. These values showed good agreement with all-atom MD simulations and agreed with hydrogen-deuterium exchange experimental data. 72

| DETECTION OF ELEMENTS FROM DENSITY
Detecting important biomolecular elements from cryo-EM maps such as secondary structure elements (SSEs) still remains an open problem although several software tools for the element detection problem have been introduced. This was a particularly relevant problem prior to the "resolution revolution" in cryo-EM, where maps were generally F I G U R E 3 Energy landscape for the ribosome derived from cryo-EM. (a) Three views of a ribosome reconstruction with the main movements labeled with red arrows and letter codes, which are further clarified in the key. (b) The energy landscape that the ribosome traverses. The triangular shape represents the main structures occupied during its fluctuations in solvent, which is divided into 50 states. Seven states are highlighted (numbers) and the arrows and labels show the structural changes undergone as the ribosome transitions between these states. Figure modified from Reference 38. restricted to medium resolution (intermediate, $4-10 Å). However, there are still many maps being deposited in the medium resolution range, resulting from heterogeneous samples as well as sub-tomogram averaging.
One of the earliest software systems to detect biomolecular elements is Helixhunter. 75 Helixhunter detects alphahelix position, orientation and length in density maps of resolutions 6-10 Å. This is using a five-dimensional crosscorrelation search of the 3D cryo-EM map, followed by feature extraction. Sheetminer 76 is a tool that detects β-sheets in 6-10 Å resolution cryo-EM maps using multi-step ad hoc morphological analysis (Figure 4a). Another early work is SSEhunter. 77 SSEhunter detects both ɑ-helices and β-sheets in density maps of intermediate-resolution with the aid of template-based search, density skeletonization, and local geometry calculations. Detecting structural homologues of proteins in density maps is another important task. EMatch 80 is detecting helices in 6-10 Å resolution cryo-EM maps. After detecting helices, EMatch utilizes various computational methods to recognize structural homologues of protein domains. Detection of fully folded domains can be done by fitting domains representatives, as done in SPIEM. 81 EM-fold 70 is a tool to build topological models for large proteins using computational structure prediction methods and experimental cryo-EM density maps. First, α-helical regions are predicted in the density map and in the protein sequence using a consensus of jufo, 82,83 psipred, 84 and sam. 85,86 Then, an in-house Monte Carlo assembly algorithm is used to place the predicted α helices into α-helical density rods of the density map. Finally, Rosetta 87 is used to add loop regions and side-chain coordinates. EM-fold was tested on cryo-EM density maps with $6 Å resolution. Pathwalking 88 is a semi-automated template-free protocol to enumerate Cα models from 3.5 to 6 Å resolution density maps. Pathwalking is capable of rapidly generating initial models for individual proteins using an approach derived from the Traveling Salesman Problem.
VolTrac is a tool to detect helices in 4-10 Å resolution cryo-EM density maps by utilizing a combination of genetic algorithms with stochastic search and a bidirectional expansion with a tabu search strategy to detect helices. 89 SSEtracer 90 is another SSE finding tool that detects helices and β-sheets in $5-10 Å resolution maps using local feature analysis based on the characteristics of the local shape of helices and β-sheets. There are several other tools available to detect SSE and related elements from cryo-EM density maps. [91][92][93] Beta barrel (β-barrel) is a special β-sheet closed toroidal structure. BarrelMiner 94 is a tool developed to automatically detect β-barrels from cryo-EM density maps with resolution between 5.5 and 10 Å. BarrelMiner uses Random Sample Consensus (RANSAC) 95 and a variation of this technique 96 to search for the location and orientation of a β-barrel in 3D density volumes. Various other tools that combine existing tools to harness higher performance to detect SSE have also been developed. 97,98 Given that machine learning is a proven area of study for various pattern recognition problems, such as object detection in images, 99,100 it has been successfully applied to identifying SSEs from cryo-EM maps, and recently applied also to higher resolution maps. One of the earliest tools is SSELearner 78 (Figure 4b). SSELearner detects helices and β-sheets in density maps with the aid of a secondary structure annotator, SSID, that predicts helices and β-strands from the backbone Cα trace. SSELearner uses Support Vector Machine (SVM) to detect SSEs. The predictor was trained on volumetric maps extracted from EMDB and tested on 10 simulated density maps at 8 Å resolution and 13 experimentally derived cryo-EM density maps with resolution between 3.8 and 9 Å. A widely used deep learning algorithm for SSE detection on cryo-EM maps is Convolutional Neural Networks (CNN). A CNN based predictor 101 assigns a label for each voxel in the tomogram and predicts a probability of label for each voxel with respect to SSEs. Another CNN based predictor 102 that predicts backbone structure and Cα atoms along with SSEs. The model was trained on simulated maps produced from 7024 PDB files with various resolutions and tested on 50 experimental maps with the resolution ranging between 2.6 and 4.4 Å.
Haruspex 79 detects Oligonucleotides along with SSEs in cryo-EM density maps ( Figure 4c). The prediction model is trained on 293 experimentally derived EMDB maps with an average resolution of 4 Å or better and tested on a test set of 122 EMDB maps with a resolution of 4.4 Å or better. Emap2sec+ 103 is a new residual network (ResNet) 104 based tool to detect protein secondary structure and nucleic acids in cryo-EM maps. Emap2sec+ classifies each voxel in the density map into four classes, alpha-helix, β-strand, coil, and DNA/RNA. Emap2sec+ is designed for intermediate resolution maps while Haruspex is designed for 4 Å or higher resolution maps.
It is also important to note that the performance of machine learning methods can vary between experimental maps and simulated maps. For instance, SSELearner detects helices on 10 simulated density maps with the averaged specificity and sensitivity of 94.9% and 95.8%, respectively. This performance drops to 91.8% and 74.5%, respectively for 13 experimentally derived cryo-EM density maps obtained from EMDB. For the β-sheet detection problem, SSELearner detects β-sheets on above mentioned 10 simulated density maps with specificity and sensitivity of 86.7% and 96.4%, respectively while this performance drops to 85.2% and 86.5%, respectively, for above mentioned 13 EMDB cryo-EM density maps.
DeepTracer 105 takes the concept of element detection a step further, by determining the position of all atoms in the model, using a machine learning based approach. It is powered by four modified U-Net architectures (Box 1). 106 U-Net is a popular CNN based deep learning architecture for semantic segmentation where each pixel is classified. Each of these U-Nets reads the preprocessed cryo-EM density map and classifies each voxel into a class. The first U-Net (Atoms U-Net) predicts each voxel in the preprocessed cryo-EM density map into the following four classes, alpha-carbon atom, nitrogen atom, carbon atom, and no atom. The second U-Net (Backbone U-Net) classifies each voxel into three classes, backbone, side chain, and not a part of the protein. The third U-Net (Secondary Structure U-Net) predicts the SSE of each voxel. This network classifies each voxel into four classes, loops, sheets, helices, and no structure. Finally, the fourth U-Net (Amino Acid Type U-Net) determines the amino acid type of each voxel. This network classifies each voxel into 21 classes where the first 20 classes represent the standard amino acid type and the last class represents nonamino acid. The final structure modeling is determined by the prediction of all four U-Nets. Deeptracer was used to build an initial model of the yeast Ubr1 ubiquitinase, which was then further refined with phenix (see Section 5). 107

| RIGID-BODY ASSEMBLY FITTING
Solving the structure of very large complexes at high resolution (<4 Å) can still be challenging, often due to flexibility of the imaged molecule, or difficulties with sample preparation. 108 As a result, reconstructions are often restricted to low resolution (>4 Å). In such cases, one can still construct a model of the imaged sample using assembly fitting, whereby the structures of each component are placed into the cryo-EM density as rigid bodies. Given the low resolution, and thus low information, in the map, the overall architecture of the associated complexes can be probed by assembly fitting complemented by other experimental techniques, such as cross-linking Mass Spectrometry (integrative modeling). 109 Modeling into these lower resolution maps remains highly relevant: more than half of the deposited maps in EMDB have a resolution worse than 4 Å, including more than a third (36%) of maps deposited in 2021. At the lower end of resolution, the position of entire protein chains may be ambiguous. For large multimeric assemblies, this presents a significant challenge to assembly fitting procedures.
When attempting to solve the structure of a large protein complex, if the structure of components in the assembly are known, it is possible to place them within the map. Examples of such assemblies can be found on the PDB-Dev repository, 110 which includes large assemblies such as polymerases, ribosomes, nuclear pore complexes, or proteasomes among others. Notably, the assembly of the proteasome and nuclear pore complex were both solved using integrative modeling algorithms and data from cryo-EM, crystallography, and mass spectrometry. 10 When trying to fit multiple components rigidly within a map, each component can be described as having 6 degrees of freedom, 3 to describe its spatial position, and 3 for its orientation. The combination of those degrees of freedom results in a very large number of possible conformations, making it difficult to ascertain the best possible arrangement.
This problem has been tackled with numerous approaches. The search can be accelerated using FFT-based searches, spherical harmonics or Zernike descriptors of the volumes; and the use of maximization methods, gaussian mixtures, or density vectors can iteratively improve the fit until it reaches a local maximum (Table 1). A summary of proposed methods is shown in Table 1, and a description of how assembly fitting can be performed using these tools is provided below.
The original methods that were developed to solve this problem, such as Situs, DOCKEM, or Mod-EM tackled this problem by maximizing the correlation between the map and a blurred model, 111,118,126 but multiple components were typically fitted sequentially.
Alternative representations of the density offer powerful ways to describe and improve the fit of a model to it. gmfit uses a simplified gaussian mixture representation, and optimizes their position and rotation to maximize an energy term maximizing the fit between the sum of the Gaussians and the map while minimizing subunits overlap. 122 Later methods such as Multifit 124 used simultaneous rather than sequential fitting, as initial errors in the fitting procedure Belief propagation Assembline 125 Combined restraint scores based on IMP will impede finding an overall best fit. In the EMLZerD method, Zernike descriptors are used to represent the surface of a complex, and to find best matching components. In γ-Tempy, a genetic algorithm is used to iteratively improve the match between candidate assemblies and the experimental map, as estimated by a global score, usually cross-correlation coefficient (CCC) or mutual information (MI), 117,127 combined with a penalty clash score. The positions and rotations of components within each assembly are initialised using a vector quantisation procedure, 128 producing the initial set of assembly fits. The next set of assemblies is generated by mutating (i.e., randomly changing) the position and rotation values for the components and by combining values from two "parent" assemblies. The set of "parents" and "children" are scored and ranked, and the best candidates are kept for the next set. This method was shown to be capable of producing reasonable fits for assemblies up to eight components. 117 PRISM-EM attempts to generate candidate complexes by using protein-protein docking, minimizing an energy score, and then fitting those complexes in the density map. 121 The two main parts of the scoring in PRISM-EM are a common theme in many assembly fitting methods, with a part related to the density quality of fit, and the other dedicated to scoring the interface. However, the density map is used for scoring and not for optimizing the configuration of the components directly.
Generally, the quality of the fit is limited by the quality of the scoring, and better scoring methods are enabling improvements in fitting methods, as can be seen in VESPER. The latter is a recent method to align cryo-EM maps by converting the density into multiple local vectors that are directed toward high local density. 123 These vector maps are then aligned using an exhaustive search, where the degree of overlap between two maps is determined by the dotproduct between the vector representations of each map. This approach was found to be very accurate for aligning two experimental cryo-EM maps, and for locally aligning one map containing, for example, a single subunit of a complex into another map of the full complex. Using this local map alignment, VESPER can be used for assembly fitting once the model for each component in the assembly has been converted into a simulated map. Using this approach, VESPER was shown to produce more accurate assemblies (i.e., with a smaller RMSD to a known ground truth solution) compared to other popular methods for the TFIIH, γ-secretase, yeast RNA polymerase I, and other complexes. 123 In addition to fitting structures using cryo-EM data, methods have been developed that take advantage of multiple types of data, coming from native mass spectrometry, crosslinking mass spectrometry or small-angle x-ray scattering (SAXS), and other experiments that may provide spatial restraints for the internal organization of a biomolecular complex. Methods that combine multiple experimental information include Haddock, 119,129 IMP, 113 ROSETTA, 130 and power. 131 This type of integrative approach can be used to resolve assemblies that would remain too ambiguous or difficult to solve by direct fitting in the density, the extra information from other methods providing the necessary information to either find the correct solution, or separate it from others fitting similarly well in the density. Assembline combines an IMP assembly pipeline with an efficient sampling procedure to fit very large models into lower resolution (>4 Å) maps, potentially integrating restraints from other experiments (e.g., cross linking mass spectrometry). 125 Efficient computations are achieved by generating "fit-libraries," in which a global search is performed on each subunit independently, using the fit-to-map tool in UCSF Chimera. 116 p-values are derived from these fit libraries, which determines the likelihood that a better fitting position can be identified, relative to each placed subunit. These p-values are integrated into the flexible Assembline scoring function, which can include symmetry restraints, restraints from crosslinking mass spectrometry and pulldown experiments and structural restraints, for example, to favor interactions between trans-membrane domains and lipid bilayers. Structures are refined using Monte Carlo simulated annealing to optimize the placement and rotation of subunits for candidate models, which is greatly accelerated using the precalculated scores from the fit-libraries. This approach was used to generate models for the pathogenic type VII secretion system from Mycobacterium smegmatis, 132 as well as multiple models of the human 10 MDa + Nuclear Pore Complex (NPC). 133,134 Several recent cryo-EM studies of the NPC also highlighted the utility of recent deep-learning structure prediction methods such as AlphaFold2 135 and RoseTTAfold 136 for model building. 133,134,137,138 Two studies used cryo-ET with subtomogram averaging to generate low (20-30 Å) resolution maps of NPCs from yeast. 133,137 In both studies homology models were used to generate models for the components of the NPC, one with Assembline, 133 and the other with IMP. 137 In both cases, these approaches generated large models ($30 MDa), although both maps did contain unmodeled density ( Figure 5). However, a significantly larger scale model was built by using deep-learning methods to predict the structures of all modeled NPC protein components, in addition to acquiring a higher resolution cryo-EM map. This facilitated building a much larger, more complete, 70 MDa, model composed of $1000 subunits, into a 12 Å resolution map of the human NPC ( Figure 5). 134 A similar increase in model "completeness" was also achieved for the cytoplasmic ring of the Xenopus laevis NPC by integrating AlphaFold2 predictions. 138

| FLEXIBLE FITTING AND REFINEMENT
With atomic resolution being reachable, [40][41][42] the range of map quality that is attainable in cryo-EM is very wide, necessitating different computational techniques for interpretation in terms of biomolecular structures. The range of features that can be resolved given a resolution, and in turn the techniques that are applicable to resolve these features, are presented in Figure 1b.
At the highest resolutions (better than 3-3.5 Å), tracing methods can unambiguously place atoms at local density peaks, using known geometries for common moieties, such as amino acids, nucleobases, water, ions, and common ligands. Many of those methods were first used and implemented for the processing of X-ray crystallographic data (e.g., REFMAC 139 and PHENIX 13 ), and were adapted to be used in cryo-EM. At intermediate resolutions, tracing methods such as Pathwalking 140 or MAINMAST 141 place pseudo-atoms within the density, which are then connected together to form the backbone chain. This reconstruction can be improved with manual inspection, as done in Gorgon 142 or Coot 143 (see Section 8). After the initial reconstruction of the backbone chain, specialized tools can be used to identify the nature of amino acid side-chains such as Buccaneer, 144 ARP/wARP, 145 and Coot. 146,147 These procedures can be used or repeated for specific parts of the model, after a well-fitted model could be generated, to further improve the quality of fit. 130,148 But at present, a large fraction of deposited EMDB structures fall into lower resolution ranges, with $57% of deposited entries having a resolution worse than 4 Å, making this type of modeling directly from the density difficult ( Figure 1b). As we move toward slightly lower resolutions, it is required to use an initial model, and iteratively refine its fit to the data. The initial model is often based on either a known atomistic structure from other experiments deposited in the PDB (e.g., from X-ray crystallography, NMR or cryo-EM) or homologous structures. With improvements in predictive models trained on the PDB, and in particular the rise of efficient machine learning methods that take advantage of this same data to create highly accurate models, using a prediction pipeline is becoming more common (see Section 8). We summarize many of the proposed refinement methods in the table below ( Table 2). As the resolution of the map improves, methods that refine smaller structural features can be used. 10 F I G U R E 5 Modeling of the NPC facilitated by accurate protein structure prediction. Three cryo-EM reconstructions of the NPC are shown with the fitted models; from S. pombe (right, PDB-DEV ID: PDBDEV_00000094, EMD-11373), 133 S. cerevisiae (center, PDB ID: 7n9f, EMD-24258), 137 and human (right, PDB ID: 7r5k, EMD-14322). 134 An important technique for the simulation of biomolecules is MD, which allows for the sampling of conformational ensembles, as well as obtaining kinetic information. By integrating the information contained within a cryo-EM density map as an effective potential, it is possible to obtain conformations where the atoms are more likely to be found in regions of high density. MDFF 153 is the archetype of this class of method.
However, using the cryo-EM density as a potential has its own limitations: at lower resolution, the peaks in intensity from multiple atoms will effectively merge into one large peak (e.g., two close-by atoms may produce a single density peak unless the resolution is high enough). Attempting to drive those atoms within this density may result in unnatural structures. Given an estimate of the resolution, the forces can be computed to maximize the correlation between a simulated map at that resolution, and the experimental map. This approach was developed in Flex-EM/MODELER, 152 which optimizes atomic positions using a scoring function that combines the CCC between the structure and the map with stereochemical and nonbonded interaction terms. Optimization by Monte Carlo search, a conjugate-gradients minimization, and simulated annealing MD is applied to a series of subdivisions of the structure. As the optimization progresses, the structure is divided into progressively smaller rigid bodies (e.g., using the RIBFIND clustering approach). 163 The method has been widely used for model fitting into cryo-EM maps at medium resolutions, for example to refine GroEL in multiple conformations, at 7-9 Å resolution 37,164,165 or kinesin 8 bound to microtubules inhibited by the small molecule BTB-1 at $5 Å resolution. 166 More recently, it has been adapted to the high-to-medium resolution range (3-5 Å), 167 often prior to refinement in Coot, 143 and Phenix, 155 as done for the human separase in complex with securin and CDK1-cyclin B1-CKS1. 168 Flex-EM can now be run via the CCP-EM suite. 169 Newer developments in MD-based methods such as CDMD 157 use an effective potential that maximizes this correlation rather than bias directly toward the density itself.
Normal mode calculation is another method that can be used to sample new structures, following low frequency modes, as in NMFF. 149 Normal mode analysis (NMA) is done by computing the second-order derivatives of an energy model (Hessian matrix) given a structure conformation. A commonly used model is an elastic model, with harmonic restraints put on close by atoms; this model can be used to easily compute normal modes, 170 or directly to move atoms. 158 Each mode is one eigenvector, with an associated eigenvalue. The modes with lowest eigenvalue (often referred to as "slow" modes) have been shown to correspond to global motions in protein structures. Protein structures can be refined by following those modes in the direction that maximize a correlation with cryo-EM density data, with limited local deformation of the structure, or used to bias or guide an MD procedure, as in MDeNM-EMFit 160 and NMMD. 171 On the other hand, approaches that attempt to quantify the uncertainty in the data and iteratively improve the quality of the fit with respect to the model provide an alternative approach to obtaining high-quality structures. Maximumlikelihood method is used in REFMAC to improve the fit of a model to the data. 156 This software is also available via CCP-EM. Gaussian mixture models can be used to provide a practical representation of both experimental data and model. 154,172 Some statistical methods use existing structural data present in databases such as the PDB 74 or EMDB 3,73 to provide high-likelihood structures, given a sequence and a map. Recent improvements in deep-learning methods have now found their way into refinement methods, by integrating it into custom pipelines. 135,136  The quality of predictions in the recent CASP14 competition showed striking patterns with respect to the map: regions of lower prediction quality were consistent across predictions from different groups, for a given map and these often correlated with ambiguous density, despite the fact that the models were generated without the knowledge of the cryo-EM map. Some protein domains are likely to have more structural heterogeneity and/or exhibit faster dynamics, which in turn causes the map to exhibit less well defined density for those regions. As larger complexes are being resolved at ever higher resolutions, it has become apparent that reconstruction (see Section 2) and fitting methods need to capture the inherent flexibility of those complexes. This can be done by using consensus approaches, [173][174][175] In turn, it is not sufficient to improve the fit of a single structure to the data, since the density data itself reflects the existence of multiple structures, therefore one would expect that an ensemble of structures will provide a better representation of the data. [176][177][178] Indeed, this has been demonstrated with TEMPy-REFF, a statistical method for real-space refinement based on mixture modeling. The ensemble of models generated via TEMPy-REFF can be expressed as an ensemble map that has a higher quality of fit to the data than any single model-derived map 162 (see Section 7).
Even for single maps, considering an ensemble of structures may prove advantageous: early work on homology and ab initio modeling showed that considering ensemble of structures when fitting to an experimental map is informative both in terms of ranking candidate fits, and to identify best fitting structures. 118,179 This problem has also long been appreciated and addressed in the processing of cryo-EM data, where 2D and 3D classification methods are used to reconstruct multiple densities, corresponding to multiple states (see Section 2). Refinement methods such as MDFF and Flex-EM in medium resolution maps or PHENIX and REFMAC in near-atomic resolution, can be used to generate F I G U R E 6 Cryo-EM model building across varying resolutions. (a) Medium resolution structures and corresponding cryo-EM maps (gray) of the proteasome in two conformations, conformation 1 (purple), and conformation 2 (yellow), bound to a slowly hydrolysing ATP analog. 180 The difference between the two refined models is highlighted (right, corresponding purple and yellow structures overlayed). (b) Example of a very high resolution map of apoferritin. 41 At high threshold (top) only the cryo-EM density for heavy-atoms is visible, whereas at a lower threshold (bottom) hydrogen atoms start to be visible. The difference in cryo-EM density reflects expected electronegativities between atoms. (c) Another example of a very high resolution cryo-EM map of apoferritin where water molecules forming hydrogen-bonding networks are visible in the structure, 42 with observed density in line with the expected partial charge differences between atoms.
well-fitted models for each of these maps, and allow the interpretation of their differences in terms of motions within structural elements. 37 An example of model refinement at multiple conformations (using MDFF) can be seen in Figure 6, where the ATPase subunit of the proteasome has been solved at medium resolution ($7.7 Å).
With improvements in instrumentation, notably optical hardware, and image processing methods, 41,42,181 cryo-EM reconstructions can now reach true atomic resolution. In some maps, the densities of atoms are separate and clearly visible, including hydrogen atoms (Figure 6b,c). 41,42 Fitting models into these very high resolution maps is often performed with methods from X-ray crystallography. For example, a model was refined into a 1.22 Å structure of aptoferritin 41 (Figure 6b) using a combined approach with Coot, 143 phenix.real_space_refine, 155 and REFMAC. 139 Similarly, model refinement into a 1.25 Å map used Coot and REFMAC 42 (Figure 6b). In the example of apoferritin from Reference 41 (Figure 6b), the densities are not centered on nuclei position, and show deformations consistent with charge effects deforming electron clouds as would be present in molecular orbitals. This also paves the way to direct observation of minute changes in different conditions, for example, pH changes, 182 and the observation of ion binding sites and structural water molecules, which can also be used to improve subsequent computational studies of those systems, for example, provide better starting points for MD simulations.

| FITTING SMALL MOLECULES INTO CRYO-EM MAPS
The increase in resolution of deposited cryo-EM structures has increased its relevance in drug discovery. The ability to resolve structures that classically have not been amenable to X-ray crystallography, has brought a wealth of new structures bound to small molecules, giving researchers a better understanding of the mechanism of action of such systems. Accurate small molecule refinement and fitting has been demonstrated in cryo-EM maps with resolutions >3 Å. 183,184 This is particularly notable as it has been thought that higher resolutions would be needed for drug discovery. Although these resolutions are now achievable, the majority ($46%) of maps deposited in 2021 were between 3 and 4 Å resolution (Figure 1).
Methods for fitting small molecules into cryo-EM density maps have two general parts; an initial global fit of the molecule into the cryoEM map, followed by a further refinement in the context of the selected binding site. The chance of success of a fitting protocol is dependent on both the resolution of the map and the ability to generate a good initial approximation of the orientation and conformation of the small molecule within the binding site.
Issues concerning the resolution of the map can be further compounded by the fact that the resolution varies across the cryo-EM map. The local resolution is often worse at small molecule binding sites compared to the surrounding regions. This can be caused by a number of factors including: small movements of the molecule in the binding site, partial occupancy, and capture of heterogeneous binding modes of the molecule.
Should the resolution of the binding site be sufficiently high (generally, <$3 Å) (Figure 7), the molecule can be directly refined into the cryo-EM density using structure refinement methods described above (see Section 5). One common strategy taken is to obtain an initial fit in the software Coot 147 manually place the ligand into the density, followed by using Coot's builtin tools such as the "jiggle-fit" protocol. 185 This protocol locally optimizes the placement of small molecules in density. Given a starting conformation, the translation and rotation values are subjected to small random perturbations. The generated conformations are rigidly fit to the map and the conformation with the best fit is selected and refined further in the context of the binding site, for example using the "phenix.real_space_refine" protocol 155 from the PHENIX software suite 13 (see Section 5).
At high resolutions, generally, all atoms are optimized independently, while ensuring that geometric constraints such as bond lengths, bond angles and dihedral angles, remain within a reasonable range. 155 Geometric restraints for protein atoms are well known and generally have preferred values, this is not true for "non-standard" atoms, such as those in small molecules. Therefore, these geometry constraint values must be calculated before refinement and supplied along with the input. One method of achieving this is with the software eLBOW 186 supplied in the PHENIX package. This protocol has successfully been used in model-fitting pipelines for numerous protein ligand complexes obtained by cryo-EM at high resolutions including the GABAA Receptors in complex with various anesthetics and benzodiazepines (2.55-3.5 Å), 187 the TRPC6 receptor bound to the antagonist AM-1475 (3.08 Å), 188 the GLP-1 receptor complexed with agonist PF-06882961 (2.82-3.24 Å), 189 and the Drosophila solpoke potassium channel bound to various small molecule antagonists (2.59-2.73 Å). 190 More recently, a method for automatic determination of ligand restraints was developed with the addition of the OPLS3e forcefield 191 into the "phenix.real_space_refine" protocol. 192 The OPLS3e forcefield calculates fixed partial charges for small molecule atoms, and generates torsion angle restraints by comparing chemical moieties with values of a built-in database. This updated protocol was benchmarked against the original with a real-space refinement on 15 cryo-EM maps of resolution between 1.9 and 4.3 Å and model quality quantified using MolProbity 193,194 and PHENIX-ramachandran Z-scores. 195 Little difference in model quality between the two protocols was observed, however, the inclusion of the OPLS3e forcefield led to atomic models with a significantly lower ligand strain energy. 192 Another program, GemSpot, 183 leverages molecular docking software to generate an initial fit of the ligand into the map, before applying real-space refinement with the "phenix.real_space_refine" OPLS3e forcefield protocol. 192 Molecular docking aims to identify the approximate positions of atomic coordinates of a small molecule within a binding site, with a scoring function that approximates the free energy of binding and a search function that explores a large portion of the ligand binding site. An exhaustive search of the binding site would be computationally expensive, therefore many molecular docking softwares adjust the dihedral angles, translation and rotational values of small molecules in a random way. Many algorithms have been utilized for this including the genetic algorithm, 196 ant colony optimization, 197 or stochastic hill climbing, 198 to name a few. To speed up convergence to the optimum solution, a local search algorithm is often employed to further refine solutions, examples include the nelder-mead 197 and BFGS algorithms. 198 The accuracy of molecular docking is nowhere near that achieved by MD protocols, however the computational cost is a fraction of that for classical MD. GemSpot 183 utilizes the GLIDE molecular docking software search algorithm 199 with a modified scoring algorithm that is the sum of the GLIDE scoring function and the CCC of the simulated molecule with the experimental density map. This has the advantage of being able to rapidly eliminate conformations of molecules that either have a low CCC to the experimental density map or have a worse docking score. In this way, the software automatically generates initial placements of ligands within the density that both have a good fit to the map and are chemically reasonable. A benchmark of the above mentioned 15 cryo-EM maps (to test "phenix.real_space_refine"/ OPLS3e forcefield protocol) showed that solutions had a comparable CCC with the map as the original deposited structures. 183 GemSpot was also tested at lower resolution, showing greater deviations from the correct solution at 5.5 Å.
More recently, an extension to the PathWalking methodology 200 (see Sections 3 and 5) for the ab initio generation of protein models saw the software identify and fit ligands into a protein model. Given a density map, protein model and map threshold cutoff at which ligand density is visible, the program first masks out density assigned to protein atoms. The remaining density is filtered, and density is assigned as either ligand, ion, or solvent based on the shape and distribution of pseudo-atoms fit to the density. Once density is assigned to a ligand, the ligand model is built and its fit assessed with standard correlation metrics. This method was shown to reliably fit ligands at resolutions better than 3.0 Å, with only partial success at resolutions close to 4.0 Å.
A protocol for fitting small molecules at lower resolutions (3-5 Å) based on the idea of using a molecular docking protocol for initial ligand placement followed by a real space refinement using neural-network potentials to score protein ligand interactions 184 has recently been developed. Once again, the GLIDE docking software 199 is used to achieve an initial fit of the protein/small molecule complex. This is then further refined using MDFF with neural-network potentials for the small molecule (i.e., classical force field parameters were used for protein atoms, while neuralnetwork potentials were used for deriving small molecule parameters).
Neural-network potentials are an alternative approximation to using quantum mechanical calculations of molecular interactions, which, while accurate, are computationally expensive. Neural-network potentials are trained on large ensembles of quantum mechanical data for multiple protein ligand complexes, as a way to overcome the large computational costs of quantum mechanical calculations while retaining the accuracy of such calculations. It was reported that refinement using neural-network potentials improved the radius of convergence (i.e., the distance of the initial fit from the ground truth solution) where a correct solution can be achieved when compared to quantum mechanics/molecular mechanics-MD flexible fitting, when synthetic maps up to 4 Å were used. 184 Furthermore, by giving more weight to interactions between ligand atoms and protein atoms that were shown to be important from biochemical data, correct complexes were recreated with maps of 5 Å resolution.
A further strategy for refining small molecules into cryo-EM maps that has shown success with intermediate resolution maps included the use of correlation derived MD. 157 This protocol was shown to be able to accurately refine an atomic model of an N-ethylmaleimide sensitive factor bound to ATP at 3.9 Å. The model quality of the final refinement was seen to be higher than the deposited model, with a better fit to the map. When the refined model was compared to two high resolution x-ray crystallography structures, the angles of the ATP molecules were seen to be more consistent with what was reported for the x-ray structures than the deposited models.
A strategy for refining small molecules into low resolution (>4.5 Å) cryo-EM maps was touched upon by the paper that reported the use of Neural-network potentials with MD flexible fitting. This report was seen to recreate a correct fit at 5 Å when the strategy used biochemical data, while a refinement without such data failed to replicate a correct fit. The reason for this was that at this resolution density from the protein was bleeding into ligand density in the map. 184 This highlights one of the main challenges of fitting small molecules at this resolution range, where there is simply not enough information for the accurate placement of ligands. Nevertheless, strategies for achieving this have been reported. Two studies report a strategy involving creating a difference map 201 to mask out protein density leaving only density corresponding to the ligand for fitting. 166,202 First, the protein model is refined into the experimental map and a synthetic map is simulated from the protein model. The amplitudes of the two maps are scaled to match and the density in the simulated map is subtracted from the experimental map leaving density corresponding to the ligand as a density difference map. In both reports, molecular docking was used along with the CCC of generated solutions to the difference map in order to find an initial placement of the ligand in the map. The top solutions were then flexibly refined into the experimental map. When both cases the approximate positions of the ligand corresponded well with experimental mutagenesis data, with respect to the distance from residues involved in drug action. However, neither was able to confidently identify a single solution, and in both cases an ensemble of high scoring solutions was reported.
This highlights the difficulty of fitting small molecules at this resolution as there is not enough information present in the experimental density maps for an accurate calculation of protein ligand complexes. However, it does seem that combining fitting with biochemical data can improve calculations.

| VALIDATING MODEL-MAP FIT
Once a model has been generated by rigid or flexible fitting (Sections 5 and 6), it is vital to assess whether it is an accurate and realistic (i.e., stereochemically plausible) representation of the molecule imaged in the context of the cryo-EM map. This is the goal of model validation, where a candidate model can be investigated in greater detail compared to during model refinement, where rapid evaluation of multiple models at each iteration is required. In an ideal scenario, model validation metrics should be both sufficiently informative to the researcher to identify an ideal candidate model/ s, and sufficiently simple to allow non-experts to trust the deposited models that they will use as a basis for further research.

| Model-map fit scoring functions
An essential component of model validation is ensuring that a proposed model is an accurate fit to the experimental cryo-EM density. Ideally, the metric used to assess model fit should not be the same metric, which is optimized during model refinement.
Cross-correlation (CC) is the most common scoring function for assessing the similarity between a map and model but there are other scores, including those which rely on comparing surfaces of maps (e.g., the Normal Vector score or the Chamfer Distance) rather than the entire density. 127,174,203 CC evaluates model fit by assessing the similarity between two pixelated volumes. In the case of model evaluation these would be the experimental cryo-EM map and a simulated map generated from the model (by calculating the normalized product between the densities in these two maps). 203,204 Generally, the two maps are zero-mean normalized during calculation of the CC, such that the CC reports the covariance between them and is not biassed by an offset between density values in each map. Variations of the CC score include, for example calculating the score, only on a subset of the map density values (e.g., on voxels within a mask or voxels above a set threshold). Alternatively, cryo-EM maps can be filtered prior to CC calculation, for example Laplacian filtering, 205 that calculates the second derivative of the cryo-EM density to enhance edge features in the map. These variations are implemented in different software such as PHENIX, as CC mask and CC vol , respectively, 204 and TEMPy. 206 As mentioned above, the Fourier Shell Correlation (FSC) is another commonly used measure of model quality. For model-to-map fit assessment, first, a simulated map must be generated from the model, and the FSC computed between this and an experimental map. A single resolution value can be extracted from the FSC curve at a specified cutoff, generally 0.5, which reports on both model and map quality. 204 Alternatively, it has been shown that integrating the FSC curves across the mid-low resolution range (6-10 Å) produces a score (FSCi) that is slightly more discriminative than real space correlation (CC). 207 Additionally, the FSC average is an implementation of map-model FSC that normalizes the correlation between each shell, based on their number of Fourier components. 156 Finally, by refining the model into one half-map, the FSC curve has been used to detect model overfitting (in which the model is refined against noise in the maps, rather than signal), by using the second half map as an independent test case. 156,157,[207][208][209] Multiple software packages implement the map-model FSC, and its variants, including Phenix, 13,204 Rosetta, 148 REFMAC, 156 and TEMPy. 206 A global score is necessary to evaluate overall model agreement with data, but localized deviations between the model and the experimental map can be obscured within this global average. Such deviations can be identified and investigated using "local" scoring functions (Figure 8), which can assess model fit for example, at the level of secondary structure, individual residues or even single atoms, depending on what the resolution allows. This kind of local assessment can guide the refinement process at different stages of flexible fitting. 167 In the following section, we introduce some popular local scoring functions, and then evaluate studies that have compared their effectiveness for scoring models in cryo-EM maps.
The CC score has been extended for use as a local scoring function in multiple packages. 203,206,212 In Chimera and ChimeraX the CC can be calculated on a selected segment. 116,213 Another example is the TEMPy python library, 174,206 which contains implementations of CC (or scores similar to CC such as Mander's Overlap) as local scores. PHENIX 214 and EMDA 212 also have their own local CC implementations; for example, in Phenix, the CC box score can be applied to local regions around specific molecules or residues. 204 For example, the segment based cross-correlation (SCCC) score is calculated for subsets of the model and corresponding portions of the map, depending on the protein's segmentation into "rigid bodies" (Table 3). These rigid bodies are essentially segments of the protein structure or sequence, for example individual ɑ-helices or β-sheets, and can be identified using the software RIBFIND. 163,218 This approach was used to identify, and flexibly fit, poorly fitting segments of Coxsackievirus A7 capsid protein into two low resolution maps (8.2 and 6.1 Å) that represented specific conformations associated with capsid maturation. 175 F I G U R E 8 Local fit assessment. (a) SMOC scoring to evaluate target T1099 (EMD-10800) from the CASP14 competition. The residueby-residue SMOC plot for three models is shown: the reference structure (blue), the best predicted structure before refinement (orange), and after refinement with TEMPy-REFF 162,210 (red), with the respective models shown in the experimental cryo-EM density (right). For higher resolution cryo-EM maps (<4 Å), it is possible to interrogate the model fit in more detail. EMRinger is a somewhat unique score that focuses on a very specific feature of a protein model: the position of each residue's C γ atom χ 1 dihedral angle. 215 The score is calculated by extracting the map densities around each dihedral angle, and searching for peaks in density values at 60 , 180 , and 300 , which are known to be favored positions for the C γ atom, based on high-resolution X-ray structures. 215,219 This makes the EMRinger score informative for both the local map and model quality, as a clear peak at an expected position indicates a residue is within a well-resolved portion of the map, and that it is positioned correctly.
Another score interrogating the model at the residue level is the SMOC score, which is computed using the Manders' Overlap coefficient (MOC), another early-developed score that captures per-residue local fit. 167 The SMOC is similar to the CCC, but not normalized around the mean values of each map (cryo-EM map and simulated model map). There are two methods available to calculate the SMOC score for a model: SMOCf, which is calculated as the central residue in overlapping windows along the protein sequence, or by selecting map voxels within a cutoff distance to each residue (SMOCd). Scoring each residue in the protein using either of these implementations gives a plot of the local model to map agreement. SMOC was also applied to validate predicted structures submitted to the CASP13 220 and CASP14 210 competitions ( Figure 8).
As cryo-EM maps were reaching atomic, or near atomic (<3 Å), resolution, it became feasible to score the fit of individual atoms in a model. The Q-score measures each atom's "resolvability," by computing a density fall-off around each atom in the model, which is then compared to a reference Gaussian. 216 This reference Gaussian is generated in real space using values based on a high quality, high resolution (1.54 Å) apoferritin map (EMD-9865). This means that the Q-score is closely correlated with the local resolution of the experimental map, as the signal fall-off measurement is essentially probing the local resolution.
FSC-Q is another scoring function that evaluates the fit of individual atoms, using an approach similar to the assessment of local resolution by FSC. 26 The score essentially evaluates the difference between the local resolution of the experimental cryo-EM map, and the local FSC of the model fit into this map. Using this approach, FSC-Q can identify overfitted portions of the model. FSC-Q has been shown to be complementary to real-space correlation-based scores, such as Q-score, which can erroneously give good scores to atoms lying in noise or artifacts in cryo-EM maps (Figure 8b,c).
The relative performance (capacity to identify poorly modeled regions) of many global and local scoring functions was recently assessed in the evaluation of CASP13 cryo-EM model targets 220 and the 2019 Cryo-EM modeling challenge. 221 In both studies the relative performance of multiple scoring functions were evaluated (Q-score was not included in the CASP13 cryo-EM model evaluation), with one important difference between the two assessments: in the CASP13 model evaluation, the submitted models were predicted from the sequence alone (i.e., they were not modeled against the cryo-EM density), and thus the quality of these models varies quite widely. This is in contrast to the 2019 cryo-EM modeling challenge, where research teams were invited to submit models for 4 maps; 3 maps of Apoferritin at reconstructed from the same dataset at varying resolutions (1.8, 2.3, and 3.1 Å) and a horse liver alcohol dehydrogenase (ADH) map at 2.9 Å resolution. This meant that all models were a close fit to density, compared to those submitted to CASP13. The results of the 2019 cryo-EM modeling challenge showed that there was overall poor correlation between scoring functions, and instead three clusters of similarly performing scoring functions were identified (Figure 9a). The first two clusters were notable by their relationship to the global and local resolution of the map: CC related scores (including SMOC, Figure 9a) scaled inversely with map resolution, that is, scores were lower for higher resolution maps. The inverse was true for the cluster of scoring functions that included EMRinger, Q-score, and FSC05, all of which give higher scores on average as the map resolution improves. Finally, a final cluster was identified that contained multiple TEMPy scoring functions and FSC avg score from REFMAC. 156 The scores in this cluster were all sensitive to noise in the solvent region of the cryo-EM map, compared to the other scores, which typically masked the maps around the protein signal. A somewhat similar grouping of scoring functions was observed in the CASP13 evaluation of cryo-EM targets, 220 where most scoring functions showed high pairwise correlation (Figure 9b) except EMRinger, which did not correlate well with any other evaluated scoring function. However, there is the caveat that the models evaluated in CASP13 were predicted based on sequence alone, that is, without access to the cryo-EM density, so "high-resolution" scores like EMRinger, are likely to give poor scores for all but the very best predicted structures, given that correctly placing amino acid side chains in predicted models is very challenging.

| B-factors in model fitting and assessment
In addition to local deviations between the model and map, there is often substantial resolution heterogeneity across cryo-EM maps. [26][27][28] Generally, the resolution decreases toward the edge of the imaged molecule, where residues are increasingly exposed to solvent, and therefore increasingly mobile. This mobility causes blurred density (i.e. worse local resolution) in the cryo-EM map, due to unaligned signal between the averaged particles in reconstruction. Similar effects are observed at mobile portions of molecules, such as loops. This local mobility can be represented within the model using the B-factor parameter (also referred to as atomic displacement factor [ADP] or temperature factor) for each atom in the model. Given B-factors are calculated per-atom, it is only practically possible for maps with resolution better than 5 Å. B-factor refinement is integrated into multiple model refinement methods. 148,161 However, only $50% of submitted models for the 2019 cryo-EM Model Challenge included refined B-factors, 221 and refined B-factors are largely not present in older models from cryo-EM maps. 222 Atomic B-factors can be calculated from cryo-EM maps by multiple methods; cryo-EM maps can be converted into structure factors and B-factors can be calculated as they would be in an x-ray crystallography experiment. 223 Alternatively, many packages calculate B-factors in real space, by optimizing the real space correlation between the map and model, while adjusting the atomic B-factors. 148,161 Finally, B-factors can be approximated using each atom's Qscore, 216,224 as the Q-score interrogates the local signal fall-off and is therefore related to the B-factor. A recent study assessed the accuracy of each of these methods for B-factor determination, 225 by assessing the similarity between B-factor weighted simulated maps and experimental maps. They found the Q-score derived B-factor was the most accurate. However, a downside of the Q-score derived B-factors is they require a global scaling factor, that must be individually optimized for each assessed map. In a more recent study, TEMPy-REFF 162 used B-factors for refinement based on the sigma of the Gaussian function describing each atom, from a Gaussian mixture model representation of the protein.
Taking into account these refined B-factors when simulating cryo-EM maps, by using them as weighting factors, improves the accuracy for local and global scoring functions, relative to maps simulated with a constant blurring factor ( Figure 10a). 162,226 Second, the B-factors themselves can act as a tool for validation, since they reflect the local flexibility of the individual atoms. Indeed, such methods were used in part to identify fraudulent x-ray crystallographic structures. 227 Singharoy et al. used the root mean square fluctuation (RMSF) as a similar value to B-factor. 150 The RMSF is calculated based on the MD flexible fitting (MDFF) method, which provides a distribution of fitted models. The local energy of these simulations scale based on the potentials within the cryo-EM map, meaning the RMSF score scales with the local quality of the map and reports directly on the local "reliability" of the model. Similar approaches have been used for both model and map validation. 148,[228][229][230] In another study, 228 10 individual models were calculated for each chain in the refined structure, and the standard deviation of each atom between the 10 structures was used to define an approximate B-factor. This idea of using an ensemble to model the uncertainty in the cryo-EM map (Figure 10b) has been used in multiple studies, including modeling of Tobacco Mosaic Virus structures at 4.3 Å 177 and in fitting into medium-to-low resolution maps. 174 This ensemble representation is similar to that used in NMR, where multiple models that are consistent with the experimental data are often deposited. 231 This approach was extended in a recent study 162 where B-factors derived from a GMM were used to generate model ensembles. These ensembles were then used to generate an ensemble map, which was shown to be a more accurate representation of the experimental cryo-EM density than the one obtained with maps simulated from a single model (Figure 10c). Indeed the authors showed that the correlation increases as more models are included in the ensemble, up to a saturation of the CCC (Figure 10d).

| Protein stereo-chemistry methods
Validating a model must not only take into account the fit to cryo-EM density, but also evaluate the agreement with known stereochemical properties of the imaged molecules. Typically, this assessment is done by comparing the geometry and stereochemistry of a candidate model to known baselines generated from reference datasets of high quality models. However, one must exercise caution when interpreting these values; it is commonplace to apply geometric restraints when refining models into maps with resolutions worse than 3 Å. These restraints include tightly restricting Ramachandran angles and backbone angles. 232 Model refinement methods that integrate MD simulations also intrinsically restrain some stereochemical properties through the force fields used in refinement, leading to models with less outliers, compared to those produced by other methods. 150,162 This can complicate model validation; applying excessive structural restraints during refinement can generate models that appear perfect, due to an absence of atom clashes or unfavorable stereochemical arrangements, but which are a poor or incorrect fit to the experimental data. 162,233,234 MolProbity is the most widely-used validation package for model assessment. Originally developed to validate models from X-ray crystallography, MolProbity scores models by comparing a candidate model to a carefully curated selection of 8000 high-quality and reliable models (termed the Top8000). 233 Using this approach, MolProbity identifies problematic features, including Ramachandran outliers, deviations in bond lengths, isomeric outliers, and many other properties, including some specific to RNA and DNA. 194,233,235 This toolbox was more recently extended to specifically deal with common problems present in models derived from medium-resolution cryo-EM maps (>3 Å) with the CaBLAM (C-Alpha Based Low-Resolution Annotation Method) tool. 233,236 CaBLAM uses Cα angular distributions to assess main-chain geometry and identify secondary structure. Although the Cα trace is a prominent feature in cryo-EM maps at $3 Å resolution range, the carbonyl bonds of the backbone are not visible, and peptide bond angles are commonly incorrectly orientated. CaBLAM uses a sliding 5-residue window to detect such Cα outliers directly, flagging backbone and conformation mis-assignments directly. If uncorrected, these deviations can have an additive effect on downstream residues, leading to large Ramachandran outliers that are often moved to local minima when corrected, sometimes masking the underlying problem rather than fixing it. 233 This issue was well demonstrated in the 2019 Cryo-EM modeling challenge, where more than half of submitted models contained no Ramachandran outliers, while only four had no CaBLAM outliers. 221 Other validation functions include statistical potentials of mean force by comparing a candidate protein to a reference dataset, including DOPE 237 and QMEAN. 238 More recently, AI-based approaches have been used to integrate multiple structural features into a combined score, these include ProQ3D, 239 the MODFOLD7 server, 240 and multiple scores calculated using the DeepRank network. 241 These methods have been used successfully to score predicted models in the CASP competitions, 242 but are also applicable to models refined into cryo-EM density.
Recently, machine-learning methods were also used to develop a more specific scoring function, Pi-score, that focuses on evaluating the quality of protein-protein interfaces between subunits in a model. 243 Here, 12 interface features were used derived from high-resolution crystallographic datasets to train a support vector machine (SVM) classifier. The SVM was then used to score the interfaces in candidate models, based on their classification with the positive or negative datasets, and their distance from the hyperplane. The Pi-score was shown to be an effective validation score for CASP13 models, correlating well with other multimer scoring functions 243 and has recently been used to aid in model building for the massive NPC structure assembled partly from Alphafold2 predicted models. 134

| VISUALIZATION TOOLS FOR CRYO-EM STRUCTURE AND MAPS
A core, if somewhat obvious, objective in structural biology is to learn something new about the imaged molecule by visually inspecting its generated model. The first atomic structures were solved by x-ray crystallographic experiments with the aid of drawings and the assembly of physical ball-and-stick models. With the advent of computers with graphical displays, the visualization process shifted to software-based rendering. Today, the importance of visualization in the field of structural biology, is reflected by the sheer number of visualization tools available. Various ways of illustrating chemical and molecular features have been developed to aid various tasks. For example, visualizing atomic models is often enhanced with surfaces rendered at the van der Waals radius of the atoms. These surfaces can be colored according to local electrostatic and hydrophobic potentials, which could be useful for tasks such as computational drug design. In contrast, Richardson "ribbon" diagrams eschew atomic details instead focusing on the path of the backbone and the secondary structures it forms. Today, ribbon diagrams are now perhaps the most ubiquitous form of protein depiction in use.
While software has made the depiction of models a mostly automated task, building the models still requires significant user intervention. In crystallography, where high resolutions are frequently achieved, visual inspection of model and map in a residue-by-residue fashion is recommended. 244 This advice is also valid for high-to-medium resolution cryo-EM maps. However, when only lower resolution is achieved, integrative approaches are often needed, and these, in addition to the cryo-EM information, can include existing solved structures, machine-learning models, chemical and physical restraint libraries, as well as expert insight. Regardless of the source of information or the computational methods used, it is often ultimately the responsibility of the biologists to assess these theoretical structures and their various scores and criteria. Thus, visualization in the domain of model building has an important additional aspect: to aid the model building process by conveying information from these different sources.
Before exploring the various tools, which are now available, it is worth briefly exploring how we have arrived with such a diverse offering of software. Many of these tools were developed for specific tasks. VMD for example, 245 was developed to visualize MD simulations from the NAMD software. 246 Due to the complexity and expertise required to build robust visualization software, many of the most popular tools have convergently evolved to support plugins and extension mechanisms, allowing their powerful graphical capabilities to be adapted to new tasks. VMD now supports other MD engines such as GROMACs. 247 Chimera 116 and its successor ChimeraX 213 were also originally developed with extensibility as a central goal, largely due to the limited capacity for extension in their predecessor software Midas. 116,248 While the visualization software space may seem overcrowded, competition between packages has been an important source of innovation. One of the challenges of presenting 3D data on a 2D display is conveying depth. A number of techniques to improve "depth cueing" such as ambient occlusion were pioneered in QuteMol 249 and were swiftly adopted by other visualization tools. In the Richardson group, independent implementations of highdimensional visualization features in two kinemage programs KiNG 250 and Mage 251 ultimately lead to a "more generalpurpose and more powerful" implementation. 250 Visualization tools typically provide more than just the ability to inspect atomic models and volumetric data, but also the ability to either build or modify atomic models. Many map related tasks such as sharpening, cropping, and contouring are also greatly aided by the immediate feedback of interactive graphical tools. Tools such as Coot 143 aid this manual validation and fitting process and offer tools for fixing incorrectly modeled regions such as jiggle-fit 185 and backrub. 252 ISOLDE, 253 which is developed as a plugin for ChimeraX and builds on the density guided force-field of MDFF, 153 provides an interactive MD simulation for fitting atomic models to density. ISOLDE and Coot are both aimed at high-resolution volumetric data where density often provides enough information to identify and fit models by visual inspection. While it is still recommended to inspect the fit to density of the resultant models using visualization software, 244 as the resolution worsens, visually assessing fit becomes increasingly difficult as atoms and residues become less visible. However, secondary structures such as alpha-helices and to a lesser extent beta-sheets, remain recognizable both by eye and using automated techniques, between $4.5 and 10 Å resolution. Gorgon 142 is an interactive tool, which combines secondary structure detection using SSEHunter 77 and sequence data with visualization. Sculptor 112 combines the pattern matching and feature vector based fitting methods of Situs 111 along with interactive peak picking for exploring candidate solutions. Gorgon and Sculptor are both standalone tools which may explain why they never gained significant traction. Plugins and integrated tools get to exploit the features and inertia of the host visualization software and its ecosystem. Below are some popular tools used to visualize cryo-EM maps and fitted models:

| ChimeraX
As the successor to Chimera, ChimeraX builds on the core and extensions architecture of its predecessor but offers improved graphics, performance, and new features. ChimeraX's new rendering engine supports an implementation of ambient occlusion that runs on commodity hardware. 254 To handle the ever larger molecules and assemblies which are being generated, representation of biological information is now largely achieved through performant array-based data structures rather than the earlier MMTK 255 structures used in Chimera. Besides excellent tools for working with density maps, ChimeraX offers support for the mmCIF format, which is becoming an important exchange format in the integrative/hybrid modeling community. 110,113 With new widgets for visualizing information such as from cross-linking experiments, ChimeraX is an excellent choice for exploring such data. The most recent 1.3.0 release offers integration with the new EBI AlphaFold2 database 135,256 as well as tools to run AlphaFold2 on amino acid sequences.

| Isolde
ISOLDE is a popular plugin for ChimeraX that extends its visualization and presentation of data, by offering a mechanism for fitting structures to density interactively ( Figure 11). ISOLDE offers an MDFF style density-based potential along with adaptive distance restraints. 257 The user plays a pivotal role in fitting the structure by adding restraints and dragging atoms into density. Like Coot, ISOLDE provides a number of useful visual cues such as an interactive Ramachandran plot and a traffic light coloring scheme to indicate C-alpha atoms with problematic torsion angles. ISOLDE also offers adaptive distance restraints for maintaining local geometry.

| iTEMPy
iTEMPy (Interactive TEMPy) is a TEMPy2 206 plugin for ChimeraX which aims to provide an accessible user interface for scoring and visualizing models at worse than 5 Å. Like ISOLDE, scoring and fitting occur simultaneously and can be directed by the user interactively. However, unlike ISOLDE, which focuses its validation on model geometry, Interactive TEMPy offers feedback on local model-map fit using SMOC scores. 167 This is necessary as at medium and worse resolutions it is difficult to judge the fit by eye. The tool aims to provide a flexible environment for working with models and maps allowing users to use ChimeraX's facilities for fitting models and shaping maps. Due to the low resolutions targeted by the tool, a combination of MDFF style density-driven fitting and a rigid-body fitting approach similar to Flex-EM are provided. Here, users are able to break the model into small rigid-bodies by either manually grouping secondary structure elements or by using the integrated RIBFIND 218 widget. Large proteins and cryo-EM maps are particularly challenging for visual inspection. A SMOC plot which allows interactive "zoning" of the density map aids this process.

| Coot
A popular tool for model building and refinement, Coot 143 provides a set of features for working with high resolution (2.5-4 Å) density maps. In addition to backrub and jiggle-fit, Coot offers routines for loop modeling and "fragment" (short polypeptides sequences) oriented tools. Coot is extendable through Guile Scheme (https://www.gnu.org/software/guile/) and Python scripts. Besides visualizing structures and density, a number of methods are provided to visualize various validation metrics and make their inspection more ergonomic. Such features include the interactive Ramachandran plot as well as cues for clashes and a per-residue fit-to-density scores. A recent addition which greatly aids flexible fitting of models are the German-McClure adaptive distance restraints. 257 These are able to maintain local-geometry but become weaker over longer distances allowing long range flexibility and conformational changes to take place.

| VMD
Originally designed to render MD trajectories from the NAMD package, 246 VMD 245 has a long history of visualizing dynamic molecular information. MDFF 153 and xMDFF 258 are plugins for VMD which pioneered the density based F I G U R E 1 1 Snapshots from ISOLDE 253 implementation in ChimeraX. 213 (a) An interactive Ramachandran plot, which shows in realtime the Ramachandran angles of the atomic model as the simulation progresses. Residues with problematic geometry can be clicked on (indicated by red arrow) which centres the 3D view on the residue (b, outlier residue highlighted with red ball). ISOLDE offers a number of facilities to correct these issues automatically or by manually tugging on atoms.
fitting approaches now used in ISOLDE. NMWiz 259 is another plugin which allows the normal modes of molecules to be calculated and visualized with the aid of arrows indicating vibrational modes. VMD allows extensions to be written in C, C++, Tcl/Tk, and Python. VMD supports an optional ray-tracing rendering engine for the production of highquality images, however a CUDA compatible NVIDIA graphics card is required to use this feature in real-time.

| PyMOL
PyMOL is a popular tool for visualization of models, particularly for drug design and rendering high-quality 3D images. As the name suggests, PyMOL is scripted via the Python programming language. Although the programming language was then relatively young, the choice paid off, with Python now being a popular programming language in the scientific community. Originally developed as an open-source project under the more permissible Python License it is now distributed as proprietary software. PyMOL has an inbuilt ray tracer for producing photo realistic renderings and can optionally export scenes for the PovRay renderer. This has made PyMOL a popular tool for producing images for publication. Its extensibility has led to the development of a wide range of plugins for tools such as AMBER 260 and AutoDock Vina. 198 PyMOL has the capacity to render density from x-ray and cryo-EM experiments, however it offers less functionality for modifying this type of data compared to ChimeraX.

| Online visualization tools
Much work has been done in making data more accessible and transparent. 261 Since the establishment of data repositories such as the wwPDB 74 and the EMDB 73,262 (for atomic models and cryo-EM maps, respectively), attention has been drawn to the complexity in interpreting this data. In the cryo-EM community it is now commonplace for authors to publish the recommended contouring level to make visual assessment of models reproducible. While this can help with interpreting map/model fit, X-ray models, and cryo-EM maps are themselves an interpretation of the experimental data they are derived from. Despite the large of size of the raw data, scientists are now encouraged to archive this information as well in the IRRMC 263 and EMPIAR 264 (X-ray diffraction and cryoEM image stacks, respectively). In order to make the assessment of newly derived models and maps easier, Molstack 265 offers a web based tool with a split screen Coot-like interface, allowing two models and maps to be easily visually compared. Indeed, web based interactive visualization is becoming more mainstream. Since the development of key technologies such as WebGL, 266 numerous libraries and platforms for creating 3D graphics have appeared. Structural data banks have been swift to adopt these new tools such as Mol* 267 and NGL 268,269 for the presentation of models, replacing the static images which struggled to convey the 3D nature of molecules. These general purpose visualization tools tend to focus on the display of the structure in an unbiased manner, however, scientific communication in journals often requires more illustrative techniques to convey information. Scientists use visualization software, color, various atomic representations and potentially arrows and labels to create illustrations, which draw attention to features of interest while at the same time maintaining relevant context. However, these carefully crafted images are often still difficult to interpret as they poorly convey depth. Supplementary videos may help with interpretation, but frequently, exploring molecules from different angles is required to grasp the structure. 3dRS 270 is an example of a web based tool which addresses this requirement. Besides being able to share and preserve a particular representation of a molecular scene, it also allows users to create their own graphical representations using a forking mechanism similar to that found in version control software like Git. 271 With protein dynamics intrinsic to function as well the interpretation of density, 3dRS has the capacity to animate trajectories. Perhaps, one of the key advantages of the web platform is its accessibility, with no requirement for the user to install additional software. EzMol 272 capitalizes on this simplicity, offering a tool which aims to be beginner friendly through the use of a "wizard" for step-by-step configuration of a visual scene. Adoption of tools such as 3dRS by e-journals will likely save scientists time in developing visualizations as well as improving their audience's understanding of their findings.
While the web offers significant promise for communicating scientific findings, discussing structures in a meeting or with distant collaborators can be difficult even when in the same room. Participants may struggle to communicate what feature they are looking at, particularly when faced with an unfamiliar protein. ChimeraX offers functionality to conduct meetings in a virtual reality setting. Pointing devices allow attendees to address features of a structure and for other participants to see what they are looking at, improving discourse. Virtual reality glasses also offer stereoscopic vision, which improves depth perception. Although virtual reality assists in 3D perception of objects and offers an exciting new way of exploring molecules, 3D printing 273 is allowing rapid construction of physical models.

| Machine learning and structure prediction
One of the compelling aspects of machine learning methods is their ability to "learn" tasks which are difficult to describe in an algorithmic fashion. Feature detection, which is described earlier, is an important part of fitting structures both by eye and with the aid of software. Recently, a combination of Haruspex feature detection and SMOC scores were used to identify problematic structures in a large collection of published models. 274 This is an excellent example of how scoring and machine learning can be used to lighten the cognitive load on users by drawing attention to regions of high or low confidence.
A new notion of confidence in the form of plDDT scores from AlphaFold2 135,275 and related machine learning methods 136 are proving beneficial to interactive fitting applications. In addition to the groundbreaking accuracy of the predicted structures themselves, the confidence scores offer another technique for gaining insight into regions of rigidity such as domains and the flexibility between them. Recent versions of ISOLDE for example can treat this confidence as a form of restraint, which can be applied to fitting structures.

| OUTLOOK AND CONCLUSIONS
Cryo-EM has become the structural biology technique of choice for molecules and complexes larger than $150 kDa. Both in vitro purified samples, and molecules in their native cellular environment can now be visualized at resolutions better than 3.5 Å. 2,8 Indeed, for a subset of sufficiently stable samples, true atomistic resolution is now achievable by cryo-EM. 41,42 However, for a variety of reasons, including difficulties such as low number of particles, preferred orientation, and inherent flexibility of structures, many samples are still confined to the medium resolution range or even lower ( Figure 1). Indeed, the resolution is often heterogeneous within single maps, including those solved at a nominal high resolution. As a result of this broad resolution range, the landscape of model building, refinement and validation tools in cryo-EM is similarly wide-ranging, and is continuously evolving (Figure 12).
Perhaps the most notable recent development in the structural biology field is the highly accurate machine learning based protein structure prediction methods. 135,136,[276][277][278] These methods are already having an impact in cryo-EM model building: multiple model building softwares have specific functionalities to integrate predicted models into their model refinement protocols. 213,214,257 Such models have already been used in assembly fitting and/or refinement into cryo-EM maps, when the structures of some of the individual subunits is unknown. 134,138,279 In addition to these recent advances, we expect further progress in structure prediction of protein-RNA/-DNA complexes, protein-ligand complexes, as well as in integrating post-translational modifications.
The development of machine-learning methods is also accelerating in other applications relevant to cryo-EM, both in the image processing as well as model building and validation. 34,43,64,105,243 For example, machine learning-based image classification methods can help to identify more distinct states within cryo-EM datasets. Mapping this conformational heterogeneity of an imaged molecule is a unique and powerful capability of cryo-EM and could serve as a basis to better understand cellular mechanisms, particularly by being complemented with MD simulations. MD methods are important in "connecting" between states and simulating short-lived conformations and/or interactions with lipid membranes. 280 However, such simulations are generally restricted by large computational requirements, particularly when working with large complexes such as the ones typically solved by cryo-EM. There has been recent work to address this problem, including with coarse-grained simulations, which are useful for modeling of sub-cellular systems with large complexity, [280][281][282] or by using novel computational representations of protein complexes. 59,60 Further work will also likely also focus on how local resolution and model validation could be used to explore the inherent flexibility of protein assemblies observed by cryo-EM.
Another direction is the integration of cryo-EM modeling approaches with small-molecule structure based drug design (SBDD). A recent paper has demonstrated successful fragment screening using high resolution (<3 Å) cryo-EM, 283 a technique that until now has almost exclusively been applied to X-ray crystallography. In the study it was shown that not only was cryo-EM able to unambiguously resolve the positions of small fragment molecules and sidechain conformations changes upon binding, but also ligand density corresponding to fragments could be unambiguously assigned when protein targets were incubated with multiple fragment molecules. Crucially, using cryo-EM for SBDD increases the number of potential targets, as many disease related proteins, including many membrane proteins, are not amenable to X-ray crystallography. However, as mentioned above, most maps are still at a resolution worse than 3 Å, including many potential new drug targets. We expect that in the future more methods integrating underutilized z F I G U R E 1 2 Flowchart depicting the cryo-EM model building process.

of 38
cryo-EM maps (>3.5-4.5 Å resolution) with small molecule docking (and machine learning approaches) will be developed to take advantage of these potential new targets.
Finally, we expect further consolidation of new software and computational developments into large software suites. [12][13][14][15] These are gaining popularity as accessibility of scientific software can be a problem: new methods are often developed as command-line (CLI) applications, which can be useful for experts but intimidating to novices, who may wish to use these techniques on their experimental data. Graphical User Interfaces (GUIs) offer a more user-friendly approach but typically take more time to develop and maintain. For example, the CCP-EM software suite offers an easily installable graphical frontend to useful methods for reconstruction, fitting, refinement and validation and also includes visualization tools. 12 One of the advantages of the CLI approach is the possibility to build new tools or protocols by composing individual programs into pipelines. CCP-EM and Phenix 13 along with tools such as Cow-EM and Scipion 15 attempt to give some of this power to less technical users, by providing mechanisms for designing pipelines using either a "visual programming" model or a GUI for defining computational workflows.