Energy–Structure–Function Maps: Cartography for Materials Discovery

Some of the most successful approaches to structural design in materials chemistry have exploited strong directional bonds, whose geometric reliability lends predictability to solid‐state assembly. For example, metal–organic frameworks are an important design platform in materials chemistry. By contrast, the structure of molecular crystals is defined by a balance of weaker intermolecular forces, and small changes to the molecular building blocks can lead to large changes in crystal packing. Hence, empirical rules are inherently less reliable for engineering the structures of molecular solids. Energy–structure–function (ESF) maps are a new approach for the discovery of functional organic crystals. These maps fuse crystal‐structure prediction with the computation of physical properties to allow researchers to choose the most promising molecule for a given application, prior to its synthesis. ESF maps were used recently to discover a highly porous molecular crystal that has a high methane deliverable capacity and the lowest density molecular crystal reported to date (r = 0.41 g cm−3, SABET = 3425 m2 g−1). Progress in this field is reviewed, with emphasis on the future opportunities and challenges for a design strategy based on computed ESF maps.


Introduction
Extended frameworks such as porous coordination polymers (PCPs), [1] metal-organic frameworks (MOFs), [2] and covalent organic frameworks (COFs) [3] have changed the face of materials chemistry. The crowning achievement has been the development of modular synthetic strategies where frameworks can be built to order because the linkers assemble in an intuitively predictable fashion. This predictability relies on a specific, directional functionality into materials to ensure that the components assemble correctly.
A different strategy is to predict molecular assembly, rather than coerce or systematize it: that is, to introduce only the basic molecular functionality that is desired for a given application (e.g., light absorption, acidity/basicity, a shape-persistent cavity, etc.) and then to identify such molecules that happen to assemble in an appropriate fashion-for example, to provide high charge carrier mobility or to render the material porous. This a strategy of selection-or exploration-more than purposeful solid-state design, and it relates to ideas discussed by Jansen and Schön, [8] and the concept that "humans are the explorers, and not the creators of chemical worlds." The differences here are more than philosophical; such strategies might, ultimately, uncover nonintuitive materials with properties that are hard to access by more conventional crystal engineering.
There is an obvious problem: in lieu of a complete and predictive understanding of molecular assembly, how do we select the appropriate molecular building blocks? One experimental approach is high-throughput screening, but even with sophisticated robotics, there is usually a property measurement bottleneck that thwarts brute-force screening, especially for materials with hard-to-measure functions, such as magnetism, porosity, or charge carrier mobility, which are not amenable to automated solution-based assays. An alternative approach is the computational prediction of molecular crystal structures and their associated physical properties (Figure 1a). De novo computational strategies are applicable to both hypothetical chemical species and known molecules. In principle, the screening can therefore be brought forward in the discovery process, in advance of molecular synthesis, which is a particular advantage for complex molecular building blocks that might require months of synthetic work to access even milligram quantities for preliminary testing.
Computational methods for crystal structure prediction (CSP) involve a global exploration of the multidimensional lattice energy surface for stable energy minima, and an assessment of the relative stabilities of the resulting structures. An exhaustive search of crystal packing possibilities using sufficiently accurate lattice energy calculations is a demanding computational exercise. The computational cost of the methods and unreliability of the off-the-shelf force fields used in early implementations of CSP [9] have limited the widespread use of these methods. The application of CSP to organic molecules has largely focused on the anticipation of polymorphism in pharmaceutical materials, [10] where the desired output is a complete, detailed understanding of the possible crystal structures. In this application area, the use of solid-state density functional theory (DFT) for assessing relative stabilities has gained popularity; [11] the high cost of DFT can be justified when the computational effort is focused on the crystal packing of one target molecule. Applications of CSP to functional materials have Adv. Mater. 2018, 30,1704944 Figure 1. a) General scheme for the selection of functional materials using de novo structure and property prediction. Starting with a conceptual molecule chosen from "chemical space," the molecular structure is predicted, followed by the crystal structures. The physical properties are then computed from the predicted crystal structures to describe the "function space." b) Scheme for crystal-structure prediction, starting with a chemical diagram; each point on the density-lattice energy plot represents a local minimum structure on the lattice energy landscape.
been less common, [12] at least for molecular crystals, aside from a relatively small number of studies for small-molecule organic semiconductors, [13] pigments, [14] fluorescent molecules, [15] and explosives. [16] However, the development of efficient, accurate force fields, coupled with highly parallelized structure searching methods [17] and rapid improvements in computational hardware mean that it is now possible to apply CSP in screening applications for the discovery of functional organic materials ( Figure 1b) and to couple this with property predictions. Recently, this has led to materials with unprecedented properties, such as the lowest density molecular solid. [18] 2. Crystal Structure Prediction for Functional Organic Materials

Porous Organic Cages
One important insight from applications of CSP to molecular organic crystals is the close energetic spacing of structurally disparate crystal structures of the same molecule, which results from the weak nature of intermolecular interactions. A consequence of small energy differences between different packing motifs is that small changes to the molecular structure easily shake up their energetic ordering. Thus, even closely related molecules can crystallize quite differently to produce materials with divergent physical properties.
As an example from the field of porous molecular crystals, [19] the organic cage molecule CC1 is nonporous because the cage cavities are disconnected in the crystal structure (Figure 2a), but a close analogue, CC2, has 1D pores between the cages. A second analogue, CC3, has a 3D interconnected diamondoid pore structure that runs through the cages themselves. [20] As such, the solid-state properties of the three cages shown in Figure 2a cannot be anticipated intuitively from the structure of the cage molecules alone, and these structurally related cages do not form isostructural crystals or isoreticular pore networks. Application of CSP to such molecules, however, can predict the most stable crystalline forms correctly based on knowledge of only the molecular formulae ( Figure 2b). [21] CSP calculations also explain the marked preference for certain organic cages to crystallize as heterochiral crystals, rather than in homochiral forms, as manifested by the large calculated lattice energy gaps for the (CC3-R, CC3-S) racemate relative to enantiomerically pure CC3-R (32.1 kJ mol −1 for A vs B) and the (CC1-S, CC3-R) quasiracemate versus homochiral cage cocrystal (18.8 kJ mol −1 for structure E vs structure F). The heterochiral preference is mostly a result of favorable intermolecular interactions between cage windows of opposite chirality. However, this chiral pairing rule-which holds quite broadly for small [4+6] imine cages derived from triformylbenzene [22] -breaks down for a larger analogue, CC5, even though that cage is produced using the same chemistry and it has the same molecular symmetry. Racemic CC5 crystallizes as a chiral conglomerate in all solvents tested, rather than as a crystalline racemate. This illustrates again the nonintuitive nature of molecular assembly, but the effect is rationalized by CSP: there is a 32.5 kJ mol −1 energy gain in forming the chiral conglomerate for CC5, as opposed to the racemate (Figure 2b; C vs D). [21] These large energy gaps are atypical: for other, less directional cage molecules, such as CC1, the calculated lattice energy differences between structures were smaller [23] and in-line with the landscapes of most organic molecules that have been studied by CSP. [24] Hence, CSP methods provide a means to distinguish between molecules such as CC3 with deep-lying, unique global minimathat is, "robust tectons"-and molecules such as CC1 that are more likely to exhibit polymorphism. [25] To reduce computational expense, these CSP calculations were performed on the neat cage structures, without considering the possible effects of enclathrated crystallization solvent on relative stabilities. However, the lattice energy differences between the candidate structures are sufficiently large to allow predictions that agree with experiment. Also, the cages were treated as rigid in these simulations. In other cases, inclusion of molecular flexibility in the calculations is required to capture the experimental crystallization behavior: this can succeed, [23] albeit with increased computational cost. CSP approaches can also be applied to porous cages with different geometries, such as ditopic tubular imine cages [26] with the same window geometry as CC3. Again, the energy landscapes revealed using CSP rationalize the preservation of the chiral window pairing in these systems, suggesting that chiral window-window interactions between cages are strong and directional enough to be compared with directional coordination bonds in PCPs and MOFs.
For these organic cages, the main property of interestporosity-is prefabricated into the cage molecule itself. As a result, a high percentage of the hypothetical structures on the CSP energy landscapes in Figure 2b are predicted to be porous. In fact, the lowest energy predicted structures for both chiral CC3-R and racemic (CC3-R, CC3-S) are also the densest predicted structures-they minimize the free volume in the crystal-and indeed experiments show that amorphous CC3 is significantly more porous than its most stable crystalline forms. [22a] Hence, these energy landscapes offer an insight into possible design strategies for the future: one way to increase porosity in organic cage crystals is to make the cage molecules larger [27] (e.g., surface areas of CC3-R and CC5-R are ≈400 and ≈1300 m 2 g −1 , respectively), [21] while another strategy is to identify cages that pack less effectively [28] -although this might also reduce the large energy gaps illustrated in Figure 2b, promoting polymorphism.

Energy-Structure-Function (ESF) Maps
CSP is a valuable tool for predicting the crystal structures of porous organic cages and, hence, their likely porous properties. However, the predicted energy landscapes in Figure 2b convey no direct information about the likely physical properties of the various structures. Each energy landscape can contain hundreds, sometimes even thousands, of structures within the accessible energy window and each structure-that is, each point on the landscape-is associated with a different set of physical properties. Given the size and potential complexity of these landscapes, it is challenging to assess the fitness of a given molecule for a target application based on the energy landscape alone. To address this, we have developed ESF maps, where calculated properties of interest are superimposed onto the lattice energy landscape, represented here via a color-coding of the points (Figure 3). [18,29] We first exemplified this for extrinsically porous molecules, such as the hydrogen-bonded frameworks developed by Mastalerz. [30] Lattice energy landscapes for such molecules can be surprisingly complex: for example, a trigonal benzimidazalone, T2, was found to have at least four stable porous polymorphs, [18] all of which were located on the low-energy boundary, [31] or "leading edge" of the calculated energy-structure landscape (Figure 3a). Three of these polymorphs (T2-b, T2-g, and T2-d) were previously unknown, illustrating the potential for CSP to reveal hidden detail for molecules that have already been studied. The  2. a) A fundamental problem in molecular crystal engineering is that even quite similar molecules can assemble differently to give crystals with divergent physical properties. Here, three structurally related [4+6] organic cages crystallize to give a nonporous material (CC1), a material with 1D pores (CC2), and a material with 3D pores (CC3). [20] b) Crystal-structure prediction explains the crystal packing for porous cages. The experimental single-crystal structures (red) are predicted accurately (blue) using only the molecular diagram as the input. [21] The chiral crystallization preferences of these cages are explained by the large calculated energy gaps between structures A and B, C and D, and E and F. Figure 3. a) Energy-structure-function (ESF) maps allow the discovery of new porous crystals. [18] T2 has at least four porous polymorphs (T2-a-T2-d), three of which are found in "spikes" on the lattice energy landscape. No such stable regions are found for T0 or T1. b) Four stable polymorphs of T2 0.412 g cm −3 and an experimental surface area of 3425 m 2 g −1 , close to the a priori predicted values of 0.417 g cm −3 and 3230 m 2 g −1 . At the time of writing, T2-g is the lowest density molecular crystal reported in the Cambridge Structural Database.
These ESF maps can also suggest, at a glance, which molecule in a candidate set might be the most appropriate for a given application. For example, to be of practical interest for methane storage, a porous solid must deliver at least 150 v STP/v of methane-and preferably more than 180 v STP/vover a practicable storage and release cycle. ESF maps predict that T2 is the only molecule in a set of eight geometrically diverse candidates that has potentially viable structures (e.g., T2-g, Figure 3a) that might be interesting for methane storage. Methane sorption experiments for T2-g validated these predictions: this polymorph has a saturation methane storage capacity of 437.4 v STP/v at 115 K, in agreement with simulations, and a deliverable capacity calculated from sorption isotherms at higher temperatures that match the ESF map prediction of 159 v STP/v.
It may be noted that the four observed polymorphs for T2 all have calculated lattice energies that are substantially higher than a denser form, T2-0, which is the global minimum structure on the predicted energy landscape (Figure 3a). This latter polymorph has never been observed to crystallize from solution. The preferential formation of the higher-energy polymorphs T2-a-T2-d by solution crystallization can be explained by solvent stabilization calculations, [18] which showed that the solvated forms of these polymorphs are more stable than T2-0.
This CSP approach was subsequently applied to a hypothetical, extended analogue of T2, T2E, which was predicted to crystallize isostructurally to T2-g but with 2.83 nm hexagonal mesopores. Based on this, T2E was synthesized and the predicted hexagonal form, T2E-a, was crystallized from a mixture of DMF and acetone. If it can be desolvated without loss of crystallinity, T2E-a is computed to have an unprecedentedly low density of 0.303 g cm −3 and a high gravimetric H 2 delivery capacity of 14.5 wt% at 100 bar and 77 K.
It is also possible to predict more complex, second-order properties for these hydrogen-bonded molecular frameworks-for example, we produced "energy-structure-selectivity maps" that can be used to identify porous materials with the highest gas selectivity. [18] ESF maps have also been applied to porous organic cages, in this case mapping out the narrowest pore diameter to investigate the effects of adding methyl groups to the cage windows. [25] These computational predictions are revealing: in cage CC15-R, the addition of methyl groups disrupts the window-to-window packing mode, and this is apparent in the computed structure-energy landscapes (Figure 3c, left). By contrast, in a cocrystal of CC15-R, (CC3-S, CC15-R), the window-to-window packing is preserved, as highlighted by the prediction of a well-separated, unique global minimum structure (Figure 3c, right) in which the methyl groups narrow the pore diameter, as intended. This shows how ESF maps can be used in tandem with more intuitive design strategies: methyl groups in CC15-R were included purposefully to narrow the pore size in the molecule, but this had the unintended consequence of disrupting the window-to-window packing, leading to a crystal where the pore size was not, in fact, narrowed as we had planned. Both the effect on structure and the resulting effect on function are made clear by colorcoded ESF maps (Figure 3c).

Beyond Porous Solids
ESF maps are a general concept that can be applied to any physical property that can be computed from crystal structures. As such, this approach can be extended beyond porous solids. For example, in a pharmaceutical materials context, the mechanical properties evaluated for low-energy predicted structures of an active pharmaceutical ingredient can predict their compressibility and tabletability. [32] In the field of organic electronics, ESF maps were calculated for electron mobility in azapentacene molecular crystals, and a Boltzmann-like averaging function was proposed to select the most promising candidate from a set of hypothetical molecules (Figure 3d). [13a] The study of substituted pentacenes using ESF maps also demonstrates how this approach can highlight structure-property relationships, in this case quantifying the differences in average charge carrier mobilities between herringbone, stacked, and sheet-like packings. Unlike trends drawn from experimental measurements on crystals of families of related molecules, here the structure-property relationships can be observed across a large set of crystal packings of the same molecule, eliminating differences due to intrinsic molecular properties and isolating the effect of crystal packing. We expect that ESF maps will play a role is uncovering new structure-property relationships as their use is extended to more properties. There is a bright future for the selection of molecular organic crystals for a range of applications, provided that the function of interest-or a descriptor that relates to this-is computable with sufficient reliability and on an affordable timescale.

Outlook
As computational tools and hardware develop, crystal structure prediction is set to have a disruptive impact on the field of crystal engineering, which is currently based on simple, intuitive concepts of molecular assembly. Today, the dominant design strategy for both extended frameworks and molecular Adv. Mater. 2018, 30, 1704944 were isolated in the laboratory, with good agreement between prediction (red) and experiment (blue). The superior methane deliverable capacity for T2-g (159 v STP/v), suggested by ESF maps in (a), was also confirmed experimentally. c) ESF maps for organic cages reveal that methylation of the cage disrupts window-to-window packing for CC15-R, leading to loss of the unique global-minimum packing mode (left), while a (CC3-S, CC15-R) cocrystal preserves the window-to-window packing mode, leading to a material with a very narrow pore diameter (PD = 1.63 Å), as per the color-coded ESF map (right). [25] d) ESF maps of electron mobility for three azapentacene molecules (5A-5C). [13a] While the landscape for 5A (left) has multiple low-energy structures with high calculated electron mobility, 5C (right) was proposed as the favored molecule based on a Boltzmann-like probabilistic analysis. crystals is purposeful crystal engineering via structural analogues, achieved by exploiting directional covalent, coordinative, or noncovalent bonding. Indeed, these design tools have become so powerful and appealing that the design "pull" sometimes exceeds the technology "push": MOFs, in particular, have been evaluated for an enormous range of applications; in some cases, the elegant modularity of the chemistry, more than the inherent nature of these solids, seems to have driven this. As argued above, a different strategy is the selection of materials based on computational prediction of structure and functionin essence, to purposefully design only the core, target functionality into the molecular building block, and to let self-assembly do the rest. In the long view, this is an attractive approach that could lead to simple, pared-down functional materials, shorn of any superfluous structure-directing functions. There are several challenges, though, that must be tackled first for such approaches to become more widespread.
The first question raised is usually speed, or computational expense. At least for some classes of molecules, timescales for computation of structure and properties can already compete with experimental timescales. For example, computation of ESF maps for the largest hydrogen-bonded system that we studied, T2E, required around one month using national highperformance computing facilities, [33] whereas to work out a synthesis for T2E and to grow diffractable crystals, initially on the scale of a few milligrams, involved more than three months of laboratory effort. Computational selection strategies are therefore particularly attractive for large, synthetically complex molecular materials. The high cost involved with CSP for T2E was partly due to the molecular size, but also to the need to exhaustively search for crystal structures within a wide (100 kJ mol −1 ) energy window, in order to capture all potentially porous crystal structures that could be stabilized by solvent inclusion. CSP can be completed on much shorter timescales for smaller molecules, and applications where all relevant structures will be found within the usual energy range for polymorphism (≈7-10 kJ mol −1 ). For example, CSP for substituted pentacenes for organic electronics [13a] were completed in under a day for each molecule using modest computational resources; as such, higher throughput computational screening can be envisioned for some applications, where many molecular variations can be assessed computationally on relatively short timescales.
Applications of ESF maps more generally face some challenges: for example, T2E is large but it is also rigid, and the cost of CSP for more flexible molecules increases dramatically; flexible molecules often do not adopt their lowest energy conformation in their crystal structures, [34] meaning that the search for stable crystal structures might need to explore broad, strongly coupled inter-and intramolecular phase spaces. For some ESF maps, the computational expense will be defined more by the physical properties to be calculated. Properties such as pore volume are computationally very cheap, even for maps containing thousands of predicted structures, while other properties, such as gas selectivity and many electronic properties, are much more expensive. In this regard, porous molecular solids have both advantages and disadvantages: surface area, pore volume, and pore topology are cheap to calculate, but it is also necessary to consider a broad range of densities and, hence, a larger number of structures (see, e.g., Figure 3a). Also, solvent stabilization, which is currently expensive to compute, plays a particularly important role for porous crystals.
Another challenge is the de novo prediction of composition. In all the examples given here, the composition of the material was fixed, either to contain a single molecular entity or two molecules in a binary 1:1 cocrystal. In principle, CSP can be applied to an organic cocrystal system with multiple fixed stoichiometries of the components, thus allowing the comparison of lattice energies for different compositions [35] using a convex hull construction. However, this is currently too expensive for routine application. This problem has been tackled for relatively simple inorganic materials using variable-composition algorithms for predicting structures [36] and more complex systems by using "extended modules" to predict new layered materials of varying compositions [37] -but these methods are not directly transferrable to organic crystals where the molecule, with all its independent degrees of freedom is, de facto, the module. This might not, in fact, be such a limitation because our modern understanding of organic synthesis allows complex retrosynthetic analysis. That is, it might be simpler to synthesize a single molecule containing two different functionalities than it is to engineer the cocrystallization of two separate functional molecules, thus raising the twin questions of (i) will the molecules cocrystallize at all, and (ii) if so, in what stoichiometry? In this respect, organic synthesis might be the preferred route to controlling composition, particularly if we can access ESF maps to persuade us that these synthetic labors will yield the desired solid-state function.
A further, difficult question is where to start: that is, how do we choose the molecular building blocks prior to computation? In each of the systems presented here, the building blocks were selected with the intention of encoding the main physical property into the molecule itself. For example, shapepersistent organic cages have intrinsic molecular porosity: as a result, many packing modes lead to porous crystalline solids, even if the nature of that porosity depends strongly on the crystal packing, which is not intuitively predictable (Figures 1a  and 3c). Likewise, molecules such as T2 were chosen to have both structural rigidity and directional hydrogen bonding; that is, they were also "designed" to be porous although, in fact, the first polymorph of T2 to be discovered [30] (T2-a) is by no means the most intuitive, nor the most stable, and it was impossible to foresee without computation that there would be at least three other stable porous polymorphs for T2, one of which is almost half as dense (Figure 3a,b). As a third example, pentacene and azapentacenes were investigated by CSP because they are known candidates for organic electronics. [13a] This is due to the extended p-system and low intramolecular reorganization energy-that is, the energetic cost associated with accepting an electron or hole. However, the charge carrier mobilities in these materials are strongly affected by their crystal packing (Figure 3d), and hence this property cannot be predicted from the molecules in isolation. More broadly, this molecular design strategy could be extended into many other materials: for example, by selecting molecules that contain chromophores for light absorption, or that give rise to magnetic properties. This approach uses human design input to select the building blocks. One can also envisage strategies where that selection step, too, becomes driven by computation. For instance, the core molecular property of interest-let us say light absorption at a given wavelength-could be selected by an algorithmic or high-throughput [38] search prior to the CSP step to produce an initial candidate set of molecular building blocks. More ambitiously, by looping the four steps shown in Figure 1a and selecting for predicted crystals that meet the target property, we have a potential basis for the autonomous design of functional molecular crystals.
Strategies for autonomous design would demand, among other things, a substantial increase in the speed at which we can calculate-or perhaps learn and extrapolate-the ESF maps. This also raises questions about how we interpret the ESF maps. In most of the examples given here, the maps were read by humans in an empirical fashion; for example, we observed low energy "spikes" (sets of low-lying structures on the lattice energy landscape) for T2 and targeted structures within these spikes (Figure 3a). In the future, and certainly for more autonomous, looped strategies, these ESF maps will need to be interpreted by computers. The recent azapentacene study (Figure 3d) gives one indication of how this might be approached in a Boltzmann-like, probabilistic fashion. [13a] Such developments are important because many functional molecules will not give rise to large lattice energy gaps. For porous cages [21] (Figure 2b), these gaps were adventitious, while for T2, [18] low-energy regions on the crystal structure landscape were programmed in by the choice of a strong directional hydrogen bonding functionality-although it should be noted that this intuitive crystal engineering strategy does not succeed for the structural analogue, T1 ( Figure 3a). Thinking more generally, it might be advantageous to dispense with structure directing groups or, as was the case with porous cages, [21] to discover molecules where such energy gaps are intrinsic to the molecule. In the absence of strong structure directing groups, we will need to think in terms of more probabilistic design, [13a] since we will be working with complex maps that may not lend themselves to simple visual interpretations.
The integration of CSP with property predictions to create ESF maps raises a wide range of opportunities for future work. The associated challenges span several research fields including computational chemistry, informatics, retrosynthetic analysis, and areas of computer science such as algorithms and machine learning. In the shorter term, we highlight three areas of materials chemistry where ESF maps might pay early dividends. First, the area of hydrogen-bonded organic frameworks, or HOFs, has grown rapidly in recent years, [39] and there are doubtless many systems that remain to be discovered. Perhaps many of these, like T2, will be susceptible to polymorphism. We propose that ESF maps could have a valuable role to play in the selection of HOF building blocks, and that this approach might be quite general: for example, the experimentally observed porous form of a hydrogen-bonded macrocycle was shown recently to be located on the "leading edge" of the CSP landscape for that molecule. [40] An important target here is the development of methods to calculate or to approximate solvent stabilization across energy-structure landscapes with much lower computational expense. Second, the area of porous organic cages is also burgeoning, [41] and the use of ESF maps and other computational approaches [42] to choose cages for specific applications is a promising direction. Here, front-end methods to select target cages that will be synthesizable and shape-persistent are a necessary precursor. [43] Third, materials for organic electronics gain their function from a combination of their molecular properties and their extended, solid-state properties, and this is a natural area for materials selection using ESF maps; [13a] also, these molecules are frequently rigid in nature, which makes them good targets for CSP.