Toward unzipping the ZIP metal transporters: structure, evolution, and implications on drug discovery against cancer

The Zrt-/Irt-like protein (ZIP) family consists of divalent metal transporters that are ubiquitous in all kingdoms of life. Since the discovery of the first ZIPs in the 1990s, the ZIP family has expanded to contain tens of thousands of members playing key roles in the uptake and homeostasis of life-essential trace elements, primarily zinc, iron and manganese. Some family members are also responsible for the absorption and distribution of toxic metals (particularly cadmium). Their central roles in trace element biology, and their implications in many human diseases, including cancers, have elicited interest across multiple disciplines for potential applications in biomedicine, agriculture and environmental protection. In this review and perspective, selected areas that have progressed rapidly in the last several years, including structural biology, evolution, and drug discovery against cancers, are summarised and commented on. Future research to address the most prominent issues associated with transport and regulation mechanisms is also discussed.


Introduction
Some d-block metal elements, including zinc and several transition metals (iron, manganese, copper, cobalt, and molybdenum), are known to be essential to human health. These trace elements are utilised in oxygen transfer, catalysis, maintaining the stability of macromolecules, regulation of gene expression and cell signaling. To balance their beneficial effects against the potentially detrimental effects of excess, dedicated mechanisms have evolved to maintain systemic and cellular homeostasis of zinc and transition metal ions, in which metal transporters play a central role in controlling metal fluxes across bio-membranes. Among the known metal transporters, the Zrt-/Irt-like protein (ZIP) family is particularly important as these importers are critically involved in the uptake and homeostasis of at least three essential metals: zinc, iron, and manganese.
The founding members of the ZIP family include the zinc-regulated transporters from Saccharomyces cerevisiae (ScZrt1/2) and the iron-regulated transporter from Arabidopsis thaliana (AtIRT1) [1][2][3]. Since 1996, new ZIPs have been identified in species across all kingdoms of life and more than 18 000 entries from 4454 species have been listed under the ZIP family (PF02535) in the Pfam database. Biochemically, the ZIPs transport divalent d-block metal ions into the cytoplasm from either the extracellular space or intracellular organelles/vesicles. Biologically, the ZIPs exert diverse functions, which depend on organ/tissue distribution, intracellular localisation, species, and biochemical properties. In bacteria, the ZIPs function together with the high-affinity zinc ATP-binding cassette (ABC) transporters (ZnuABC in gram-negative bacteria or AdcABC in gram-positive bacteria) to import zinc ions from the environment [4]. In fungi, the lack of ZnuABC orthologues makes the plasma membrane ZIPs critical for growth and proliferation under zinc-limiting conditions [5]. Plant ZIPs participate in iron, zinc, and manganese uptake from soil and distribution throughout the whole plant body [6]. The root-expressed AtIRT1 is a primary Fe²⁺ importer, but is also largely responsible for Cd²⁺ absorption, particularly under iron depletion conditions [7,8]. A total of fourteen ZIPs (ZIP1-14) have been identified in the human genome. Most of the human ZIPs play key roles in systemic and cellular zinc homeostasis and intracellular zinc signaling, whereas ZIP8 and ZIP14 transport multiple metal substrates and have been proposed to participate in the homeostasis of zinc, iron and manganese [9-13]. ZIP8 has also been shown to be responsible for Cd²⁺ absorption and toxicity [14][15][16]. Overviews of the ZIP family and its involvement in physiological and pathological processes are available in previous reviews [17][18][19][20][21][22].
This review and perspective focuses on selected topics in ZIP research under rapid progress in recent years, as well as on the discussion of the most prominent issues to be addressed in future studies, particularly those at molecular level.

Structure
According to the Pfam database, most of the ZIP family members have a simple domain architecture in which a transmembrane domain (TMD) is preceded by an N-terminal soluble domain (or an unstructured fragment) and followed by a short C-terminal sequence. It has long been predicted that the TMD, which is conserved in the entire ZIP family, has eight transmembrane helices (TM, also referred to as 'α') with both termini exposed to the extracellular space for plasma membrane ZIPs or to the vesicular lumen for organellar ZIPs [17]. This topology has been confirmed by the crystal structure of a bacterial ZIP from Bordetella bronchiseptica (BbZIP) [23]. As shown in Fig. 1, human ZIP4 (HsZIP4), a relatively well-characterised eukaryotic ZIP, contains several key structural domains/elements, which are discussed under the following subtitles.
Among these, the TMD is the core domain which directly mediates metal transport across lipid bilayers, whereas the N-terminal extracellular domain (ECD) and the intracellular loop connecting α3 and α4 (referred to as intracellular loop 2, IL2) play regulatory roles.
The TMD: the core of the metal transport machinery

An 8-TM architecture with internal symmetry

The crystal structure of BbZIP reveals that the eight TMs form a tilted helix bundle composed of two circles: α2, α4, α5 and α7 form the transport pathway, whereas α1, α3, α6 and α8 form an outer circle facing membrane lipids (Fig. 2). Two pairs of inverted repeats have been revealed by the structure. When α1-α3 are rotated by 180 degrees about an axis parallel to the membrane surface, they can be superimposed on α6-α8. α4 and α5 can also be related through the same operation. Inverted repeats have been observed in many membrane proteins, including some channels and secondary transporters (which actively transport the primary substrate uphill by energetically coupling it to downhill transport of the secondary substrate), and the repeated units are thought to be the result of gene duplication [24]. Taking advantage of the inverted repeats, researchers have been able to generate alternative conformations for secondary transporters by using repeat-swap homology modeling [25][26][27][28]. Applying this computational approach to ZIP structures may help to understand the transport mechanism and conformational plasticity/dynamics.
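The symmetry operation behind an inverted repeat can be illustrated with a minimal sketch. It is not taken from the BbZIP structure: the coordinates are toy values, the membrane normal is assumed to lie along z, the two-fold axis is assumed to lie along x, and the function names are hypothetical. For a half-turn about a unit axis u, the rotation reduces to p' = 2(u·p)u − p.

```python
import math

def rotate_180(point, axis):
    """Rotate a 3D point by 180 degrees about an axis through the origin.

    For a half-turn, Rodrigues' formula reduces to p' = 2 (u . p) u - p,
    where u is the unit vector along the axis.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    u = [a / norm for a in axis]
    dot = sum(ui * pi for ui, pi in zip(u, point))
    return tuple(2.0 * dot * ui - pi for ui, pi in zip(u, point))

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists."""
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))

# Toy coordinates in angstroms (hypothetical, not real atoms).
# z is the assumed membrane normal; the membrane midplane is z = 0.
repeat_a = [(2.0, 1.0, -6.0), (2.5, 1.5, 0.0), (3.0, 2.0, 6.0)]
repeat_b = [(2.0, -1.0, 6.0), (2.5, -1.5, 0.0), (3.0, -2.0, -6.0)]

# Assumed two-fold axis lying in the membrane plane.
axis_in_membrane = (1.0, 0.0, 0.0)

rotated_a = [rotate_180(p, axis_in_membrane) for p in repeat_a]

print("RMSD before rotation:", round(rmsd(repeat_a, repeat_b), 2))
print("RMSD after rotation: ", round(rmsd(rotated_a, repeat_b), 2))
```

Because the axis lies in the membrane plane, the rotation flips the z coordinate, so the rotated repeat crosses the membrane in the opposite direction. This is what makes the repeat "inverted" rather than simply duplicated.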
A binuclear metal centre (BMC) in the middle of the transport pathway

The crystal structure of BbZIP unexpectedly revealed two bridged metal binding sites (Fig. 2, referred to as M1 and M2, respectively) composed of metal-chelating residues primarily from α4 and α5 [23]. A BMC is often observed at the active site of metalloenzymes or in the soluble portion of metal transporters [29,30], but rarely in the middle of a transport pathway. A survey of over 17 000 ZIPs indicated that a BMC is present in many ZIPs from a variety of species, suggestive of an important role in ZIP function [31]. Mutagenesis and transport assays of HsZIP4 showed that M1 is absolutely required for transport, whereas eliminating M2 or occupying M2 with a lysine residue only reduced but did not completely abolish transport activity. Eliminating M2 of HsZIP4 did not affect substrate preference either. Instead, M2 appears to be involved in maintaining the activity of HsZIP4 over a broad pH range [31]. BbZIP crystal-soaking experiments further demonstrated that, compared with M2, M1 is more accessible to solvent and has a higher affinity toward the metal substrate. Taken together, these data support the notion that M1 functions as the authentic transport site whereas M2 may play certain auxiliary roles, presumably by modulating the metal-binding properties of M1 and/or acting as a dispensable secondary transport site for higher transport capacity. The essential role of M1 in other ZIPs has also been reported [32][33][34][35]. In some ZIPs, a lysine residue replaces a metal-chelating residue at M2, which likely prevents binding of the second metal ion. A study on ZIP2 showed that the lysine residue occupying M2 is critical for transport activity [34]. Mutation of this residue led to a drastically changed pKa of a neighbouring glutamate residue in M1.
Functional characterisation of the M2 in other ZIPs will likely provide additional insights into the molecular mechanism of functional diversity of the ZIP family.

Metal transport pathway
A metal transport pathway can be visualised in the BbZIP structures where S106 on α2 appears to be a gating residue at the pore entrance (Fig. 3) [23]. This position is occupied by a histidine residue in many eukaryotic ZIPs, whose importance has been demonstrated by mutagenesis studies [23,35]. In AtIRT1, metal-chelating residues at the pore entrance, particularly those from the extracellular loop between α2 and α3, have been shown to play a role in determining substrate specificity and are therefore thought to be involved in initial metal-binding [33]. Between S106 and the BMC, several hydrophobic residues form a tight seal preventing metal ions (naked or hydrated) from passing through.

Fig. 1. Structural model of human ZIP4. The ECD and the TMD were built by homology modeling using the SWISS-MODEL server. The orientation of the ECD relative to the TMD was chosen to allow the ECD dimer and the TMD dimer to share the same pseudo two-fold axis perpendicular to the membrane. Dimerisation of the TMD is adapted from a proposed computational model (ref. [38]). The protein is depicted in cartoon and surface mode with one monomer in green and the other in blue. The domains (ECD, HRD, PCD and TMD) in the green molecule and the IL2 in the blue molecule are labeled. The red arrows indicate the flow of zinc ions through the proposed transport pathway where the dashed line indicates the blocked route at the extracellular side in the inward-facing conformation. The black circles indicate the binuclear metal center (BMC) in the TMD which is occupied by two metal ions (red balls). The histidine-rich segments in the ECD or in the IL2 are indicated by 'HHH' and the 'LQL' motif essential for ZIP4 endocytosis in the IL2 is shown as well. The electrochemical gradient of zinc ions across the plasma membrane (gray area) is indicated by the red wedge.
To allow a charged metal substrate to enter and reach the BMC across a distance of more than 8 Å in a hydrophobic environment, a global conformational change (or substantial large-scale dynamics) is required. In contrast, the route of metal release from the BMC to the other side of the membrane is wide open and filled with water. Along the metal release route, metal ions are found coordinated with conserved metal-chelating residues (Fig. 3). These residues, together with the BMC, form a metal relay rarely seen in other metal transporters. A histidine-rich segment located in the IL2 was not resolved in the crystal structures, but its location suggests that it may participate in metal release. The exact functions of the histidine residues in the IL2 and the metal-binding sites along the metal release route are yet to be clarified.

Inward-facing conformation and implications on transport mechanism
Given that the BMC is exposed to the cytoplasm but inaccessible from the extracellular space, the conformation observed in the crystal structures of BbZIP has been referred to as the inward-facing conformation (IFC). As BbZIP was crystallised in lipidic cubic phase, which forms a continuous bilayer and is as such a better mimic of the biological membrane than detergent micelles, the blocked transport pathway at the extracellular side likely represents a native or native-like state. One unsolved challenge critically relevant to the transport mechanism is how a ZIP transporter changes conformation to expose the BMC to the extracellular space. Two modes can be proposed (Fig. 4). In the first mode, when the BMC accepts metal substrate from the extracellular space, the route for metal release is concomitantly closed by interactions between residues that were well apart in the IFC. This mode aligns with the alternating access mechanism utilised by carriers, which alternately expose the transport site to either side of the membrane but never allow it to be accessible from both sides at the same time. In the second mode, through certain significant dynamics, the BMC is transiently exposed to both sides so that it is located in the middle of a pore with both ends open simultaneously, a characteristic of channels. Cell-based metal transport assays using either radioactive substrates or metal-responsive fluorescent dyes on ZIPs ranging from E. coli to multiple eukaryotes have shown Michaelis-Menten kinetics [1,3,23,31,32,36-41], supporting the carrier mode. However, the study of BbZIP reconstituted in proteoliposomes indicated that the transport is non-saturable, which supports the channel mode [42].
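The kinetic distinction drawn above can be made concrete with a small sketch comparing a saturable (Michaelis-Menten) rate law with a non-saturable, linear one. All parameter values (Vmax, Km, the linear rate constant and the substrate concentrations) are hypothetical and chosen only for illustration; they are not fitted to any ZIP data.

```python
def michaelis_menten(substrate, vmax, km):
    """Saturable transport rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)

def non_saturable(substrate, k):
    """Non-saturable transport rate: v = k * [S] (no plateau)."""
    return k * substrate

def fold_increase(rate_fn, s_low, s_high):
    """How much the rate rises when the substrate goes from s_low to s_high."""
    return rate_fn(s_high) / rate_fn(s_low)

# Hypothetical parameters (arbitrary units).
VMAX, KM, K_LINEAR = 100.0, 5.0, 2.0
S_LOW, S_HIGH = 50.0, 500.0  # both well above the assumed Km

mm_fold = fold_increase(lambda s: michaelis_menten(s, VMAX, KM), S_LOW, S_HIGH)
linear_fold = fold_increase(lambda s: non_saturable(s, K_LINEAR), S_LOW, S_HIGH)

print(f"10x more substrate, saturable model:     {mm_fold:.2f}-fold rate")
print(f"10x more substrate, non-saturable model: {linear_fold:.2f}-fold rate")
```

One caveat follows directly from the rate law: when the substrate concentration is far below Km, the Michaelis-Menten expression is itself nearly linear, so the two models can only be told apart if the tested concentration range reaches the region where saturation would appear.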
The proteoliposome study also showed that the transport is a passive and electrogenic process, which is consistent with the channel mode but would also be compatible with the carrier mode if the transporter conducts facilitated diffusion down the electrochemical gradient. Nevertheless, the barrier between the extracellular entrance and the BMC must be removed in either mode, which necessitates structure determination of alternative conformation(s) to establish a complete transport cycle. Although uncommon, members of the same transporter family may utilise distinct transport mechanisms. The known examples include the chloride channel family, in which most of the family members are chloride ion channels whereas the rest function as Cl⁻/H⁺ antiporters, mediating Cl⁻ flow against its gradient [43], and the cystic fibrosis transmembrane conductance regulator (CFTR), which belongs to the ABC transporter superfamily and is the only ABC transporter known to function as a passive chloride ion channel rather than an active carrier. Although the structure of human CFTR is very similar to those of the other ABC transporters [44], it has been proposed that the cytoplasmic-side gate is broken in CFTR so that the transport pathway has both ends open when the protein is in a conformation equivalent to the outward-facing conformation (OFC) of a classical ABC transporter [45,46]. Can this happen in the ZIP family, so that while most of the family members utilise the alternating access mechanism, the other family members have an open metal release route in the OFC? Structures in different functional states would provide key evidence to prove or disprove the proposed modes.
Another fundamental question related to the transport mechanism is what driving forces power the ZIPs to move metal cations across the membrane. Different and even contradictory mechanisms have been proposed for different family members. It has been suggested that ZIP2, ZIP8 and ZIP14 co-transport zinc ions with bicarbonate, because metal transport via these ZIPs is accelerated by the addition of bicarbonate [47-49] and blocked by an anion transport inhibitor (4,4′-diisothiocyanatostilbene-2,2′-disulfonic acid, DIDS) [48,49]. As ZIP8- and ZIP14-mediated transport is independent of the electrical gradient across the plasma membrane, the substrate is believed to be an electroneutral complex (M²⁺ : HCO₃⁻ = 1 : 2) [50,51]. Recent studies of ZIP2, however, showed that the transporter prefers a low pH, which is not consistent with bicarbonate co-transport, although it does not co-transport protons with zinc ions either [34,36]. ZIP4 was shown to be a Zn²⁺/H⁺ symporter [52], which is not supported by a recent work showing that ZIP4 transport activity is not affected by the pH of the extracellular environment [31]. For BbZIP, a recent study suggested that partially hydrated metal ions may be the entity being transported and that water molecules play important roles in facilitating metal release from the BMC [53]. This claim is primarily based on the observation that residues seemingly buried in the IFC can be readily labeled by water-derived hydroxyl radicals. Accordingly, a water-filled channel should be present and it should also be wide enough to allow passage of partially hydrated metal ions [53]. Alternatively, this result may support the presence of alternative conformation(s). The observed unchanged water accessibility of the residues in the middle of the transport pathway (such as M99) upon zinc binding may suggest that the solvent exposure of these residues is independent of the conformational state of the transporter (Fig. 4).

The ECD: a variable accessory domain
The N-terminal extracellular portion (or luminal portion for vesicular ZIPs) is highly variable. Many ZIPs, particularly those from prokaryotes, only have a short, less conserved and often unstructured segment, whereas some eukaryotic ZIPs have evolved a large and folded ECD conserved in certain subfamilies. One example of the latter is ZIP4, which is a representative LIV-1 subfamily member essential for zinc absorption from the diet. Two reasons make ZIP4-ECD attractive for research: firstly, half of the mutations causing a life-threatening autosomal recessive disorder, Acrodermatitis enteropathica, occur in this domain [54]; secondly, ZIP4-ECD is cleaved from the full-length protein upon zinc deficiency, implying a role in zinc homeostasis-related regulation [55]. Mutagenesis and transport assays have shown that deletion of ZIP4-ECD resulted in a 75% loss of transport activity [38], indicating that ZIP4-ECD is a functionally important accessory domain.
The crystal structure of the ECD of ZIP4 from Pteropus alecto (black fruit bat, pZIP4-ECD) provides critical structural insights into this domain [38]. The purified pZIP4-ECD forms a homodimer in solution, which was confirmed by the crystal structure where two structurally independent subdomains were identified in each monomer. The intertwined dimerisation is almost exclusively mediated by the C-terminal 'PAL' motif-containing domain (PCD) where the 'PAL' motif, which is universally conserved in ZIP4 orthologues, is located at the centre of the dimerisation interface. Sequence comparison of human LIV-1 proteins showed that the PCD is also present in the other LIV-1 subfamily members (ZIP5/6/8/10/12/14, except for ZIP7 and ZIP13), and therefore the 'PAL' motif may similarly mediate dimerisation in these ZIPs as well, which was supported by cysteine-crosslinking of ZIP14-ECD [38]. According to the ZIP4 structural model shown in Fig. 1, a histidine-rich loop in the PCD, which was not resolved in the crystal structure due to high flexibility, appears to be just above the entrance of the transport pathway in the TMD. This unstructured and highly dynamic loop binds two zinc ions with micromolar affinity [56]. Replacing the histidine residues with alanine resulted in a modest decrease of zinc transport activity, indicating that this loop plays a role in transport. A topologically equivalent histidine-rich loop is also present in ZIP6 and ZIP10 with many more histidine residues, but their contribution to zinc transport is unknown. The N-terminal subdomain of ZIP4-ECD, the helix-rich domain (HRD), is barely involved in dimerisation. The 'linker' or loop connecting the HRD and the PCD adopts distinct conformations in the two monomers, leading to different orientations of the HRD relative to the PCD in the crystal structure. In the ZIP4 structural model (Fig. 1), it seems that the HRD is hanging over the membrane surface and would have little chance to directly interact with the TMD. However, if the flexibility of the linker connecting the two subdomains is taken into account, such an interaction is still possible. Deletion of the HRD resulted in a 50% reduction of transport activity, but how the HRD exerts its function is still unclear.

Fig. 4. Conformational switches and dynamics. The transporter is depicted as two pieces of blue blocks, and metal ions and water molecules are drawn as black and red balls, respectively. In the IFC (left), the BMC (two neighbouring black balls) is exposed to the cytoplasm (IN), whereas the route from the BMC to the extracellular space (OUT) is blocked by close interactions (drawn as red crosses) which must be broken in the alternative conformation(s) (right) to allow completion of a transport cycle. In the carrier mode, the transporter undergoes a global conformational change to expose the BMC to the extracellular space and the route to the cytoplasm is simultaneously blocked by newly formed close interactions. In the channel mode, the transporter experiences certain large dynamics so that the BMC is transiently accessible from both the extracellular space and the cytoplasm. In either mode, the residues in the middle of the pathway (such as M99) may be accessible to solvent in both the IFC and the hypothetical alternative conformations.
The origins of the HRD and the PCD are unknown. BLAST searches show that these domains are only present in animals and emerged approximately 680 million years ago when ParaHoxozoa diverged from Eumetazoa. The HRD and the PCD appear to have emerged at the same time, because Trichoplax sp. H2, which is a basal multicellular animal lacking organs or internal structures and one of the earliest species expressing LIV-1 ZIPs [57], possesses orthologues of ZIP7/13 (which have neither the HRD nor the PCD), ZIP10/14 (with the PCD only) and ZIP12 (with both the HRD and the PCD). Clarification of the origins of these accessory domains may help to better understand their functions.
Eukaryotic ZIPs may have distinct N-terminal ECDs. ZIP7 is an endoplasmic reticulum (ER)-residing ZIP and does not have a PCD-like subdomain or the invariant disulfide-forming cysteine residues observed in ZIP4-ECD. Rather, it contains tens of clustered histidine residues in its long and probably unstructured ECD. The percentage of histidine residues in the ECD of human ZIP7 is >31% [58], which is several times higher than in any other human ZIP. This feature is conserved in ZIP7 orthologues from animals and plants, but not in YKE4 from Saccharomyces cerevisiae, a functional equivalent of mouse ZIP7 [59]. Given the putative presence of multiple zinc-binding sites, it has been proposed that ZIP7-ECD, which is exposed to the ER lumen, may play a role in maintaining zinc homeostasis in the ER by buffering zinc [58]. A fraction of fungal ZIPs has a large ECD. In Candida albicans, an opportunistic pathogen, Zrt1 interacts with a periplasmic protein Pra1 to scavenge zinc ions from the environment, which is crucial for pathogen survival in the host [60]. This interaction, which has been proposed to be mediated through Zrt1's ECD containing a number of histidine and cysteine residues, seems to be a unique mechanism of zinc acquisition for some fungi.
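The histidine content quoted above for the ZIP7 ECD is a simple residue fraction. A minimal sketch of that calculation is given below; the function name and the toy segment are hypothetical, and the toy string is not the ZIP7 sequence.

```python
def histidine_fraction(sequence: str) -> float:
    """Return the fraction of residues that are histidine (one-letter code H)."""
    # Keep letters only, so whitespace or line breaks in the input are ignored.
    residues = [r for r in sequence.upper() if r.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    return residues.count("H") / len(residues)

# Toy histidine-rich segment (hypothetical, for illustration only).
toy_segment = "HHHGAHSHHH"
print(f"Histidine content: {histidine_fraction(toy_segment):.0%}")
```

Applied to a real ECD sequence retrieved from a database such as UniProt, the same function would give the percentage that the text compares across human ZIPs.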
The IL2: a highly variable cytoplasmic loop acting as a multifunctional regulatory hub

The presence of a long and histidine-rich IL2 appears to be common in the ZIP family, although sequence identity of the IL2 is very low, even among members of the same subfamily or close orthologues from different species. The IL2 is disordered in the crystal structure of BbZIP, which is in line with the structure prediction that this segment is intrinsically disordered. Direct zinc binding to the IL2 of BbZIP was evidenced by the drastically reduced rate of labeling by hydroxyl radicals in the presence of submillimolar zinc ions [53]. The IL2 of HsZIP4 has been extensively studied. The isolated IL2 peptide of HsZIP4 is unstructured as revealed in an NMR study [61] and the highly dynamic nature of the IL2 was also manifested by partial proteolysis of purified full-length HsZIP4 [62]. The histidine residues in the IL2 bind zinc ions with nanomolar affinity, and zinc titration in the NMR study showed that, although the IL2 binds two zinc ions, there is no defined manner of zinc-binding, suggesting that the zinc-binding sites are highly dynamic and undergo a fast equilibrium among different conformers [61]. These results support the notion that the IL2 functions as a zinc sensor, which has been shown to be essential for zinc-induced endocytic degradation when cells are exposed to tens to hundreds of micromolar extracellular zinc. In this post-translational regulation mechanism, increased cytosolic zinc ions saturate the IL2 zinc-binding sites, which triggers ubiquitination of the only lysine residue in the IL2 for later proteasomal degradation. A different mechanism has been proposed for AtIRT1, where the clustered histidine residues in the IL2 bind non-substrate metals when the plant faces a high concentration of non-iron transition metal ions.
Then, the IL2 is phosphorylated by a protein kinase CIPK23, which facilitates poly-ubiquitination of two lysine residues in the IL2 required for endosomal sorting and vacuolar degradation [63]. Other than endocytic degradation, zinc ions at a low micromolar concentration lead to a zinc-dependent endocytosis of HsZIP4 without causing ubiquitination-dependent degradation. In this alternative post-translational regulation mechanism, the transport site functions as the zinc sensor and a putative global conformational switch induced by zinc-binding at the BMC alters the conformational state of the 'LQL' motif in the IL2 so that it can be recognised by the endocytic machinery [62]. As such, HsZIP4 acts as a transceptor, which exerts both transport and sensing functions. As a matter of fact, the term transceptor has been applied to other ZIPs, including AtIRT1 [63] and ScZrt1 [64], indicating that having a sensing function is a common feature for many, if not all, ZIPs, just like many nutrient transporters [65]. For mouse ZIP4, a conserved 'HxH' motif (x means any residue) in the extracellular loop connecting α2 and α3 is also involved in zinc-sensing of zinc-induced endocytosis [66]. Although the 'LQL' motif is highly conserved in ZIP4 orthologues, it is only present in ZIP12, a close homologue of ZIP4, but not in other ZIPs, indicating that this mechanism is specific for ZIP4 and probably ZIP12 as well. Indeed, ZIP1 utilises a canonical dileucine motif in the IL2 for endocytosis [67], whereas the similar sequence in the IL2 of HsZIP4 does not play such a role [62]. The IL2 also participates in activity regulation. Phosphorylation of the serine residues (S275 and S276) in the IL2 of human ZIP7 drastically increases transport activity so that zinc ions are released from the ER to act as signaling molecules [68]. 
Remarkably, many phosphorylation sites identified in proteomics studies are located in the IL2 of different human ZIPs [69], but the biological significance is yet to be clarified. A recent study showed that transport activity of AtIRT1 is inhibited by a peripheral membrane protein EHB1 through the interaction with the IL2 [70]. Collectively, with the diverse functions and multiple post-translational modifications, the IL2 surely acts as a key element regulating trafficking, degradation, and transport activity of the ZIPs. One unsolved issue is the structural and biochemical basis of the above-mentioned mechanisms, which is crucial for better understanding the molecular underpinnings of ZIP regulation.

Dimerisation
There is a consensus that the ZIPs form homodimers or heteromers, which is supported by studies of bacterial and human ZIPs [42,71-73]. Unexpectedly, the crystal structure of BbZIP only revealed a monomeric state, which seems to be the result of a fast equilibrium between monomer and dimer in detergent micelles [23]. Although the ZIP4-ECD structure already indicated that the ZIP4 dimer is the basal functional unit, the structure of a dimeric TMD is still lacking. A computation-based ZIP4 dimerisation mode has been proposed [38] and adapted to generate the structural model shown in Fig. 1, but it needs to be further verified biochemically and structurally. In addition, the structural and functional significance of dimerisation is unknown. For carriers utilising the elevator mode, which is one of the commonly used alternating access modes [74], the way of oligomerisation is a key factor determining how the transporter switches between conformational states within one transport cycle.

Evolution
On average, prokaryotes only possess a couple of ZIP genes per genome, whereas eukaryotes have many more; in higher species, the number of ZIP genes per genome can be more than ten or even close to twenty. The large number of ZIP genes reflects functional diversity in eukaryotic species, but the exact functions of each family member in a specific species are far from being established. Establishing them can be facilitated by clarifying the evolutionary relationships among family members.
An early work divided the ZIP family into two subfamilies: subfamily I including ZIPs from plants and fungi, and subfamily II including ZIPs from mammals and nematodes [17]. It was then proposed that eukaryotic ZIPs should be classified into four subfamilies, including the previous two subfamilies and two additional ones [75]. Applying this classification to the fourteen human ZIPs, HsZIP9 belongs to the ZIPI subfamily, HsZIP1/2/3 are in the ZIPII subfamily and HsZIP11 is in the GufA subfamily, whereas the LIV-1 subfamily contains most of the human ZIPs, namely HsZIP4/5/6/7/8/10/12/13/14. It is notable that GufA is named after a prokaryotic protein with unknown function in Myxococcus xanthus (MxGufA) [75]. A protein similarity network analysis of the ZIP family found that HsZIP11 is within the same group as several bacterial ZIPs, including MxGufA [76]. A later study on murine ZIP11 confirmed its zinc-transporting function, and the phylogenetic analysis conducted in the same report also demonstrated the close relationship of multiple eukaryotic ZIP11 orthologues with prokaryotic ZupT proteins [77]. Therefore, ZIP11 appears to have a direct prokaryotic ancestor, which is further supported by a recent analysis of the ZIPs from representative species [78]. The authors found that there is only one orthologue group consisting of members from prokaryotes, fungi and metazoa. Two eukaryotic ZIPs in this group, HsZIP11 and a vacuole-located Zrt3 from Candida albicans, share a low sequence identity but are closer to two groups of prokaryotic ZIPs, respectively, suggesting that ZIP11 and Zrt3 may have evolved from two distinct subfamilies of prokaryotic ZIPs.
To further explore the diversity of prokaryotic ZIPs and the relationship between prokaryotic and eukaryotic ZIPs, a new phylogenetic analysis including 199 ZIPs from all kingdoms of life was conducted in this work. The sequences were initially retrieved from the ZIP family (PF02535) in the Pfam database and cross-referenced in the UniProt database. This collection includes 77 eukaryotic ZIPs from Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Oryza sativa, Saccharomyces cerevisiae and Candida albicans, and 122 prokaryotic ZIPs randomly selected from the Pfam database. Although the prokaryotic ZIPs account for less than 25% of total ZIPs in the PF02535 family, the overrepresentation of prokaryotic ZIPs in this analysis puts more weight on these ZIPs, which were not thoroughly analysed in the previous phylogenetic studies.
As shown in the unrooted circular phylogenetic tree (Fig. 5), the ZIPs involved in this analysis can be put into three regions, primarily for convenience of discussion. Although the branches in Region III are well separated from those in the other two, the boundary between Region I and Region II is vague as some small branches can be assigned to either one of them. Below, the three regions are discussed separately and the corresponding phylogenetic trees are shown in Figs 6-8.
The majority of the ZIPs in Region I are prokaryotic (Fig. 6), including ZupT from E. coli (EcZupT), which has been established as a divalent metal transporter with broad substrate specificity [32,79], and MxGufA, the founding member of the GufA subfamily, whose function is unknown. EcZupT and MxGufA are in distinct branches in which a couple of eukaryotic ZIPs also reside. ZTP29 from Arabidopsis thaliana (AtZTP29) is in the same branch as EcZupT and several other prokaryotic ZIPs. A previous study has shown that AtZTP29, an ER-residing zinc transporter involved in relieving ER stress [80], has close orthologues in plants but is distant from other ZIPs of A. thaliana [81], supporting the finding that AtZTP29 orthologues form a subfamily orthologous to the ZupT subfamily. The GufA branch can be further divided into two sub-branches, one of which contains MxGufA and BbZIP. It has been shown that BbZIP is a functional Zn²⁺ and Cd²⁺ transporter [42]. HsZIP11, together with two fly and worm orthologues, is in the other sub-branch, which contains additional prokaryotic ZIPs. In some bacteria, ZupT and GufA co-exist in the same species, implying that they may have different biological functions [78]. When the amino acid compositions at the BMC are compared between ZupT and GufA, the ZupTs are found to share two almost invariant metal-binding motifs, 'HNFPEG' in α4 and 'HNIPEG' in α5 (Fig. 6), whereas for the GufAs the corresponding motifs are 'HNhPEG' (where h is a hydrophobic residue) and 'QNhPEG', respectively. A clear distinguishing feature is that the histidine residue in the second motif of the ZupTs is replaced, without exception, by a glutamine residue in the GufA proteins. If α4 and α5 are indeed derived from gene duplication (as discussed in the Structure section), the ZupT subfamily, with two almost identical motifs, may represent a more ancient version than the GufA subfamily.
The ZupT branch is adjoined by a group of prokaryotic ZIPs in a branch further away from the GufA branch. None of these proteins has been functionally characterised, but the two motifs at the BMC are almost the same as those in the ZupTs, so this group is referred to as 'ZupT-like'. Besides ZIP11 and AtZTP29, an additional eukaryotic ZIP in Region I is zinc transporter 50 (ZTP50), a putative zinc transporter with an unknown function. ZTP50 and a prokaryotic ZIP form a branch distinct from both the ZupT and the GufA subfamilies. As the motif in α4, 'HSF(G/A)EG' (the fourth position is occupied by either G or A), is very different from that in either the ZupT or the GufA subfamily, we tend to assign this small group as an independent subfamily. Consistently, ZTP50 and several other algal ZIPs are shown to form a cluster close to but distinguishable from the GufA-containing cluster [76]. Due to low sequence identity to any subfamily, five prokaryotic ZIPs in Region I are categorised as unclassified.
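The motif notation used for Region I can be read as a simple pattern language. The sketch below is an illustration only, not the procedure used in this work: it translates the notation into regular expressions and tests a pair of α4/α5 segments against the ZupT and GufA motifs quoted above. The set of residues treated as hydrophobic (h) is an assumption, as are all function names and the example segments.

```python
import re

# Assumption: the text does not list which residues count as "hydrophobic" (h);
# this set is an illustrative choice only.
HYDROPHOBIC = "AVILMFW"


def motif_to_regex(motif: str) -> str:
    """Translate the motif notation into a regular expression.

    Upper-case letters are literal residues, 'x' is any residue, 'h' is a
    hydrophobic residue, and '(G/A)' lists alternatives at one position.
    """
    out = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "(":
            end = motif.index(")", i)
            options = motif[i + 1:end].split("/")
            out.append("[" + "".join(options) + "]")
            i = end + 1
            continue
        if ch == "x":
            out.append(".")
        elif ch == "h":
            out.append("[" + HYDROPHOBIC + "]")
        else:
            out.append(ch)
        i += 1
    return "".join(out)


# Motif pairs (alpha4, alpha5) as quoted in the text for Region I.
REGION_I_MOTIFS = {
    "ZupT": ("HNFPEG", "HNIPEG"),
    "GufA": ("HNhPEG", "QNhPEG"),
}


def matching_subfamilies(alpha4: str, alpha5: str) -> list:
    """Return the subfamilies whose two motifs both match the given segments."""
    hits = []
    for name, (m4, m5) in REGION_I_MOTIFS.items():
        ok4 = re.fullmatch(motif_to_regex(m4), alpha4) is not None
        ok5 = re.fullmatch(motif_to_regex(m5), alpha5) is not None
        if ok4 and ok5:
            hits.append(name)
    return hits


# Hypothetical segments, chosen only to exercise the patterns.
print(matching_subfamilies("HNFPEG", "HNIPEG"))  # ['ZupT']
print(matching_subfamilies("HNLPEG", "QNIPEG"))  # ['GufA']
```

Because the ZupT α4 motif also satisfies the more permissive GufA α4 pattern, it is the first residue of the α5 motif (H versus Q) that separates the two groups in this sketch, in line with the histidine-to-glutamine substitution noted above.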
Most of the eukaryotic ZIPs in Region II are within the LIV-1 subfamily, where plasma membrane ZIPs and vesicular ZIPs are located in two separate branches (Fig. 7). The human plasma membrane LIV-1 members (HsZIP4/5/6/8/10/12/14) can be further divided into three subgroups, partially based on the sequence and domain layout of the ECD [38]: the ECDs of HsZIP4 and HsZIP12 contain both the HRD and the PCD; HsZIP5, HsZIP6 and HsZIP10 have no HRD but a PCD containing a long (89-232 residues), presumably unstructured loop, histidine-rich in ZIP6 and ZIP10, which is supposed to hang over the TMD (Fig. 1); HsZIP8 and HsZIP14 also have only a PCD in the ECD, but the unstructured loop is short and lacks histidine residues. Consistent with this classification, ZIP4/12, ZIP5/6/10 and ZIP8/14 are classified in three Ohnolog families, which suggests that the members within each group are the results of genome duplication in vertebrates [82]. Two human intracellular LIV-1 proteins, HsZIP7 and HsZIP13, which reside in the ER and the Golgi complex, respectively, are in the vesicular ZIP branch, together with the yeast ER-residing YKE4. Another noteworthy eukaryotic ZIP is IAA-alanine resistance protein 1 (IAR1) from A. thaliana, which is in a sub-branch adjacent to the vesicular ZIP branch. IAR1 is an ER-residing metal transporter and its function can be substituted by mouse ZIP7, so IAR1 should also be classified as a vesicular LIV-1 protein [83]. One notable difference between the plasma membrane LIV-1 proteins and their intracellular counterparts is that the ECD of the former contains a PCD that mediates dimerisation, whereas the ECD of the latter does not. Sequence analysis of the LIV-1 proteins reveals that the two motifs at the BMC are highly conserved, with 'HNFxDG' in α4 and 'HExPHE' in α5. Both motifs can be further expanded to 'DGxHNFxDG' and 'HExPHExGD', which are unique to the LIV-1 subfamily among all the eukaryotic ZIPs.
The latter motif was referred to as the zinc-dependent metalloprotease motif when the LIV-1 subfamily was discovered [84,85]. Two significant exceptions are HsZIP8 and HsZIP14, where the first histidine residue in the second motif is replaced by a glutamate. Given that HsZIP8 and HsZIP14 are known multi-metal transporters, it is likely that this substitution is linked to distinct substrate specificity. A recent study showed that mutations of metal-chelating residues in the BMC altered the substrate preference of ZIP13 in Drosophila melanogaster [86]. Remarkably, a group of prokaryotic ZIPs is in the same branch as the eukaryotic LIV-1 proteins. These bacterial ZIPs have motifs at the BMC very similar to those seen in the LIV-1 subfamily, 'HNFxDG' in α4 and 'HExP(Q/H)E' in α5, which can also be expanded to 'DGxHNFxDG' and 'HExP(Q/H)ExGD'. Although none of these prokaryotic ZIPs has been functionally studied, given the overall sequence similarity and almost identical signature motifs, this group is named 'LIV1-like' proteins. In Region II, some other prokaryotic ZIPs are evolutionarily distant from the LIV-1 or the LIV1-like proteins. They also do not share conserved residues in the BMC, suggesting that they may be composed of multiple smaller groups with distinct sequence and functional features. Therefore, these ZIPs are grouped as unclassified in this work. Two fungal Zrt3 proteins are also not close to either the LIV-1 or the LIV1-like subfamily. Zrt3 from Saccharomyces cerevisiae (ScZrt3) is a vacuole-localised zinc transporter responsible for zinc release from vacuolar storage to the cytoplasm [87]. Although ScZrt3 is a proven vesicular ZIP, it has fairly low sequence similarity with the LIV-1 intracellular ZIPs, and its motifs at the BMC, 'HKFPEG' in α4 and 'HNFVEG' in α5, are also not similar to those in the LIV-1 subfamily. Accordingly, these two fungal ZIPs should be classified as an independent subfamily.
Fig. 6. Phylogenetic tree of the ZIPs in Region I. The ZIPs discussed in the main text are labeled. BbZIP is colored in blue, and the other ZIPs are colored in the same manner as in Fig. 5. For each subfamily, the HMM logo of the metal binding motifs in α4 and α5 was generated with Skylign (https://skylign.org/). Some prokaryotic ZIPs are within the category of unclassified (Un). AtZTP50 and a prokaryotic ZIP (indicated by black dots) cannot be assigned to any subfamily. The branch lengths are labeled on top of each branch.
Fig. 7. Phylogenetic tree of the ZIPs in Region II. ScZrt3 and its orthologue in Candida albicans (indicated by black dots) cannot be assigned to any subfamily. Some prokaryotic ZIPs are within the category of unclassified.
Eukaryotic ZIPs in Region III are localised in three branches of the phylogenetic tree (Fig. 8). The founding members of the ZIP family, ScZRT1 and AtIRT1, are in the ZIPI subfamily, where all the members are from plants or fungi. The motifs at the BMC are 'HSxxIG' and 'HQxFEG', respectively, and a highly conserved isoleucine residue in the first motif replaces a carboxylic acid residue (aspartate or glutamate) conserved in other ZIPs, differentiating the ZIPI subfamily from the others. The ZIPII subfamily includes three human ZIPs, HsZIP1/2/3, and their orthologues from plants, worm and insect. The motifs at the BMC are 'H(S/E)xFEG' and 'HKxxxA', respectively. With a positively charged lysine residue at the second position of the motif in α5, the ZIPII proteins are unlikely to have the second metal-binding site. Two prokaryotic ZIPs are in the same branch as the ZIPII subfamily, which is also close to the branch of the ZIPI subfamily. A BLAST search using their sequences as seeds found more bacterial ZIPs (primarily in the phylum Proteobacteria) with conserved motifs at the BMC, 'HSxxxG' in α4 and 'HKxxES' in α5, which appear to have mixed features of the ZIPI and the ZIPII subfamilies. This small group of bacterial ZIPs is therefore referred to as 'ZIPI/II-like' proteins. The third eukaryotic ZIP-containing branch in Region III is composed of HsZIP9 and its orthologues in worm and insect, as well as ATX2, a Golgi-residing manganese transporter from Saccharomyces cerevisiae [88]. ZIP9 has been identified as a plasma membrane androgen receptor which activates G protein-involved cell signaling [89-91]. ZIP9 was previously classified as a member of the ZIPI subfamily, but recent phylogenetic studies have found that HsZIP9 and its orthologues are not clustered with the ZIPI proteins [76,77], which is also found in this work. In addition, the motifs at the BMC ('HxxxDG' and 'HKxPxx') are distinct from those of the ZIPI subfamily. As such, we propose to name an independent 'ZIP9' subfamily. Similar to the ZIPII proteins, the presence of a lysine residue in the second position of the motif in α5 likely excludes the possibility of a second metal-binding site. In the ZIP9-containing branch, there is a group of prokaryotic ZIPs (mostly from the phylum Bacteroidetes) with the motifs 'H(S/A)xxEG' and 'HxIPxx' in α4 and α5, respectively. We therefore name them 'ZIP9-like' proteins. Compared with the ZIPs in the other two regions, most of the Region III ZIPs may have only one metal-binding site in the BMC. A lysine residue in the M2 would prevent metal binding in the subfamilies of ZIPII, ZIPI/II-like, and ZIP9, as well as some members of the ZIP9-like subfamily. A bulky hydrophobic isoleucine residue in the ZIPI subfamily may interfere with metal binding, and an altered metal coordination geometry at the BMC can be anticipated.
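The lysine argument for Region III can be summarised by tabulating the α5 motifs quoted in the text and inspecting their second position. The following is a minimal sketch for illustration; the dictionary layout and function name are hypothetical, and the check covers only the consensus motifs, not individual sequences.

```python
# alpha5 consensus motifs for the Region III groups, as quoted in the text.
ALPHA5_MOTIFS = {
    "ZIPI": "HQxFEG",
    "ZIPII": "HKxxxA",
    "ZIPI/II-like": "HKxxES",
    "ZIP9": "HKxPxx",
    "ZIP9-like": "HxIPxx",
}


def second_position_status(motif: str) -> str:
    """Classify position 2 of an alpha5 motif as 'lysine', 'variable' or 'other'."""
    residue = motif[1]
    if residue == "K":
        return "lysine"
    if residue == "x":
        return "variable"
    return "other"


for name, motif in ALPHA5_MOTIFS.items():
    print(f"{name}: {motif} -> {second_position_status(motif)}")
```

In this tabulation ZIPII, ZIPI/II-like and ZIP9 carry a conserved lysine at the second position, whereas the ZIP9-like consensus is variable there, consistent with the statement that only some ZIP9-like members have a lysine at that site.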
In summary, this phylogenetic analysis reveals new insights into the evolutionary relationship among family members, particularly between prokaryotic and eukaryotic ZIPs. As ZIP9, ScZrt3, AtZTP29 and AtZTP50 cannot be assigned to any of the four established subfamilies (ZIPI, ZIPII, GufA and LIV-1), we propose that these ZIPs should be the representative members of new subfamilies. Importantly, these eukaryotic ZIPs appear to have the corresponding prokaryotic orthologues: (a) the ZIP9 and the ZIP9-like proteins are in the same branch; (b) AtZTP50 and a prokaryotic ZIP are in an isolated branch sandwiched between the ZupT subfamily and the GufA subfamily, which are two major prokaryotic subfamilies; (c) AtZTP29 belongs to the ZupT subfamily; and (d) the Zrt3 orthologues are more similar to some unclassified prokaryotic ZIPs than to any eukaryotic proteins in other subfamilies. As for the four established eukaryotic subfamilies, this work reveals that (a) the LIV-1 proteins have a higher degree of sequence identity with the LIV1-like proteins than with other eukaryotic ZIPs; (b) the proteins in the ZIPI and the ZIPII subfamilies are within the same major branch as the prokaryotic ZIPI/II-like proteins; and (c) ZIP11 is an orthologue of the prokaryotic GufA, which is consistent with the previous reports. Collectively, it seems that nearly all the eukaryotic proteins analysed in this work have close corresponding prokaryotic sisters, which suggests that the eukaryotic ZIP subfamilies have evolved from distinct prokaryotic ancestors which diverged from the last universal common ancestor. If this is the case, given that prokaryotic species have only a couple of ZIP genes per genome whereas eukaryotic species have far more ZIPs belonging to several different ZIP subfamilies, one would conclude that horizontal gene transfer likely happened from prokaryotes to eukaryotes multiple times, and that gene duplications in eukaryotes then further increased the number of ZIP genes per genome to today's level.
The FEBS Journal 288 (2021) 5805-5825 © 2020 Federation of European Biochemical Societies

Implications on drug discovery against cancer
It has long been known that the ZIPs are closely associated with cancer [92-94], but the patterns of ZIP dysregulation vary considerably, which presumably reflects the complicated roles of zinc in carcinogenesis, tumorigenesis, metastasis and tumour resistance. Studies on prostate cancer cells showed that ZIP1, ZIP2, and ZIP3 are downregulated, which is accompanied by a reduced amount of total zinc in tumour tissues [95-99]. A decreased zinc level was also observed in pancreatic adenocarcinoma, which was attributed to downregulation of ZIP3 [100,101]. In contrast, ZIP4 has been shown to be upregulated in pancreatic cancer cells and to play a key role in the proliferation of cultured cells and xenografts in animal models [102,103], which was reported to be mediated through the IL-6/STAT3 pathway [104]. Upregulation of ZIP4 has also been reported in other types of cancers, including hepatocellular carcinoma [105], oral squamous cell carcinoma [106], ovarian cancer cells [107], lung cancer cells [108], nasopharyngeal carcinoma [109], and glioma [110]. Given that ZIP4 is a high-affinity zinc transporter in charge of zinc absorption from diet and zinc reabsorption from urine, abnormally upregulated ZIP4 would provide cancer cells with an adequate amount of zinc to fulfill the needs for rapid growth and proliferation, whereas ZIP4 knockdown reduces zinc deficiency-induced apoptosis of cancer cells [111]. Broad expression in a variety of cancers and very specific expression in normal tissues (small intestine and kidney) make ZIP4 an attractive anti-cancer target. In addition, exosomal ZIP4 was shown to be a diagnostic biomarker of pancreatic cancer [112]. ZIP6 was initially identified in the 1990s as an estrogen-responsive factor upregulated in breast cancer tissues and a biomarker of this cancer [113-117].
Upregulated ZIP6 was also reported in many other types of cancer, including prostate cancer [118], pancreatic cancer [119], hepatocellular carcinoma [120], oesophageal squamous cell carcinoma [121], gastric adenocarcinoma [122], and colorectal cancer [123]. ZIP6 and ZIP10 form a heteromer essential for cell proliferation [73,124], and ZIP10 by itself has also been implicated in cancers [125,126]. ZIP7 is a key player in maintaining ER homeostasis [127-129], and pharmacological knockdown of ZIP7 leads to ER stress and a disrupted Notch signaling pathway, which is implicated in cancer [130]. It has been proposed that ZIP7 can be targeted against breast cancer [131,132], and recent studies revealed the critical roles of zinc ions and ZIP7 in B-cell development [133,134]. Although ZIP7 is absolutely required for B-cell maturation, a partial loss of ZIP7 activity is tolerated by most tissues [133], strongly suggesting that ZIP7 could be targeted for B-cell proliferative diseases.
The discovery and development of ZIP inhibitors as anti-cancer drugs are only just emerging. In a phenotype-based high-throughput screen targeting the Notch pathway, a specific ZIP7 inhibitor, which disrupts Notch trafficking and induces ER stress-mediated apoptosis, was identified and verified by affinity labeling [130]. Mapping the compound-resistant mutations on the structural model of ZIP7 suggested that the inhibitor binds within the transport pathway. The compound, named NVS-ZP7-4, is the first reported ZIP-specific small-molecule inhibitor. ZIP6- or ZIP10-specific antibodies were recently demonstrated to block mitosis of cultured cells, highlighting the necessity of zinc influx through the ZIP6/ZIP10 heteromer for mitosis and also offering a new therapeutic strategy to inhibit cell proliferation [124]. A ZIP6-specific monoclonal antibody conjugated with a microtubule-disrupting agent was reported to significantly reduce xenograft growth and is under development to treat metastatic breast cancer [135]. To identify potent and specific ZIP modulators (inhibitors or activators), it would be necessary to develop ZIP-targeting high-throughput screens, as has been done for human ZIP2 [136]. Given that the ZIPs share a conserved TMD, modulating the function of the ZIP of interest with high selectivity may be better achieved by targeting the regulatory domains (including the ECD and the IL2) or the components involved in transcriptional or post-translational regulation.

Conclusion and perspectives
The last few years have witnessed rapid progress in research on the ZIP family. This review highlights the aspects of structural biology and evolution, as well as implications on drug discovery against cancer. However, compared with other well-characterised metal transporter families, significant knowledge gaps are yet to be filled. Addressing these issues at the molecular level will provide invaluable knowledge for potential applications in biomedicine and biotechnology.
One fundamental issue is the transport mechanism, which can be broken down into three interrelated questions. Firstly, what is the structural basis of ZIP-mediated transport? Structural characterisation of alternative conformations would be essential to solve this problem. Secondly, what is the driving force that allows zinc ions to move through the membrane? As the electrochemical gradient of zinc ions across plasma membranes or organelle membranes favours zinc flux into the cytoplasm, one would expect that these importers may be passive transporters. However, for some family members under certain circumstances, a co-transport mechanism may allow more efficient transport and an additional means of activity regulation. Given that the ZIP family is highly diverse in function, mechanistic studies on more family members (particularly those from distinct subfamilies) would draw a clearer and bigger picture of this issue, which is deemed to be closely associated with the physiological and pathological functions and regulation of the ZIPs. Thirdly, how do different ZIPs specifically recognise and transport distinct metal substrates? Nature has created a variety of ZIPs which exhibit quite distinct substrate specificities to fulfill certain biological functions. For instance, ZIP4 is highly specific toward Zn²⁺, whereas ZIP8/ZIP14 cannot distinguish between Zn²⁺, Fe²⁺, Mn²⁺ and Cd²⁺. A given substrate specificity is achieved by exploiting the subtly different physicochemical properties of bioavailable metal ions (size, electron configuration, polarisability of the electron cloud, ligand exchange kinetics, etc.) [137], through billions of years of evolution, during which numerous amino acid substitutions occurred.
The molecular underpinnings determining substrate specificity must be encoded within the amino acid sequences, but it has yet to be established which residues are critically and potentially epistatically involved in defining substrate preference, and what the underlying biophysical mechanism is. Clarifying the molecular mechanism of substrate specificity will substantially help rational engineering of a transporter of interest. One potential application is to create new-to-nature plant metal transporters with desired properties for exclusion of toxic metals (especially cadmium) from food of plant origin, biofortification of beneficial metals and phytoremediation [33].
Another significant challenge centres on the regulation mechanisms of the ZIPs. It has been well documented that the eukaryotic ZIPs are tightly regulated by metal substrate (or non-substrate metal) availability at both the transcriptional and post-translational levels [7,19,50,54,63,66,67,138-147]. However, the molecular underpinnings of these sophisticated regulatory processes are still largely unknown. In particular, how metal sensing is coupled with the intermolecular interactions key for these processes is still an unsolved problem, requiring joint efforts from cell and molecular biology to biochemistry and structural biology. Post-translational modifications (such as phosphorylation) of the ZIPs may be at a crossroads between ZIP regulation, zinc signaling, and other canonical cell signaling pathways, and deserve more attention.