Dual Lineage-Specific Expression of Sox17 During Mouse Embryogenesis§


  • Eunyoung Choi,

    1. Center for Stem Cell Biology and Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, Tennessee, USA
  • Marine R-C. Kraus,

    1. Swiss Institute for Experimental Cancer Research, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
  • Laurence A. Lemaire,

    1. Swiss Institute for Experimental Cancer Research, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
    2. DanStem, University of Copenhagen, Copenhagen N, Denmark
  • Momoko Yoshimoto,

    1. Department of Pediatrics, Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, Indiana, USA
  • Sasidhar Vemula,

    1. Department of Pediatrics, Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, Indiana, USA
  • Leah A. Potter,

    1. Center for Stem Cell Biology and Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, Tennessee, USA
  • Elisabetta Manduchi,

    1. Penn Center for Bioinformatics and Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
  • Christian J. Stoeckert Jr.,

    1. Penn Center for Bioinformatics and Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
  • Anne Grapin-Botton,

    1. Swiss Institute for Experimental Cancer Research, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
    2. DanStem, University of Copenhagen, Copenhagen N, Denmark
  • Mark A. Magnuson

    Corresponding author
    1. Center for Stem Cell Biology and Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, Tennessee, USA
    • 9465 MRB-IV, 2213 Garland Avenue, Vanderbilt University School of Medicine, Nashville, Tennessee 37232-0494, USA

    • Telephone: 615-322-7006; Fax: 615-332-6645

  • Author contributions: E.C.: conception and design, collection and assembly of data, data analysis and interpretation, and manuscript writing; M.R-C.K., L.A.L., M.Y., and S.V.: collection and assembly of data; L.A.P.: data analysis and manuscript writing; E.M. and C.J.S.: data analysis and interpretation and manuscript evaluation; A.G-B.: provision of study material, financial support, and manuscript evaluation; M.A.M.: conception and design, data analysis, financial support, manuscript writing, and final approval of manuscript.

  • Disclosure of potential conflicts of interest is found at the end of this article.

  • §

    First published online in STEM CELLS EXPRESS August 3, 2012. Available online without subscription through the open access option.


Sox17 is essential for both endoderm development and fetal hematopoietic stem cell (HSC) maintenance. While endoderm-derived organs are well known to originate from Sox17-expressing cells, it is less certain whether fetal HSCs also originate from Sox17-expressing cells. By generating a Sox17GFPCre allele and using it to assess the fate of Sox17-expressing cells during embryogenesis, we confirmed that both endodermal cells and a portion of the definitive hematopoietic cells are derived from Sox17-positive cells. Prior to E9.5, the expression of Sox17 is restricted to the endoderm lineage. However, at E9.5 Sox17 is expressed in the endothelial cells (ECs) of the para-aortic splanchnopleural region that contribute to the formation of HSCs at a later stage. The identification of two distinct progenitor cell populations that express Sox17 at E9.5 was confirmed using fluorescence-activated cell sorting together with RNA-Seq to determine the gene expression profiles of the two cell populations. Interestingly, this analysis revealed differences in the RNA processing of the Sox17 mRNA during embryogenesis. Taken together, these results indicate that Sox17 is expressed in progenitor cells derived from two different germ layers, further demonstrating the complex expression pattern of this gene and suggesting caution when using Sox17 as a lineage-specific marker. Stem Cells 2012;30:2297–2308


Sox17, a member of the Sry-related high mobility group box (Sox) transcription factors, plays an essential role in the differentiation of many types of cells [1–4]. During mouse embryogenesis, Sox17 is first detected in extraembryonic visceral endoderm at embryonic day (E) 6.0 and in the endoderm of mid- to late-gastrula stage embryos (e.g., around E7.5) where it plays an essential role in organ formation [5]. Epithelial cells of the gut tube endoderm maintain Sox17 expression until approximately E8.5 as they undergo specification into distinct endoderm-derived organs [5–8]. Expression of Sox17 in endoderm, albeit transient, has led to this gene being widely used as a marker for definitive endoderm in studies using embryonic stem cells (ESCs) [9–12].

Within the developing endoderm, Sox17 is critical for specifying pancreatic progenitors in the ventral foregut endoderm. Mice that are globally deficient in Sox17 fail to undergo axis rotation at E8.5 and exhibit a severe defect of the posterior region of the embryo [5, 13]. Moreover, while Sox17-null embryos express Hnf3a/b and other endoderm-derived organ-specific markers such as Hhex and Cdx2, they do not express Pdx1, an essential factor for pancreatic outgrowth and development [5, 14]. Interestingly, Sox17 is expressed in a ventral foregut endoderm region from which the ventral pancreas and extrahepatobiliary ducts emerge, where it appears to be essential for the segregation between the liver and pancreatobiliary systems [15].

Sox17 also plays a crucial role in the maintenance of fetal and neonatal hematopoietic stem cell (HSC) identity [13, 16]. During vascular system development, Sox17 expression begins at approximately E8.5 [17]. However, the precise time at which Sox17 is expressed during embryonic hematopoiesis has not been as thoroughly investigated [18]. During embryogenesis, HSCs originate from the hemogenic endothelial cells (ECs) located on the aortic floor at E10.5 and migrate to the liver to expand in number [19–24]. Utilizing Tie2-Cre to conditionally eliminate Sox17 expression, Kim et al. [13] investigated the role of Sox17 in HSC development postmigration and found a marked impairment in the number of HSCs in the liver at E11.5. A role for Sox17 in fetal HSC maintenance was further supported by He et al. [16] who reported that the transient expression of Sox17 in adult bone marrow (BM) caused adult hematopoietic cells to adopt characteristics of fetal HSCs. However, while both of these studies demonstrated the importance of Sox17 in the maintenance of the fetal HSCs, neither explored whether Sox17-expressing ECs can give rise to hematopoietic cells or not.

In this study, we generated mice with a Sox17GFPCre allele and used it to identify Sox17-expressing cells and their progeny. At E9.5, we identified two distinct progenitor populations of Sox17-expressing cells, both an endoderm-derived ventral pancreatic epithelial cell population and a mesoderm-derived EC population that has hemogenic potential. Furthermore, we show that the two populations exhibit distinct gene expression signatures as assessed by whole transcriptome profiling.


Gene Targeting and Recombinase-Mediated Cassette Exchange

Mice containing a Sox17GFPCre allele were derived using both gene targeting and recombinase-mediated cassette exchange (RMCE). First, a Sox17 loxed cassette acceptor (LCA) allele was made by gene targeting. The targeting vector, made by BAC recombineering, replaced a 3.793 kb region of the gene containing exons 3–5 with a puromycin resistance-Δthymidine kinase (puΔTK) fusion gene driven by the mouse phosphoglycerate kinase (PGK) promoter and a kanamycin resistance gene driven by the bacterial EM7 promoter. The selection cassettes were flanked by tandemly oriented lox71 and lox2272 sites, two homology arms, and a PGK-diphtheria toxin A cassette for negative selection. The targeting vector was linearized using NotI and electroporated into a 129S6-derived mouse ESC line, and puromycin-resistant clones were isolated. Homologous recombination was verified by Southern hybridization with 5′ and 3′ probes following digestion with either SphI or SpeI. Second, an exchange cassette flanked by tandem lox66 and lox2272 sites was made with an enhanced green fluorescent protein (GFP) and Cre fusion gene in place of the coding sequences of Sox17, to replace the selectable markers in the Sox17LCA allele. In addition, the vector contained a PGK-hygromycin resistance (HygroR) cassette flanked by tandem flippase recognition target (FRT) sites. Following coelectroporation of the exchange vector and a Cre-expression plasmid, positively exchanged clones were identified by a staggered positive–negative selection strategy using both hygromycin and ganciclovir [25]. To identify correctly exchanged clones, polymerase chain reaction (PCR) was performed using the following primers: 5′-ACAGTCTTACACGCTACGGAT and 5′-CAAGACCTCTTGGGGAAATAG on the 5′ end (a); 5′-CAGAGGTATGCAGATCTCTGT and 5′-CATTCTGGTCAACATGTAAGGT on the 3′ end (b).

Mouse Strains

Chimeric mice containing the Sox17GFPCre allele were derived by the microinjection of clone 1G3:1C10 ESCs into blastocysts of C57BL/6J mice. After germline transmission, the HygroR cassette was removed by cross-breeding with ACTB:FLPe mice [26]. The Sox17GFPCre allele was maintained within a CD1 background for experiments. Mice with the R26ReYFP and R26RLacZ (Rosa26:LacZ (Gt(ROSA)26Sortm1Sor)) alleles were previously described [27]. Embryos were considered to be E0.5 at noon on the day a vaginal plug was detected. All experimental protocols were approved by the Vanderbilt Institutional Animal Care and Use Committee.


Five-micron frozen sections were preincubated with 5% normal donkey serum (NDS) for 30 minutes at room temperature, incubated with primary antibodies at 4°C overnight, and washed in phosphate buffered saline (PBS) with 0.1% Tween 20. Sections were then incubated with secondary antibodies at room temperature for 1 hour and washed in PBS containing 0.1% Tween 20. All antibodies were diluted in PBS containing 5% NDS and 0.1% Tween 20. The sources of the antibodies used are listed in supporting information Table S3. Images were acquired using either a Zeiss LSM510 or LSM710 inverted confocal microscope at the Vanderbilt Cell Imaging Shared Resource.

β-D-Galactosidase Staining

Sox17GFPCre/+;R26RLacZ/+ embryos were harvested at E7.5, E9.5, or E12.5 and fixed at room temperature for 10 minutes in 4% paraformaldehyde. 5-Bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-gal) staining was performed on 10-μm-thick serial transverse sections for 4 hours at 37°C. Images were acquired with a Leica DM5500B microscope equipped with a DFC 320 color camera.

Fluorescence-Activated Cell Sorting Analysis

Sox17GFPCre/+ embryos were identified by direct fluorescence using a Leica MZ 16 FA stereoscope. Pools of five to seven embryos were dissociated with AccuMax (Sigma, St. Louis, MO, http://www.sigmaaldrich.com/) and 5 μg/ml DNase I (Sigma) and passed through a 35 μm mesh filter into fluorescence-activated cell sorting (FACS) tubes (BD Biosciences, San Diego, CA, http://www.bdbiosciences.com). After centrifugation, the cells were resuspended in FACS staining buffer (R&D Systems, Minneapolis, MN, http://www.rndsystems.com), blocked using 1 μg/ml of mouse IgG at room temperature for 15 minutes, then immunolabeled with phycoerythrin (PE)-conjugated epithelial cell adhesion molecule (EpCAM) (G8.8) antibody (Santa Cruz Biotechnology, Santa Cruz, CA, http://www.scbt.com) at room temperature for 30 minutes. The cells were washed three times in FACS staining buffer, centrifuged, and resuspended in FACS staining buffer with 10 mM HEPES, 1 mg/ml bovine serum albumin (BSA), and 1% penicillin/streptomycin. 7-Aminoactinomycin D (Molecular Probes, Eugene, OR, http://probes.invitrogen.com) was added at a dilution of 1:1,000 to assess cell viability, and cells were analyzed using an LSRII (BD Biosciences, San Diego, CA, http://www.bdbiosciences.com). To isolate the Sox17-expressing cell populations, the midgut regions from 26–29 Sox17GFPCre/+ embryos at E9.5 were dissected (Supporting Information Fig. S5C) and pooled prior to cellular dissociation. Samples were prepared as described above, and cells were isolated using either an Aria II or III (BD Biosciences, San Diego, CA, http://www.bdbiosciences.com).

Fetal liver (FL) cells were obtained from dissected E12.5 embryo livers and then dissociated into a single cell suspension using a 200 μl large orifice tip pipette. BM was collected from the tibias and femurs of 6-week-old mice. The cells were flushed using 1× PBS through a 21-gauge needle more than 10 times and then filtered with a 100 μm cell strainer. Red blood cells were lysed using ACK lysis buffer (0.15 M NH4Cl, 1 mM KHCO3, 0.1 mM Na2EDTA, pH 7.4, sterilized using a 0.2 μm filter). The following antibodies were used for immunolabeling: anti-CD41 (MWReg30), anti-VE-Cad (BV13), anti-CD45 (30-F11), anti-CD19 (1D3), anti-CD11b/Mac1 (M1/70), anti-CD3 (17A2), and anti-Ter119. 4′,6-diamidino-2-phenylindole (DAPI) (1 μg/ml) was added to assess cell viability.

Hemogenic EC Culture

GFPCre+ or GFPCre− ECs (CD45−Ter119−CD41−VE-cad+CD31+ cells) from the yolk sac (YS) and para-aortic splanchnopleural (P-Sp) region of E9.5 Sox17GFPCre/+ embryos were sorted, and 3,000 cells were plated onto an OP9 stromal cell layer in six-well plates with stem cell factor (SCF) (10 ng/ml), Flt3-ligand (10 ng/ml), IL3 (10 ng/ml), IL7 (10 ng/ml), and thrombopoietin (TPO) (10 ng/ml). ECs from wild-type YS and P-Sp were also plated as a positive control. Floating cells in the coculture were collected every 3–4 days and analyzed by flow cytometry using anti-TER119 and Mac1 antibodies.

Total RNA Isolation, Amplification, Library Construction, RNA-Seq and Analysis

Total RNA was isolated using TRIzol LS (Invitrogen, Carlsbad, CA, http://www.invitrogen.com) containing 40 μg/ml mussel glycogen (Sigma). Following treatment with DNase (Ambion, Austin, TX, http://www.ambion.com), total RNA was column-purified using a DNA-Free RNA kit (Zymo Research). RNA integrity was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, http://www.agilent.com), and RNAs with an RNA integrity number > 7.0 were processed for RNA-Seq. Total RNA (10 ng) was amplified using the Ovation RNA-Seq system (NuGen, CA), and amplified cDNA (750 ng) was sheared using Covaris S2 system (Covaris, MA) and used to construct the cDNA library using the Illumina TruSeq DNA prep kit (Illumina, CA). The libraries were used to generate ≥110 base reads as single-end tags using an Illumina HiSeq 2000, and the HiSeq Control Software (HCS 1.4.8/RTA 1.12) was used for image analysis. Reads were mapped to mm9 genome with RNA-Seq unified mapper (RUM, v1.09, http://www.cbil.upenn.edu/RUM/) [28]. Expression was quantified as reads per kilobase of exon model per million mapped reads (RPKM). A summary of the mapped reads is in supporting information Table S4, and confidence values were determined by PaGE (http://www.cbil.upenn.edu/PaGE/) [29]. Differentially expressed genes were selected as those displaying at least a fourfold change in RPKM, ≥ 4 RPKM in one sample, and ≥80% confidence value, and these genes were clustered using Cluster program [30]. Gene Ontology analysis was performed using PANTHER (v7.0) [31].
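The RPKM quantification and the selection criteria for differentially expressed genes described above can be illustrated with a minimal sketch. It is not the RUM or PaGE implementation: the function names, the treatment of a zero RPKM value as an unbounded fold change, and the idea of passing the PaGE confidence in as a precomputed number are all assumptions made for illustration.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and mapped read total must be positive")
    # counts / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def is_differentially_expressed(
    rpkm_a: float,
    rpkm_b: float,
    confidence: float,      # hypothetical input: confidence value reported by PaGE
    min_fold: float = 4.0,  # at least a fourfold change in RPKM
    min_rpkm: float = 4.0,  # >= 4 RPKM in at least one sample
    min_conf: float = 0.80, # >= 80% confidence value
) -> bool:
    """Apply the three selection criteria stated in the text to one gene."""
    high, low = max(rpkm_a, rpkm_b), min(rpkm_a, rpkm_b)
    if high < min_rpkm or confidence < min_conf:
        return False
    if low == 0.0:
        # Assumption: a gene absent from one population counts as passing
        # the fold-change test, since the ratio is unbounded.
        return True
    return high / low >= min_fold


if __name__ == "__main__":
    # Made-up numbers: 200 reads on a 2 kb exon model, 10 million mapped reads.
    print(rpkm(200, 2000, 10_000_000))
    print(is_differentially_expressed(40.0, 8.0, confidence=0.9))
```

Keeping the thresholds as keyword arguments makes it easy to see how the size of the selected gene set depends on each criterion.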

Metadata and processed data for the RNA-Seq are available at http://genomics.betacell.org. The RNA-Seq data has been deposited at ArrayExpress (accession number E-MTAB-970) and the Sequence Read Archive (accession number ERP001235).

Semiquantitative Real-Time PCR

Real-time PCR (RT-PCR) was performed using 50 ng of amplified cDNA template used for RNA-Seq. Sox17 transcript variants were detected using the following primers: 5′-GGATACGCCAGTGACGACCA with 5′-CGTTCGTCTTTGGCCCACAC (a) or 5′-TCATGCGCTTCACCTGCTTG (b); 5′-ATGGCCCACTCACACTGCTG with 5′-ATGTAGCTCTCCTGCCTCTC (c) or 5′-CGTTCGTCTTTGGCCCACAC (d).


Derivation of Mice with a Sox17GFPCre Allele

To explore the stage-specific expression and lineage of Sox17-expressing cells, we derived mice that express a GFPCre fusion protein under control of the endogenous Sox17 gene locus through gene targeting and RMCE [32] (Fig. 1A). First, we performed gene targeting to generate mouse ESCs containing a Sox17LCA allele. Two of 84 mouse ESC clones that survived puromycin selection were correctly targeted as confirmed by Southern hybridization (Fig. 1B). Second, we generated an exchange cassette that replaced coding sequences in the Sox17 gene with a GFPCre fusion protein. Following RMCE into the Sox17LCA allele, the exchange was confirmed using PCR across both the lox66/71 and lox2272 sites, and one clone (1G3:1C10) was used to generate mice containing the Sox17GFPCre(+HygroR) allele (Fig. 1C). After germline transmission, mice containing the Sox17GFPCre(+HygroR) allele were bred with FLPe-expressing mice to delete the FRT-flanked hygromycin resistance cassette, thereby establishing the Sox17GFPCre line.

Figure 1.

Generation of the Sox17GFPCre allele. (A): Diagram of the Sox17 locus, targeting vector, Sox17LCA allele, GFPCre exchange cassette, Sox17GFPCre(+HygroR), and Sox17GFPCre allele. A targeting vector for the mouse Sox17 gene was constructed where sequence including exons 3–5, which contain the coding region of Sox17, was replaced with a puromycin resistance-Δ-thymidine kinase fusion gene (puΔTK) and an EM7-driven kanamycin resistance gene (KanR) flanked by lox66 (open triangle) and lox2272 (black triangle) sites. The GFPCre exchange cassette was flanked by lox71 (gray triangle) and lox2272 sites and contained a phosphoglycerate kinase-driven hygromycin resistance gene (HygroR) flanked by flippase recognition target sites (open circles). Following exchange into Sox17LCA-containing mouse embryonic stem cells by recombinase-mediated cassette exchange (RMCE), mice containing the Sox17GFPCre(+HygroR) allele were bred with FLPe-expressing transgenic mice, thereby generating the final Sox17GFPCre allele. Polymerase chain reaction (PCR) amplifications for 5′ and 3′ screening of the Sox17GFPCre(+HygroR) allele are depicted as a and b. (B): Southern blot analysis of genomic DNA from puromycin-resistant Sox17LCA ESCs. DNA was digested with SphI or SpeI and hybridized with a 5′ or 3′ probe as indicated in panel (A). Clones 1C1 and 1G3 were correctly targeted, as shown by the presence of a 12.3 kb and an 11.1 kb band on the 5′ and 3′ ends, respectively. Clone 1G3 was used for RMCE. (C): PCR screening of Sox17GFPCre(+HygroR) exchanged clones. The proper exchange of clone 1C10 was identified by 660 and 1,006 basepair (bp) bands on the 5′ and 3′ ends, respectively. Abbreviations: DT-A, Diphtheria toxin A; LA, long arm; LCA, loxed cassette acceptor; SA, short arm; WT, wild type.

To determine whether expression of the Sox17GFPCre reporter allele faithfully recapitulates that of the wild-type Sox17 allele, we first performed whole mount fluorescence microscopy of Sox17GFPCre/+ mouse embryos. In agreement with previous reports [5, 15], we observed fluorescence in the extraembryonic region beginning at E6.5 (Fig. 2A-a,b) and in the intraembryonic region beginning at E7.5 (Fig. 2A-c,d). A three-dimensional reconstruction of E7.5 embryo images revealed that GFPCre-expressing cells were highly abundant and clustered together in the definitive endoderm (DE) rather than the visceral endoderm (supporting information Fig. S1). At E8.5, between the fourth and ninth somite (S) stage, fluorescence was localized to the foregut endoderm (Fig. 2A-e,f), and histological analysis revealed coexpression with Sox17 (Fig. 2B-a). At E9.5 (> 20S), fluorescence throughout the endoderm was greatly diminished, but localized expression was evident in the ventral pancreas region (Fig. 2A-g,h and 2B-b). At E9.5, we also observed cells expressing both GFPCre and Sox17 in the dorsal aorta (Fig. 2B-c). This finding indicates that Sox17 is not only expressed in the ventral pancreatic epithelium at E9.5 but is also present in cells in the dorsal aorta.

Figure 2.

Expression pattern of Sox17GFPCre/+ during development. (A): GFPCre expression was observed from E6.5 to E9.5 in Sox17GFPCre/+ embryos. At E6.5, GFPCre was observed only in the extraembryonic region (a and b). Conversely, at E7.5, it was seen in the embryonic region (c and d), especially the definitive endoderm area (arrow). At E8.5, GFPCre was observed in the gut tube area (e and f) with expression in the foregut region marked (arrow). At E9.5, the expression was diminished in the gut (g and h) but was still seen in the ventral pancreatic bud (arrow). Scale bar = 100 μm. (B): Immunolabeling revealed colocalization of both GFPCre and Sox17 in E8.5 and E9.5 Sox17GFPCre/+ mouse embryos. At E8.5, GFPCre and Sox17 were coexpressed in the foregut endoderm (a) and at E9.5 in the ventral pancreatic bud (b) and the para-aortic splanchnopleural area (c). White boxed areas depict regions enlarged. Scale bar = 50 μm. Abbreviations: A, anterior; Di, distal; D, dorsal; FG, foregut; GFP, green fluorescent protein; P, posterior; Pr, proximal; P-Sp, para-aortic splanchnopleural; V, ventral; VP, ventral pancreatic bud.

Two Types of Sox17GFPCre-Expressing Cells During Organogenesis in the Mouse Embryo

At E9.5, Sox17 was detected primarily in the ventral pancreatic epithelium but not in the liver epithelium (Fig. 3A). GFPCre was strongly expressed in the tip area of the ventral pancreatic epithelium (Fig. 3A, left arrow) and at a lower level in the Pdx1-high area of ventral pancreatic epithelium (Fig. 3A, right arrow) [15]. Several GFPCre-expressing cells were also located in the region of the liver epithelium; however, they were not colocalized with the liver marker Hnf4a, indicating that these GFPCre-expressing cells were separate but intermingled with the developing liver epithelium at this stage (Fig. 3A). We also observed GFPCre-expressing cells localized within the dorsal aorta (Fig. 3B) [17] and in the vicinity of the neural tube area (Fig. 3C). While these GFPCre-expressing cells were not colocalized with Sox2, an ectoderm marker, Foxa2, a floor plate marker, or Sox10, a marker of neural crest cells, they were colocalized with platelet endothelial cell adhesion molecule (PECAM), an EC marker (Fig. 3E, arrow) [17]. Conversely, the GFPCre-expressing cells in the ventral pancreatobiliary epithelium colocalized with EpCAM, an epithelial cell marker (Fig. 3D, arrow). These data indicate that Sox17 is expressed in two distinct progenitor cell populations at E9.5: (a) an epithelial cell population found in the ventral foregut epithelium that gives rise to the ventral pancreas, extrahepatic ducts, and gall bladder and (b) an EC population found in the P-Sp area, which includes the dorsal aorta.

Figure 3.

Two distinct types of Sox17GFPCre-expressing cells at E9.5. (A): Immunolabeling revealed that GFP expression at E9.5 in the VP was colocalized with Pdx1 (arrows). Several GFPCre-expressing cells were found in the Li but did not colocalize with Hnf4α (arrowheads). (B): GFP was detected in the DA; however, it did not colocalize with Sox2, an early ectoderm marker, or Foxa2, a floor plate marker (arrow). (C): GFP was detected in the NT; however, it did not colocalize with either Sox2 or Sox10, a neural crest cell marker (arrow). (D): GFP in the VP colocalized with EpCAM (arrow) but not with PECAM (arrowheads). (E): GFP in the NT colocalized with PECAM (arrow). White boxed areas depict regions enlarged. Scale bar = 50 μm. Abbreviations: DA, dorsal aorta; DAPI, 4′,6-diamidino-2-phenylindole; EpCAM, epithelial cell adhesion molecule; FP, floor plate; GFP, green fluorescent protein; Li, liver bud region; NT, neural tube; NC, neural crest cells; PECAM, platelet endothelial cell adhesion molecule; VP, ventral pancreatic bud.

Two Distinct Cell Populations Expressing Sox17 Are Alternatively Derived from Two Different Origins

To trace the lineage of Sox17-expressing cells, the Sox17GFPCre mice were crossed with either a Gt(ROSA)26Sortm1Sor (R26RLacZ) or ROSA26-eYFP (R26ReYFP) cre-reporter line [27, 33] and the locations of either LacZ or yellow fluorescent protein (YFP), respectively, were determined (Fig. 4; supporting information Fig. S2). Consistent with previous studies, LacZ was detected as early as E7.5 in extraembryonic and embryonic visceral endoderm as well as the definitive endoderm where it was detected primarily in the proximal endoderm (data not shown) [34]. Previous studies have also shown that Sox17 plays a key role in determining endoderm and pancreas fates [5, 35]. Consistent with this [15, 34, 36], we found that at E9.5 all endoderm-derived epithelial cells, from the branchial arch endoderm to the hindgut, were derived from cells that once expressed Sox17 (supporting information Fig. S2A-S2D). At E9.5, YFP was coexpressed with EpCAM or Foxa2 within the gut endoderm and the pancreatic epithelium (Fig. 4A; supporting information Fig. S3A). At E12.5, we confirmed that endoderm-derived organs, including the thyroid, thymus, parathyroid, esophagus, trachea, lungs, stomach, liver epithelium and endothelium (but not blood cells), dorsal and ventral pancreas, small intestine, cecum, and large intestine were labeled by LacZ (supporting information Fig. S2E-S2G). We also observed YFP labeling in Ptf1a-expressing pancreatic multipotent progenitor cells (MPCs) at E11.5. Additionally, as observed at E15.5, acinar, ductal, endocrine progenitor, and insulin-expressing cells were derived from a Sox17-expressing lineage (supporting information Fig. S3B).

Figure 4.

Fate tracing of Sox17-expressing cells at E9.5. (A): In E9.5 R26ReYFP;Sox17GFPCre mouse embryos, YFP was detected in the VP and GT [37] and displayed colocalization with EpCAM. YFP also colocalized with Sox17GFPCre-expressing cells in the VP (arrow). (B): YFP was detected in the DA and colocalized with PECAM. Some YFP-positive cells also colocalized with Sox17 (arrow). (C): YFP colocalized with PECAM in the endocardium of heart (H) (arrow); however, it did not colocalize with VCAM in the myocardium (arrowheads). White boxed areas depict regions enlarged. Scale bar = 50 μm. Abbreviations: DA, dorsal aorta; EpCAM, epithelial cell adhesion molecule; GT, gut tube; H, heart; VP, ventral pancreatic bud; VCAM, vascular cell adhesion protein; YFP, yellow fluorescent protein.

In addition to contributing to endoderm-derived lineages, we observed YFP-expressing cells in the endothelia, such as the dorsal aorta and veins, at E9.5 (Fig. 4B; supporting information Fig. S4A). An analysis of YFP-positive cells in the heart, which is composed of both ECs and mesenchymal cells, showed that YFP was selectively observed in the PECAM-positive ECs (Fig. 4C, arrow) but not in the vascular cell adhesion protein (VCAM)-positive cardiac myocytes, representing cells of the mesenchymal lineage [38] (Fig. 4C, arrowheads). These findings indicate that Sox17 is not expressed in the mesoderm progenitor cells that give rise to both heart endothelium and mesenchyme [39] but is expressed after the mesoderm is specified into either the mesenchymal or endothelial lineage.

Sox17 Is Expressed in Hemogenic ECs at E9.5

While it has been previously shown that Sox17 has a role in the maintenance of fetal and neonatal HSCs [13], it is not clear from these studies whether Sox17 is expressed in the hemogenic ECs in the YS and in the para-aortic splanchnopleural (P-Sp) region. Given that Sox17 was detected in the ECs of the P-Sp region, we sought to determine whether the Sox17-expressing cells were hematopoietic in nature. At E9.5, many GFPCre-expressing cells along the aortic wall coexpressed c-Kit (Fig. 5A, arrows). However, GFP was not detected in CD41+c-Kit+ hematopoietic cells (HPCs) [40] (Fig. 5A, arrowheads), indicating that Sox17 expression occurs only in the endothelial stage and is not sustained as the cells differentiate into HPCs. In contrast, some Sox17-derived cells, as indicated by YFP expression, were detected among CD41+c-Kit+ HPCs (Fig. 5B, arrowheads), indicating that CD41+c-Kit+ HPCs were derived from Sox17-expressing cells. Consistent with this, a few CD41+c-Kit+ cells (∼1%) were produced when GFPCre+ ECs were cultured on OP9 cells (data not shown). To determine whether Sox17-expressing ECs have hemogenic potential, we sorted GFPCre+ ECs (CD45−CD41−Ter119−CD31+ or VE-cad+ cells), which accounted for around 40%–50% of the ECs, from E9.5 YS and P-Sp and plated them on an OP9 stromal cell layer along with stimulatory cytokines. After 8 days of coculture, large cobblestone-appearing areas were observed that were shown by FACS to be Ter119+ or Mac1+ erythro-myeloid colonies (Fig. 5C). Of note, GFPCre− ECs also produced erythro-myeloid cells in the OP9 coculture (data not shown). Taken together, these data indicate that Sox17 is expressed in the hemogenic ECs.

Figure 5.

Sox17-expressing endothelial cells exhibit hemogenic potential. (A): In E9.5 Sox17GFPCre/+ embryos, GFP was detected in the P-Sp area. GFP colocalized with c-Kit-positive cells in the aortic floor (arrows); however, GFP was diminished or not detected in CD41-positive and/or c-Kit positive hematopoietic cells (arrowheads). White boxed areas depict regions enlarged. Scale bar = 50 μm. (B): In E9.5 R26ReYFP;Sox17GFPCre embryos, YFP was detected in the P-Sp area. YFP colocalized with c-Kit- and CD41-positive cells (arrowheads). (C): Both erythrocytes and myeloid (Ter119+ or Mac1+) cells were differentiated from wild-type and Sox17-expressing endothelial cells obtained from E9.5 YS and P-Sp. (D): YFP expression in hematopoietic cells from E9.5 embryos, E12.5 FL, and adult mouse BM of R26ReYFP;Sox17GFPCre mice (n ≥ 3). YFP expression was analyzed with hematopoietic cell marker-gated cells. Abbreviations: BM, bone marrow; FL, fetal liver; GFP, green fluorescent protein; P-Sp, para-aortic splanchnopleural; VE-Cad, vascular endothelial cadherin; YFP, yellow fluorescent protein; YS, yolk sac.

Use of Sox17 Lineage Tracing to Show the Development of Definitive Hematopoiesis In Vivo

We next sought to determine how many hematopoietic cells are derived from Sox17-expressing cells in vivo. To do so, we determined the number of Sox17-lineaged cells in E9.5 embryos, E12.5 FL, and in adult BM (Fig. 5D). While only 25.1% ± 12.3% of CD41+ cells, 9.5% ± 2% of Ter119+ cells, and 21.4% ± 11.1% of Mac1+ cells were YFP+ in the E9.5 embryos (Fig. 5D, upper panel), the number of YFP+ cells was dramatically increased in the E12.5 FL. Indeed, up to 81.9% ± 2.2% of the CD41+ cells, 39.5% ± 5.5% of the Ter119+ cells, and 81.6% ± 4.1% of the Mac1+ cells were lineage-labeled (Fig. 5D, middle panel). Furthermore, YFP+ cells were also detected among CD19+ cells (55.8% ± 10.4%) and CD3+ cells (58.9% ± 1.1%), representing lymphocytes. YFP+ cells were also detected among the vascular endothelial-cadherin (VE-Cad)+ cells (81.9% ± 2.2%). These results suggest that Sox17 is expressed in VE-cad+ ECs and is linked to the production of definitive hematopoietic cells, perhaps through hemogenic ECs, based on the previous reports [41–43]. In adult BM, all CD45+ hematopoietic cells were YFP+, clearly demonstrating that all the blood cells in the adult BM are derived from Sox17-expressing cells. This is consistent with the previous report that HSCs in the FL express Sox17 [13], and that FL HSCs migrate into the BM and support hematopoiesis, as is commonly considered. However, only a low percentage of Ter119+ cells were YFP+ in adult BM, as in E12.5 FL. Since definitive erythroid cells are enucleated, these cells may lose the expression of YFP as they mature. Also, approximately 17% of the CD3+ cells were YFP−. While this may suggest that T-cell lymphopoiesis occurs in Sox17− hemogenic ECs in the YS and P-Sp before emergence of HSCs [43], this finding could also be due to a low amount of GFPCre in the hemogenic ECs resulting in only partial recombination of the R26ReYFP reporter allele. In either case, Sox17 marks a part of hemogenic ECs, and Sox17 lineage tracing shows the progression of definitive hematopoiesis in the embryo.

Sox17-Expressing ECs and Epithelial Cells Exhibit Transcriptional Differences in Distinct Protein Classes

To further demonstrate differences between the two Sox17-expressing populations, we isolated both cell populations using FACS and performed whole transcriptome sequencing (RNA-Seq). Using E9.5 Sox17GFPCre/+ mouse embryos, the epithelial cells and ECs were separated based on expression of EpCAM, an epithelial cell marker (supporting information Fig. S5A). From whole embryos, fewer than 1% of the cells obtained expressed GFPCre (supporting information Fig. S5A), of which only 7.0% ± 3.2% were Sox17-expressing epithelial cells based on the expression of both GFP and EpCAM (supporting information Fig. S5B). Since endodermal expression becomes restricted to the ventral posterior foregut at E9.5, we dissected the mid region of the embryo to increase the yield of Sox17-expressing cells (supporting information Fig. S5C; Fig. 6A). Even so, the total number of cells was low, thereby requiring RNA amplification prior to sequencing. As summarized in supporting information Table S1, 62.5%–75.6% of the RNA-Seq reads obtained were aligned to the mouse genome (mm9) [28]. The profiles from three biological replicates showed high reproducibility (supporting information Table S2). Interestingly, the profiles of the EpCAM+ and EpCAM− Sox17-expressing cells were also highly correlated, implying that the global gene expression profiles of the two populations are similar.
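The reproducibility and similarity statements above rest on comparing whole expression profiles. The text does not say which statistic was used, so the following is only a minimal sketch that assumes a Pearson correlation on log2-transformed expression values with a pseudocount of 1. The function names and the example values are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def log_profile(values, pseudocount=1.0):
    """log2-transform expression values; the pseudocount (an assumption) avoids log(0)."""
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical expression values for the same five genes in two replicates.
replicate_1 = [0.0, 12.5, 150.0, 3.2, 48.0]
replicate_2 = [0.1, 11.0, 162.0, 2.9, 51.5]

r = pearson(log_profile(replicate_1), log_profile(replicate_2))
print(f"Pearson r on log2 profiles: {r:.3f}")
```

A high correlation of this kind describes the profile as a whole. It does not rule out large differences in a limited set of genes, which is why a separate differential expression analysis is still needed.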

Figure 6.

Comparison of differentially expressed genes in two populations. (A): Fluorescence-activated cell sorting was used to isolate GFP/EpCAM co-positive cells representing ventral pancreatic epithelial cells (EpCAM+) and GFP+/EpCAM− cells representing hemogenic endothelial cells (ECs) (EpCAM−) from dissected E9.5 Sox17GFPCre/+ embryo midguts. (B): The distinct difference between the EpCAM+ and EpCAM− cell populations is evident in the heat map, which displays the reads per kilobase of exon model per million mapped reads (RPKM) values for 321 genes from three biological replicates for either EpCAM+ or EpCAM− cells. Black corresponds to an RPKM value of 0, and the brightest red corresponds to an RPKM value of ≥100. (C): The selected transcripts were clustered according to protein class, and the fold change (natural log scale) in gene expression in EpCAM+ cells as compared to EpCAM− cells is shown. **, confidence value > 0.9; *, confidence value > 0.85. Abbreviations: EpCAM, epithelial cell adhesion molecule; GFP, green fluorescent protein.

To identify transcripts differentially expressed between the two populations, we examined the transcriptional levels of 28,683 genes. While the overall gene expression profiles were similar (supporting information Table S2), the two populations were distinguished by many differentially expressed genes with high confidence values (supporting information Table S4). To more clearly illustrate the differences, we selected 321 genes that displayed a fourfold or greater change in expression (supporting information Table S5) and performed a cluster analysis. By doing so, differences between the epithelial cells and ECs became readily apparent (Fig. 6B). Furthermore, many of the differentially expressed genes are important for ventral foregut development, hematopoiesis, and/or the regulation of cell fate and signal transduction based on their gene ontology annotations in PANTHER [44].
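The selection step described above can be illustrated with a short calculation. This is a minimal sketch, not the pipeline used in the study: it applies the standard RPKM definition and keeps genes whose expression differs at least fourfold in either direction. The pseudocount, the function names, and the gene names and values are all hypothetical, and the confidence values mentioned in the text are not modeled.

```python
import math


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def ln_fold_change(rpkm_a, rpkm_b, pseudocount=1.0):
    """Natural-log fold change of a over b; the pseudocount (an assumption) avoids division by zero."""
    return math.log((rpkm_a + pseudocount) / (rpkm_b + pseudocount))


def select_genes(epcam_pos, epcam_neg, min_fold=4.0):
    """Return {gene: ln fold change} for genes changed at least min_fold in either direction."""
    threshold = math.log(min_fold)
    selected = {}
    for gene, value_pos in epcam_pos.items():
        lfc = ln_fold_change(value_pos, epcam_neg[gene])
        if abs(lfc) >= threshold:
            selected[gene] = lfc
    return selected


# Hypothetical mean RPKM values per population (placeholder gene names).
epcam_pos = {"GeneA": 79.0, "GeneB": 1.0, "GeneC": 20.0}
epcam_neg = {"GeneA": 4.0, "GeneB": 63.0, "GeneC": 18.0}

for gene, lfc in select_genes(epcam_pos, epcam_neg).items():
    direction = "higher in EpCAM+" if lfc > 0 else "higher in EpCAM-"
    print(f"{gene}: ln fold change = {lfc:.2f} ({direction})")
```

On the natural log scale, a fourfold change corresponds to an absolute value of about 1.39, so positive values mark genes enriched in the EpCAM+ cells and negative values mark genes enriched in the EpCAM− cells.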

In the transcription factor cluster (PC00218), there were 11 upregulated and 26 downregulated genes in Sox17-expressing epithelial cells compared to the ECs (Fig. 6C). The upregulated transcripts primarily consisted of endoderm and pancreas development-related genes, such as Onecut1, Foxa3, Foxa2, Hnf1b, Sox9, and Prox1. However, other transcription factors, including Cited1, Tcea3, Id2, Lin28a, and Aes, were also highly expressed in the Sox17-expressing epithelial cells. Conversely, most of the downregulated genes, including Gata2, Ets1, Elk3, Lmo2, Mef2c, Klf2, Sox7, Tal1, and Sox18, are known to play a role in endothelial and HSC development. In the receptor (PC00197) and cell adhesion molecule (PC00070) clusters, we found 11 upregulated and 19 downregulated genes in Sox17-expressing epithelial cells as compared to the ECs (Fig. 6C). Tacstd1 (EpCAM), which was used to isolate the two Sox17-expressing cell populations, was one of the most abundant transcripts in epithelial cells, and several epithelial cell-related genes, such as Cdh1, Tmprss2, Emb, and Bcam, were also abundantly expressed. Paqr9, a member of the progestin and adipoQ receptor family, and Dlk1, a member of the epidermal growth factor-like family, were also highly expressed; however, their role in pancreas development has not been elucidated. Conversely, in the Sox17-expressing ECs, many genes important for EC development or hematopoiesis, such as Kdr, Eltd1, Eng, Cd97, Cd34, Esam, and Nid1, were highly expressed. In the signaling molecule cluster (PC00207), we found 13 upregulated and 13 downregulated genes in Sox17-expressing epithelial cells as compared to the ECs (Fig. 6C). In agreement with studies highlighting the roles of Wnt and transforming growth factor beta (TGFβ) signaling in pancreas development [45, 46], we found that Wnt signaling-related genes, such as Fzd7 and Sfrp5, and TGFβ signaling-related genes, such as Npnt and Bmp7, were abundant in epithelial cells. 
Similarly, Habp2, a hepatocyte growth factor activator-like protein and endoderm-enriched gene, and Sdc4, which is involved in organogenesis of the kidney, were highly expressed. These data not only confirm the existence of two distinct Sox17-expressing progenitor populations at E9.5 but also reveal numerous transcription factors and cell surface molecules that may be useful for distinguishing the two populations.

Alternative Splicing Variant of Sox17 Gene Is Expressed in Two Populations

Previously, tissue-specific splicing variants of Sox17 were identified in mouse adult testis and lung [47], and the presence of alternative transcription start sites has been suggested during embryonic development [34, 36]. An analysis of the RNA-Seq reads mapped to the Sox17 locus revealed differences in exon abundance (supporting information Fig. S6). Thus, to further explore whether the Sox17-expressing epithelial cells and ECs express unique Sox17 mRNA transcripts (Fig. 7A), we performed exon-specific RT-PCR (Fig. 7B). We found that the fourth and fifth exons, which contain the coding sequences, were amplified from cDNA from both Sox17-expressing epithelial cells and ECs (Fig. 7B-a,b). However, sequences in exons one, two, and three were amplified only from the Sox17-expressing endothelial cDNA (Fig. 7B-c,d). These data indicate that the Sox17 pre-mRNA is alternatively processed in the two different cell types.

Figure 7.

Isolation of Sox17-expressing cells and identification of alternative variants of the Sox17 transcript in EpCAM+ and EpCAM− cells. (A): Schematic of two transcript variants of Sox17 (T1 and T2). Dark gray boxes show coding regions. Black lines (a–d) indicate the regions amplified by polymerase chain reaction. (B): Sequences within the coding regions were amplified from both the EpCAM+ and EpCAM− samples (a and b); however, regions spanning the first three exons were amplified only from the EpCAM− sample (c and d). The band in the EpCAM+ lane in d is nonspecific (expected band size = 316 bp). Abbreviation: EpCAM, epithelial cell adhesion molecule.


Our study indicates the existence of two distinct Sox17-expressing cell lineages in the P-Sp region of the mouse at E9.5. The first population of Sox17-expressing cells, DE (Fig. 2), goes on to form endoderm-derived organs, such as lung, stomach, pancreas, liver, and intestine. The second population is mesoderm-derived endothelium, which goes on to contribute to the derivation of HPCs. While the expression of Sox17 in the ECs and fetal HSCs has been previously reported using reporter mouse lines [13, 34, 48], it has not been clear when Sox17-expressing ECs and fetal HSCs begin to emerge.

Most mature blood cells at E9.5 are considered to be primitive erythro-myeloid cells derived directly from mesoderm, not ECs [49, 50]. Thus, the YFP+ hematopoietic cells detected in this study may reflect definitive hematopoiesis from hemogenic ECs, although there is a possibility that primitive hematopoietic cells are derived from Sox17+ hemogenic ECs. At E12.5, when definitive hematopoiesis is increased, YFP+ cells were detected not only among erythro-myeloid cells but also among both CD19+ and CD3+ cells. This indicates that Sox17-expressing cells also give rise to lymphocytes. However, the fact that about half of the CD19+ and CD3+ cells were YFP− in the lineage tracing experiment suggests the need for more detailed studies of the biological potential of Sox17-expressing ECs. Nonetheless, when taken together, our findings indicate that Sox17-expressing ECs have, at a minimum, the potential to form erythro-myeloid cells.

To better understand the characteristics of the two populations, we utilized EpCAM immunolabeling to isolate Sox17-expressing epithelial cells and ECs from dissected mouse embryos. Using whole transcriptome profiling, we determined that the gene signatures for the epithelial cells and ECs exhibit specific differences, and that the Sox17 mRNA in these two populations is alternatively processed (Fig. 7). However, the coding sequences for Sox17 in the two variants are unaffected. While it is well known that alternative splicing occurs in many genes during embryonic development [51] and is an important means of regulating gene expression, it is not known whether the two different Sox17 mRNAs that distinguish the ventral foregut progenitor and hemogenic EC populations have any functional significance.

Previous studies have reported the failure of normal gut tube and blood cell formation due to Sox17 deficiency [5, 13]. While some endoderm-derived organs, such as the liver and thyroid, are not dependent on Sox17 for development, the specification of other organs, such as the pancreas, is dependent on Sox17 [5]. The transcriptome profiling we performed revealed the expression of many different genes, in addition to Sox17, known to be involved in endoderm and foregut development in the ventral pancreatobiliary epithelium at E9.5 (Fig. 6C). In endothelial development, previous studies have shown that the vascular system is unaffected in Sox17-null embryos at E8.25 [17]; however, at E11.5, the number of HSCs is significantly reduced in Sox17-null embryos [13].

In our analysis of differentially expressed genes, we found that Sox7 and Sox18, the other members of the Sox-F family, are more abundant in the Sox17-expressing ECs at E9.5 (Fig. 6C). While these Sox-F members have redundant roles in vascular development [17], zebrafish experiments using morpholinos to inhibit translation of Sox7 and Sox18 have suggested that these two factors have no impact on the generation of HSCs [52]. However, little is known about the role of Sox-F family members in HSC generation in mice. Our lineage tracing study shows that Sox17 is expressed in hemogenic ECs that give rise to definitive HPCs. To date, no in-depth studies have been performed to assess the role of the other Sox-F members in intraembryonic or extraembryonic hematopoiesis. Here, our study reveals that Sox17, one Sox-F member, contributes to embryonic HPC generation.

In spite of the highly similar global expression profiles for the two Sox17-expressing populations, the transcriptional profiles clearly revealed distinct differences, as would be expected for two distinct cellular lineages. Many pancreatic and hematopoietic cell-type-specific genes distinguish the two populations, suggesting that combinatorial expression of tissue-specific transcription factors is crucial for the fate decision of Sox17-expressing cells. First, numerous pancreas-related genes are highly expressed in Sox17-expressing epithelial cells (Fig. 6C). Specifically, these cells express not only Onecut1 and Sox9, genes important for the maintenance of pancreatic progenitor cell fate, but also Lin28a, a gene important for cell pluripotency [53–55]. The combinatorial expression of these transcription factors with Sox17 in the epithelial cells may be crucial for the specification of pancreatic progenitor cells. Furthermore, we identified high expression of Fzd7 (Frizzled-7), a Wnt receptor, in the Sox17-expressing epithelial cells. Given that Sox17 regulates transcription of endodermal target genes through the β-catenin pathway and that Wnt/β-catenin signaling is required for the proliferation of pancreatic progenitor cells [56, 57], Fzd7 may mediate Wnt/β-catenin signaling during the segregation of the pancreas from the gut and biliary system within the Sox17-expressing epithelium. Second, a different set of genes is expressed in the Sox17-expressing ECs. Specifically, we observed the expression of transcription factors such as Gata2, Lmo2, Tal1, and Mef2c, all of which have roles in both vascular and HSC development [58–61], further supporting our identification of a novel population of Sox17-expressing hemogenic ECs during embryogenesis. The expression of Cd34 and Kdr confirms the endothelial phenotype. 
Indeed, given that multiple markers currently need to be used in combination to identify early HSC populations [62], the Sox17GFPCre mice we have derived may be useful in future studies that seek to identify and isolate HSC-producing hemogenic ECs.


In conclusion, our study indicates the existence of two lineage-specific, Sox17-expressing progenitor cell populations during early mouse development. Given that Sox17 is widely used to identify endoderm-like cells during ESC-directed differentiation, our results suggest that Sox17 is not solely an endoderm-specific marker but that it identifies both ventral pancreatic MPCs and a subset of hemogenic ECs at E9.5.


We thank Drs. Stacey S. Huppert and Mervin C. Yoder for critically reviewing the manuscript and for other helpful suggestions; Susan B. Hipkens and Rama Gangula for establishing and maintaining the mouse line; Travis Clark for RNA amplification and library construction; Anna Osipovich, Judsen D. Schneider, and Anil Laxman for helpful suggestions; the Vanderbilt Transgenic Mouse/Embryonic Stem Cell Shared Resource for performing blastocyst injections; the Vanderbilt Flow Cytometry Shared Resource for assistance with FACS; and the Vanderbilt Genome Sciences Resource for performing the RNA-Seq. These studies were supported by NIH Grants DK72473 and DK89523 to M.A.M. and DK072495 to A.G.B.; CA68485 and DK58404 to the Vanderbilt Flow Cytometry Shared Resource; CA68485 and DK20593 to the Vanderbilt Transgenic Mouse/ESC Shared Resource; and CA68485 to the Vanderbilt Genome Sciences Resource.


The authors indicate no potential conflicts of interest.