The differentiation of human embryonic stem cells (hESCs) into functional hepatocytes provides a powerful in vitro model system for studying the molecular mechanisms governing liver development. Furthermore, a well-characterized renewable supply of hepatocytes differentiated from hESCs could be used for in vitro assays of drug metabolism and toxicology, screening of potential antiviral agents, and cell-based therapies to treat liver disease. In this study, we describe a protocol for the differentiation of hESCs toward hepatic cells with complex cellular morphologies. Putative hepatic cells were identified and isolated using a lentiviral vector, containing the α-fetoprotein promoter driving enhanced green fluorescent protein expression (AFP:eGFP). Whole-genome transcriptional profiling was performed on triplicate samples of AFP:eGFP+ and AFP:eGFP− cell populations using the recently released Affymetrix Exon Array ST 1.0 (Santa Clara, CA, http://www.affymetrix.com). Statistical analysis of the transcriptional profiles demonstrated that the AFP:eGFP+ population is highly enriched for genes characteristic of hepatic cells. These data provide a unique insight into the complex process of hepatocyte differentiation, point to signaling pathways that may be manipulated to more efficiently direct the differentiation of hESCs toward mature hepatocytes, and identify molecular markers that may be used for further dissection of hepatic cell differentiation from hESCs.
Disclosure of potential conflicts of interest is found at the end of this article.
Author contributions: E.C.: conception and design, provision of study material or patients, collection and/or assembly of data, data analysis and interpretation, manuscript writing, final approval of manuscript; M.E. and Y.X.: conception and design, provision of study material or patients, collection and/or assembly of data, data analysis and interpretation; A.X.: provision of study material or patients; M.K.: provision of study material or patients, collection and/or assembly of data; M.T.M.: financial support; J.S.G.: financial support, data analysis and interpretation; W.H.W.: conception and design, financial support, data analysis and interpretation; J.B.: conception and design, financial support, data analysis and interpretation, manuscript writing, final approval of manuscript.
Every year, more than 30,000 people in the United States die of acute liver failure. There are more than 17,000 individuals on the waiting list for liver transplants but only approximately 4,000 livers are donated for transplantation annually. A renewable and inexhaustible supply of hepatocytes for these purposes would be invaluable to individuals with liver disease. However, to date, culturing primary hepatocytes or differentiating them from embryonic stem cells (ESCs) has proven inefficient and difficult. Overcoming these obstacles will require a greater understanding of the molecular cues governing hepatocyte differentiation as well as a more rigorous analysis of specific hepatic cell lineages and their distinct cellular markers.
The molecular and embryological mechanisms governing fetal liver development have been examined by genetic analysis in the rodent and explant studies in the chick, frog, and fish (reviewed in [1–3]). Liver specification begins as early as gastrulation, upon the formation of the definitive endoderm. During this time, the endoderm is patterned along an anterior-posterior axis, with the most anterior endoderm fated to become foregut derivatives. During early organogenesis, the liver bud forms from the ventral foregut endoderm in response to a combination of positive inductive signals from the cardiogenic mesoderm and repressive signals from the trunk mesoderm [4, 5]. The liver bud is composed of bipotential hepatoblasts that can give rise to both hepatocytes and cholangiocytes. Many studies have sought to characterize the molecular signatures of the various hepatic cells that give rise to the mature organ and play roles in liver regeneration in response to damage. These studies have highlighted the complexity of hepatic cell lineages. Perhaps because the bipotential hepatoblasts undergo a gradual maturation during liver development, few definitive lineage markers have been identified that exclusively distinguish hepatoblasts, hepatocytes, and cholangiocytes. Expression of proposed lineage markers often overlaps among those three lineages, and within an individual lineage, expression of a particular gene may vary depending on such factors as the developmental stage examined, the spatial location within the liver, and the model system used [6–8]. Additional complications arise when characterizing the earliest events in liver specification in humans due to the reliance on primary tissue from a genetically heterogeneous population. Therefore, the development of a robust, reproducible in vitro system for dissecting the complex interplay of early human hepatic cells would be of great use.
Human embryonic stem cells (hESCs) can differentiate into a wide array of human cell types and therefore hold promise for the study of early development and cell-based therapies [9–11]. Recently, reports have demonstrated that hepatocyte markers can be induced upon differentiation of hESCs [12–15]. In these reports, the liver-specific gene expression was identified within the total, heterogeneous population of cells that differentiated from hESCs. To gain the most specific insight into the molecular changes driving the differentiation process, analysis should be performed on isolated subpopulations of cells purified with specific lineage markers.
In this report, we provide a genome-wide transcriptional profile of hepatic cells purified from a complex mixture of differentiating hESCs. Using a simple differentiation protocol, we can reproducibly generate cells expressing α-fetoprotein (AFP), a marker associated with endodermally derived tissues such as the fetal liver, and albumin, a marker of hepatocytes. Following transduction with a lentivirus vector containing the AFP promoter driving enhanced green fluorescent protein (eGFP) expression, which has recently been used to isolate fetal liver progenitor cells, we used flow cytometry to isolate the putative hepatic cells from differentiated hESCs. Subsequent transcriptional profiling of the AFP:eGFP-positive (AFP:eGFP+) population versus the AFP:eGFP-negative (AFP:eGFP−) population of cells was performed using the recently released Affymetrix Exon Array ST 1.0 (Santa Clara, CA, http://www.affymetrix.com). Compared with traditional 3′ expression arrays such as the Affymetrix Human U133 Plus 2.0 array, the increased probe coverage of the Exon Array results in more than a fourfold increase in probe density and an eightfold increase in the number of perfect match targets. As a result, the Exon Array ST 1.0 provides improved sensitivity and specificity of absence/presence calls and more accurate quantitative measurement of the level of gene expression. Statistical analysis of the transcriptional profiles demonstrated that the AFP:eGFP+ population is enriched for genes characteristic of hepatic cells. These data provide a unique insight into the process of hepatic cell differentiation. By identifying cell surface antigens, signaling pathways, and developmental regulators expressed in hESC-derived hepatic cells, this work helps establish a foundation to further dissect the molecular mechanisms driving liver specification and improve the efficiency of directing the differentiation of hESCs into hepatic cells.
Materials and Methods
Quantitative Measure of Human Albumin and AFP
Human albumin and AFP levels in the cell culture media were determined by enzyme-linked immunosorbent assay (ELISA). Albumin concentration was determined by the Human Albumin ELISA quantitation kit (Bethyl Laboratories, Inc., Montgomery, TX, http://www.bethyl.com) according to the kit protocol. AFP concentration was determined using a rabbit antihuman AFP (DAKO, Glostrup, Denmark, http://www.dako.com) as a capture antibody, and a rabbit antihuman AFP-HRP conjugate was used for detection. Blocking buffer was 3% bovine serum albumin, 2% sucrose in wash buffer (0.05% Tween-20 [Sigma-Aldrich, St. Louis, http://www.sigmaaldrich.com] in phosphate-buffered saline), and all the steps were done essentially according to the Albumin ELISA kit protocol. Human AFP calibrator was from MP Biomedicals (Solon, OH, http://www.mpbio.com).
hESC Culture and Differentiation
H9 cells were maintained on irradiated primary mouse embryonic fibroblasts in Dulbecco's modified Eagle's medium (DMEM):F12 (Invitrogen, Carlsbad, CA, http://www.invitrogen.com) supplemented with 20% Knockout Serum Replacer (Invitrogen), nonessential amino acids, β-mercaptoethanol, and 8 ng/ml fibroblast growth factor 2 (FGF2) (Peprotech, Rocky Hill, NJ, http://www.peprotech.com). Cells were passed enzymatically using 200 units/ml Collagenase IV (Invitrogen). For differentiation, colonies of hESCs were passed to ultralow adhesion dishes (Corning Inc., Corning, NY, http://www.corning.com) in DMEM supplemented with 20% fetal bovine serum (FBS) to generate embryoid bodies. Media were replaced every other day. On the eighth day of embryoid body growth, the cells were plated onto standard tissue culture dishes coated with 0.25% gelatin in DMEM supplemented with 20% FBS and 100 ng/ml acidic FGF (FGF1) (R&D Systems Inc., Minneapolis, http://www.rndsystems.com). Media were changed daily. Five days after plating, the cells were transduced overnight with lentiviral vectors in media containing 8 μg/ml polybrene. Lentiviral supernatants were prepared as previously described. Cells were fed daily for eight more days, then dissociated and sorted by flow cytometry. To dissociate the differentiated cells for fluorescence-activated cell sorting (FACS), the cells were treated for 15 minutes with 200 units/ml collagenase IV and 0.5 mg/ml dispase, followed by 10 minutes with 0.05% trypsin/EDTA. The dissociated cells were washed once in DMEM + 20% FBS and passed sequentially through 70-μm and 40-μm filters to obtain a single-cell suspension.
Undifferentiated hESC samples were harvested from hESCs grown for 4 days on a 1:20 dilution of growth factor reduced Matrigel (BD Biosciences, San Diego, http://www.bdbiosciences.com) in hESC media conditioned overnight on primary mouse embryonic fibroblasts.
Total mRNA from undifferentiated hESCs, 8-day-old embryoid bodies, and the FACS-sorted AFP:eGFP+ and AFP:eGFP− fractions was purified using the RNA Easy kit from Invitrogen. Probes for the Affymetrix Exon Array ST 1.0 were prepared and hybridized to the array using the GeneChip Whole Transcript Sense Target Labeling Assay (Affymetrix) according to the manufacturer's suggestions. Briefly, for each sample, 1 μg of total RNA was subjected to ribosomal RNA reduction. Following rRNA reduction, double-stranded cDNA was synthesized with random hexamers tagged with a T7 promoter sequence. The double-stranded cDNA was used as a template for amplification with T7 RNA polymerase to create antisense cRNA. Next, random hexamers were used to reverse transcribe the cRNA to produce single-stranded sense strand DNA. The DNA was fragmented and labeled with terminal deoxynucleotidyl transferase. Probes from triplicate samples of each condition (H9 passages 33, 36, and 40) were hybridized to the Affymetrix Exon Array ST 1.0 microarrays and scanned. Adult tissue data sets were from the Affymetrix public Human Exon 1.0 ST Array tissue panel data set (http://www.affymetrix.com/support/technical/sample_data/exon_array_data.affx).
Gene expression indexes were calculated using the GeneBASE program for Exon Array analysis (http://biogibbs.stanford.edu/∼kkapur/GeneBASE/). First, background adjustment of raw probe intensities was performed using the MAT model trained from Exon Array background probes. Probes targeting alternative exons, putative exon predictions, and low-affinity or cross-hybridizing probes were identified and excluded prior to calculating the expression indexes as described.
Quantitative Polymerase Chain Reaction
Quantitative polymerase chain reaction (qPCR) for all genes was performed in triplicate using the SYBR Green system on a Bio-Rad iCycler (Bio-Rad, Hercules, CA, http://www.bio-rad.com). Primer pairs and master mixes were purchased from SuperArray Bioscience Corp. (Frederick, MD, http://www.superarray.com). For each PCR primer set, a standard curve was generated using 5- to 10-fold dilutions of a Universal RNA template. Threshold cycle (Ct) values were calculated automatically by the qPCR machine. Relative expression levels for each gene of interest (GOI) were calculated using the standard curves. The housekeeping gene (HKG) PP1A was used as an internal control. Fold differences of genes of interest between AFP:eGFP+ and AFP:eGFP− samples were calculated according to the following equation: [(AFP:eGFP+ relative GOI)/(AFP:eGFP+ relative HKG)]/[(AFP:eGFP− relative GOI)/(AFP:eGFP− relative HKG)].
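The fold-difference calculation can be sketched in a few lines of Python. This is only an illustration, assuming a log-linear standard curve (Ct versus log10 of the template dilution) fitted by ordinary least squares; the function names and all numeric Ct values are hypothetical and are not taken from the study.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relative_quantity(ct, slope, intercept):
    """Invert the standard curve to get a relative template quantity."""
    return 10 ** ((ct - intercept) / slope)


def fold_difference(goi_pos, hkg_pos, goi_neg, hkg_neg):
    """(GOI/HKG in AFP:eGFP+) divided by (GOI/HKG in AFP:eGFP-)."""
    return (goi_pos / hkg_pos) / (goi_neg / hkg_neg)


# Hypothetical 10-fold dilution series of the universal template.
slope, intercept = fit_standard_curve([0, -1, -2, -3],
                                      [20.0, 23.3, 26.6, 29.9])

# Hypothetical Ct values; one curve is reused for GOI and HKG only for brevity.
goi_pos = relative_quantity(22.0, slope, intercept)
hkg_pos = relative_quantity(21.0, slope, intercept)
goi_neg = relative_quantity(25.0, slope, intercept)
hkg_neg = relative_quantity(21.0, slope, intercept)
print(fold_difference(goi_pos, hkg_pos, goi_neg, hkg_neg))
```

In practice each primer set would have its own standard curve, as described above; the single shared curve here is a simplification.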
Formation of Albumin and AFP-Secreting Hepatocytes from hESCs
In the mouse, FGF signaling from the cardiac mesoderm provides essential signals for specification of the ventral foregut endoderm into liver. A previous study by Lavon et al. showed that acidic FGF was able to increase the number of albumin-positive cells arising in a 30-day hESC differentiation protocol by approximately 50-fold. Using a modification of their protocol, we conducted a two-step differentiation regime, illustrated in Figure 1. First, we plated clumps of undifferentiated hESCs in suspension onto ultralow adhesion dishes in DMEM containing 20% FBS. This promoted the formation of embryoid bodies (EBs), which contain cells of the endoderm, mesoderm, and ectoderm lineages. After 8 days, the EBs were plated down onto standard tissue culture dishes coated with gelatin in DMEM containing 20% FBS supplemented with 100 ng/ml acidic FGF (FGF-1).
To examine whether this differentiation regime gave rise to hepatic cells, conditioned media were collected for 6 weeks after plating down the EBs and analyzed by ELISA for the presence of the hepatic proteins AFP and albumin. Both proteins were secreted in the media at high levels. As shown in Figure 1, peak levels of AFP were detected approximately 2 weeks after plating embryoid bodies onto gelatin-coated dishes. Albumin secretion reached the highest levels approximately 3 weeks following plating. Thus, the cells are capable of secreting liver-specific proteins.
Although the cellular population secreted high levels of liver-specific proteins, we wanted to isolate and examine the individual cells within the population that were expressing AFP and albumin. To this end, we used lentivirus reporter vectors containing the AFP promoter driving eGFP and the albumin promoter upstream of either eGFP or the RFP gene. These vectors were recently used to mark and isolate liver progenitors from primary human fetal liver tissue. Differentiating hESCs were transduced 5 days after plating the EBs.
We observed AFP promoter activity in two contexts. The most common appearance of eGFP driven by the AFP promoter occurred within densely packed multicellular layers of small cells resembling hepatoblasts (Fig. 2A, 2B). AFP promoter activity was also detected in stratified clusters of cells (Fig. 2C, 2D). These clusters of cells closely resemble the bile duct units formed by bipotential mouse embryonic liver cells grown on Matrigel. Albumin promoter activity was also detected in multiple cell types. Similar to AFP, albumin promoter activity was detected in the periphery of multicellular layers of cells and in stratified cell clusters (Fig. 2E, 2F). In addition, albumin was detected in monolayers of cells resembling hepatocytes (Fig. 2G, 2H). In conclusion, this differentiation protocol gives rise to three morphologically distinct populations of cells that express the liver markers AFP and albumin.
Isolation and Expression Profiling of hESC-Derived Hepatic Cells
Once we established the conditions for fluorescently tagging cells expressing hepatic cell markers, we used flow cytometry cell sorting to purify the AFP:eGFP+ cells from the AFP:eGFP− cells. The AFP:eGFP lentivirus construct was chosen for this study over the albumin construct for two reasons. First, we sought to examine the earliest events in hepatic cell differentiation with the hope of identifying early, hepatic stem cell populations. Since albumin was considered a marker of more mature liver cells, we hypothesized that the albumin promoter may mark a more mature cell population. Second, the albumin construct typically fluorescently labeled fewer cells than the AFP construct, perhaps due to differences in the endogenous gene expression or differences in the strength of the transgenic promoter constructs.
Twenty-one days after cells were differentiated and transduced with the AFP:eGFP lentivirus construct, cells were dissociated and sorted based on eGFP expression using flow cytometry. Cells expressing detectable eGFP by flow cytometry were considered the AFP:eGFP+ fraction, and the remaining cells were considered AFP:eGFP−. Typically the AFP:eGFP+ fraction comprised between 1% and 10% of the total cell population and was more than 85% pure following FACS. Immunohistochemistry on the AFP:eGFP+ and AFP:eGFP− populations with an antibody against the human AFP protein confirmed that the AFP protein was detectable only within the AFP:eGFP+ population (supplemental online data 1). mRNA from triplicate biological replicates of each fraction was isolated, used for probe synthesis, and hybridized to the Affymetrix Exon Array ST 1.0. Gene expression indexes were calculated as described in Xing et al. for the 400,000 background adjusted full probe sets, resulting in 18,495 gene indexes.
Genomic Data Are Validated by qPCR
To validate the accuracy of the gene indexes calculated from the Exon Array hybridization data and to assess the robustness of the statistical predictions, the expression levels in the AFP:eGFP+ and AFP:eGFP− fractions of 12 genes were verified by qPCR. The 12 genes tested encompass a broad spectrum of predicted expression levels and fold differences between the AFP:eGFP+ and AFP:eGFP− fractions. As Figure 3 shows, the predicted fold differences agreed between the Exon Array and qPCR analysis for all 12 genes examined. These data demonstrate that our gene index calculations from Exon Array data accurately represent the expression of genes with a wide range of expression levels, supporting previous work indicating that Exon Arrays provide highly sensitive and accurate quantitative measurements of the levels of gene expression.
Comparison of AFP-Positive Cells to Tissue Shows Strong Similarity to Liver
One of the primary goals of this study was to isolate human hepatic cells differentiated from hESCs. Although AFP is an early hepatocyte marker, it is also expressed in other fetal tissues including the yolk sac and kidney. Therefore we sought to test whether there was a significant enrichment of liver-specific transcripts in the transcriptome of the AFP:eGFP+ cells. To identify biological differences between the AFP:eGFP+ and AFP:eGFP− transcriptomes, we first compared the gene indexes by Gene Set Enrichment Analysis (GSEA). GSEA is thought to be an improvement over traditional gene ontology (GO) term enrichment analysis since GSEA examines the entire data set, whereas GO analysis typically requires a preselected list of differentially regulated genes using an arbitrary cutoff. GSEA with the C2 gene set (containing 522 gene sets participating in specific metabolic and signaling pathways from manually curated databases) and the C4 gene set (containing 427 gene sets defined by expression neighborhoods centered on cancer-related genes) identified 20 gene sets as significantly upregulated in the AFP:eGFP+ fraction (supplemental online data 2). When GSEA identifies enrichment of multiple gene sets, additional biological insight can be gained by “leading-edge” analysis. The leading edge subset is the core of a gene set that contributes most to the enrichment score in GSEA. Biologically important subsets of genes can be identified by examining the genes shared in leading edge subsets. From the 20 gene sets upregulated in the AFP:eGFP+ fraction, 193 genes contributed to the leading edge subsets. Sixty-nine of the genes in the leading edge subsets are shared by two or more gene sets. Examination of the expression levels of those genes in a panel of 11 human tissues revealed that 66 of the 69 genes have their highest expression in the adult liver (Fig. 4).
The intensity of the heatmap within the liver reflects the fact that these genes are much more highly expressed in the liver relative to any of the other samples and strongly suggests that the AFP:eGFP+ fraction is enriched for liver gene expression.
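The step of finding genes shared by two or more leading edge subsets can be illustrated with a short Python sketch. The gene-set names and memberships below are invented for illustration only; they are not the 20 gene sets or 193 genes reported in this study, and the helper name is hypothetical.

```python
from collections import Counter


def shared_leading_edge_genes(leading_edges, min_sets=2):
    """Return genes that occur in at least `min_sets` leading edge subsets."""
    counts = Counter(
        gene
        for genes in leading_edges.values()
        for gene in set(genes)  # count each gene once per gene set
    )
    return sorted(gene for gene, n in counts.items() if n >= min_sets)


# Toy leading edge subsets (hypothetical memberships).
toy_leading_edges = {
    "gene_set_A": ["ALB", "TF", "APOA1"],
    "gene_set_B": ["ALB", "APOA1", "TTR"],
    "gene_set_C": ["TTR", "FGB"],
}

all_genes = set().union(*toy_leading_edges.values())
shared = shared_leading_edge_genes(toy_leading_edges)
print(len(all_genes), shared)
```

Applied to the real leading edge subsets, the same counting would distinguish the genes contributing to any leading edge subset from those shared by two or more gene sets.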
A large proportion of the genes found in the fetal liver are also expressed in the yolk sac, including AFP. This may be due to their shared germ layer lineage (the gut tube is formed from the continuous sheet of embryonic endoderm lining the yolk sac), or similar biological function in early development. However, neither the GSEA gene sets nor the Exon Array data sets include human yolk sac samples. Therefore, to compare the transcriptional profile with genes expressed within the yolk sac, we examined the gene expression data in the Gene Expression Database (GXD) of the Jackson Laboratory Mouse Genome Informatics database. The GXD can be used to identify gene expression information throughout mouse development and includes details regarding the ages analyzed and assays used for determining gene expression. The data are curated from the published literature, and the assays to determine gene expression include immunohistochemistry, Western blots, Northern blots, RNA in situ analysis, RNase protection assays, and reverse transcription-PCR. A query for genes expressed in both the embryonic liver and the yolk sac identified 126 genes. In contrast, only five genes were identified as being expressed in the fetal liver of the mouse but not in the yolk sac. All five of these genes were also expressed in the AFP:eGFP+ cells (Table 1). For this analysis, a gene was considered expressed in the AFP:eGFP+ fraction if the average gene index from the three biological replicates was greater than 100. Furthermore, the GXD database query identified six genes that were expressed in the yolk sac but not expressed in the fetal liver. Only one of these genes, TEK (tyrosine kinase, endothelial), was expressed within the AFP:eGFP+ cells. Thus the AFP:eGFP+ cells showed liver-specific gene expression in 10 of the 11 genes differentially expressed between the mouse embryonic liver and yolk sac.
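The expression call and the resulting tally can be sketched as follows. Only the threshold (mean gene index of three replicates greater than 100) and the 5-versus-6 gene split come from the text; the gene labels and gene index values are placeholders invented for this illustration.

```python
from statistics import mean

EXPRESSION_THRESHOLD = 100  # gene-index cutoff stated in the text


def is_expressed(replicate_indexes, threshold=EXPRESSION_THRESHOLD):
    """Call a gene expressed if its mean replicate gene index exceeds the cutoff."""
    return mean(replicate_indexes) > threshold


def liver_like_calls(liver_specific, yolk_specific):
    """Count liver-specific genes that are expressed plus yolk sac-specific
    genes that are not expressed; return (concordant, total)."""
    liver_hits = sum(is_expressed(v) for v in liver_specific.values())
    yolk_absent = sum(not is_expressed(v) for v in yolk_specific.values())
    return liver_hits + yolk_absent, len(liver_specific) + len(yolk_specific)


# Placeholder gene indexes (three replicates each); values are invented.
liver_specific = {
    "liver_gene_1": [250, 310, 280],
    "liver_gene_2": [120, 140, 130],
    "liver_gene_3": [900, 850, 1010],
    "liver_gene_4": [105, 150, 160],
    "liver_gene_5": [400, 380, 420],
}
yolk_specific = {
    "yolk_gene_1": [20, 35, 30],
    "yolk_gene_2": [60, 55, 70],
    "yolk_gene_3": [15, 10, 12],
    "yolk_gene_4": [180, 220, 200],  # the single expressed yolk sac gene
    "yolk_gene_5": [90, 80, 95],
    "yolk_gene_6": [5, 8, 6],
}

print(liver_like_calls(liver_specific, yolk_specific))
```

With five of five liver-specific genes called expressed and five of six yolk sac-specific genes called not expressed, the tally reproduces the 10-of-11 pattern described above.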
A similar analysis was performed using Unigene's EST Profile Viewer (http://www.ncbi.nlm.nih.gov/sites/entrez). Expressed sequence tags for the four liver-specific genes found in the mouse analysis (Table 1; GRB2, HELLS, LGALS3, SLC20a2) demonstrated expression within the adult human liver. Furthermore, the yolk sac-specific genes (CITED1, ERG, MKX, and TEK) are absent from adult human liver (HAMP is present in adult liver, and there are no data for S100g).
Table 1. Comparison of fetal liver- and yolk sac-specific genes
Finally, we compared our data with the gene expression analysis of definitive and visceral endoderm described by Sherwood et al. In their work, they detected a large number of transcription factors with >3-fold enrichment in visceral endoderm compared with definitive endoderm isolated from the embryonic day 8.25 mouse. Examination of 30 of the visceral endoderm enriched transcription factors that Sherwood et al. identified indicated that in our array data, 22 of those genes are expressed in the human liver, highlighting once again the similarity between those tissues. Of the eight visceral endoderm enriched genes identified by Sherwood et al. that we found were not expressed in the adult liver, three genes (Hoxb8, Nfatc1, Twist1) were present in both AFP:eGFP+ and AFP:eGFP− samples, but not enriched in either. The remaining five visceral endoderm enriched transcription factors (Cited1, Vdr, Lhx1, Sox7, Tfec) were absent from the AFP:eGFP+ cells. Thus overall this analysis supports the hypothesis that the AFP:eGFP+ cells are hepatic, not yolk sac, in origin.
Genes Enriched in AFP:eGFP+ Cells Compared with AFP:eGFP− Cells
To gain deeper insight into the genetic differences between the AFP:eGFP+ and AFP:eGFP− cells, we performed a comparison of gene expression in the AFP:eGFP+ versus the AFP:eGFP− fractions using the program Significance Analysis of Microarrays (SAM). Analysis was performed both with and without a log transformation of the unfiltered gene expression indexes. As expected, the log-transformed analysis identified more genes with a low expression but high fold difference between the two samples, and the analysis using the raw gene indexes tended to identify genes with high expression values and lower fold difference between the two samples. The log-transformed analysis resulted in the identification of 472 genes whose expression was enriched in the AFP:eGFP+ fraction (supplemental online data 3). Two hundred nine genes were identified without the log transformation of the gene indexes. Seventy-two genes were identified by both methods, resulting in a total of 609 genes identified by SAM as being enriched in the AFP:eGFP+ fraction with a false discovery rate of 0.14.
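The total of 609 genes follows from combining the two SAM gene lists and counting the overlap only once. A minimal Python check of that arithmetic, using only the counts reported above (the function name is hypothetical):

```python
def combined_gene_count(n_log, n_raw, n_both):
    """Size of the union of two gene lists by inclusion-exclusion."""
    return n_log + n_raw - n_both


# 472 genes (log-transformed), 209 genes (raw indexes), 72 found by both.
total_enriched = combined_gene_count(472, 209, 72)
print(total_enriched)  # 609
```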
Unsupervised hierarchical clustering of the 609 genes enriched within the AFP:eGFP+ cells was performed with the dChip program using the expression indexes of undifferentiated hESCs, human ESC-derived embryoid bodies, the AFP:eGFP+ and AFP:eGFP− cells, and a panel of 11 human tissues (supplemental online data 4). The largest cluster contains 163 genes whose expression is highest in the adult liver, further supporting the hypothesis that the AFP:eGFP+ cells are hepatic. This cluster includes genes such as albumin, transferrin, thrombin, transthyretin, vitronectin, hepatic lipase, fibrinogen-alpha, -beta, and -gamma, and eight members of the apolipoprotein family. In addition, three cell surface receptors for the hepatitis C virus, claudin 1, CD81 [28, 29], and LDLR [30, 31], were found to be highly expressed in the AFP:eGFP+ cells, raising the possibility that these cells may be used for the in vitro study of hepatitis C infection.
Manual inspection of the 609 genes determined by SAM as enriched in the AFP:eGFP+ fractions identified numerous transcription factors and signaling molecules known to play a significant role in liver development (supplemental online data 5). Among the transcription factors, FOXA1, FOXA2, FOXA3, HNF4A, TCF2, PROX1, and CEBPA were all enriched in the AFP:eGFP+ cells, all of which are known to play important roles in liver development. Several genes involved in Wnt/β-catenin signaling were found to be enriched in the AFP:eGFP+ cells including FZD5, DACT2, and DKK3, consistent with the recent studies linking Wnt signaling with liver development. Other signaling molecules implicated in endoderm and liver development that were enriched in the AFP:eGFP+ cells include BMP2, FGFR4, KITL, and HABP2. Genes involved in signaling that have not previously been associated with liver development include CER1, COBL, and GMCL1. Finally, although the HGF receptor MET was enriched in the AFP:eGFP+ cells, HGF was not. Instead HGF was expressed at a higher level in the AFP:eGFP− cells, suggesting it plays an endocrine or paracrine role in this system.
Although examination of different tissue types clearly yields a gene signature consistent with liver, we were interested whether the AFP:eGFP reporter enriched for a specific hepatic cell lineage. We conducted a review of the literature and compiled a list of genes used to distinguish between hepatic stem cells, hepatoblasts, cholangiocytes, and mature hepatocytes. Although such a review is confounded by the fact that many of the reports vary in the embryonic stages examined, the methods of analysis, and experimental systems used, a small number of markers appeared useful for distinguishing the different lineages. A summary of our analysis is shown in Table 2. For this analysis, a gene was considered expressed in the AFP:eGFP+ fraction (denoted “+” in Table 2) if the average gene index from the three biological replicates was greater than 100. Genes that were identified by SAM as enriched in the AFP:eGFP+ fraction compared with the AFP:eGFP− fraction are listed as “enriched.”
Table 2. Hepatic cell lineage markers
The recently identified hepatic stem cells are distinguished from hepatoblasts by their expression of NCAM and claudin 3, with an absence of expression of AFP. Mature hepatocytes can be distinguished from hepatic stem cells and hepatoblasts by the expression of dipeptidyl peptidase 4, CYP3A4, and Serpin A1. Cholangiocytes are distinguished from the hepatic stem cells, hepatoblasts, and hepatocytes by their expression of cytokeratin 7 (also known as ck7 or KRT7) and their high expression of cytokeratin 19 (also known as ck19 or KRT19). As shown in Table 2, the AFP:eGFP+ cells were enriched for expression of the hepatoblast marker AFP, the mature hepatocyte marker dipeptidyl peptidase 4, and the cholangiocyte marker KRT7. The AFP:eGFP+ cells also expressed the hepatic stem cell markers NCAM and claudin 3. Therefore, following our differentiation protocol, purified AFP:eGFP+ cells express genes found in all three early hepatic cell types.
The first goal of this study was to determine whether the AFP promoter driving eGFP expression could be used to mark and isolate hepatic cells. Our data indicate that the reporter marks cells of hepatic origin as opposed to other tissues with AFP expression such as the yolk sac. Therefore, this system may be used as a genetically defined, reproducible in vitro system for dissecting the complex interplay of early human hepatic cells.
The second goal of this study was to examine the transcriptional profile of isolated AFP:eGFP+ cells in hopes that those data would shed light on the complex genetic regulation underlying liver development and identify additional cell markers that could be used to further define the cell lineages arising during hepatic cell differentiation. Using the recently released Affymetrix Exon Array ST 1.0, which produces improved gene-level expression measurements compared with standard 3′ arrays, we have generated the first detailed whole genome transcriptional profile of a purified population of hepatic cells differentiated from hESCs. Comparison of the transcriptional profiles from the AFP:eGFP+ cells and AFP:eGFP− cells reveals several cell surface markers that may be used for lineage analysis such as TACSTD1 (also known as EpCAM), FGFR4, and HAVCR1. In addition, some previously identified hepatic cell surface markers such as KRT19 and FGFR1 were determined not to be enriched in the AFP:eGFP+ cells compared with the AFP:eGFP− cells, indicating that alone they would not be useful for purifying hepatic cells, but may be used in combination with other markers to identify complex subpopulations of cells.
Given that three unique morphologies were observed in the AFP:eGFP+ cells (Fig. 2) following this differentiation protocol, and the gene expression signature containing genes expressed in hepatic stem cells, hepatoblasts, cholangiocytes, and mature hepatocytes, we propose that following 21 days of differentiation of hESCs, the AFP:eGFP reporter enriches for these hepatic cell lineages. Therefore, we conclude that this system may be used for future studies aimed at dissecting those lineages and analyzing the genetic programs regulating the differentiation of the bipotential hepatoblast into either cholangiocytes or hepatocytes.
Recent reports have described methods for differentiating hESCs into hepatic cells directly from monolayers of hESCs [14, 15]. Preliminary work in our lab indicates that monolayers of binucleated cells resembling hepatocytes can be reproducibly generated by treating monolayers of hESCs with Activin A in low serum followed by acidic FGF. However, under these conditions, the varied morphologies seen in Figure 2 are never observed. In contrast, the embryoid body/acidic FGF differentiation protocol described here results in multilayered cells with a wide spectrum of cellular morphologies, including spontaneously contracting cardiomyocytes. Future studies will be needed to determine whether the hepatic cells arising within the complex cellular environment generated during the differentiation scheme described in this paper are functionally or developmentally different from those differentiating from the monolayer differentiation protocols.
One of the benefits of whole genome transcriptional profiling is the ability to associate genes with new biological processes based on the correlation of their expression with the expression of genes previously shown to play a role in specific biological processes. To illustrate this, Figure 5 shows a heatmap of 10 genes enriched in the AFP:eGFP+ cells that previously were not associated with hepatic cell differentiation along with 20 other genes with well-established roles in liver development. Using the expression data from undifferentiated hESCs, embryoid bodies, AFP:eGFP+ cells, and adult liver, we can see how gene expression changes during the differentiation process and formulate hypotheses regarding when the novel genes may play a significant role. Genes such as TDGF1, SOCS2, CER1, GRHL2, and DACT2 with their highest expression in undifferentiated hESCs may play a role in maintaining the developmental potential or plasticity of the early hepatic cells. The transcription factors SOX17 and HNF1B, which are expressed most highly in the embryoid bodies that contain only early endoderm, play a role primarily in the initial specification of that germ layer. Genes expressed most highly in the AFP:eGFP+ cells such as CDX2, KITG, and DKK3 are predicted to be most important in the fetal hepatic cell specification. Furthermore, the cell surface markers HAVCR1, CDH17, and KRT7 would be potential candidates to help further purify and characterize the developing hepatic cell types. Finally, genes whose expression is highest in the adult liver such as NRG1, RA12, NTN4, KLF4, GMCL1, MET, PROX1, CEBPA, and HNF4A would be predicted to play significant roles in the adult liver. Although the published literature supports the proposed roles for many of the genes described above, others such as TDGF1, CER1, GRHL2, DACT2, DKK3, NRG1, RA12, NTN4, KLF4, and GMCL1 have not previously been identified as playing a role in hepatic development and therefore warrant further investigation.
Disclosure of Potential Conflicts of Interest
The authors indicate no potential conflicts of interest.
We thank the Baker and Wong laboratories for valuable discussions and comments throughout the course of this work. This work was supported by the generosity of the Donald E. and Delia B. Baxter Foundation (J.B.), the National Institutes of Health (HD41557; J.B. and HG003903; W.H.W.), and the California Institute of Regenerative Medicine (CIRM). E.C. was funded by National Institutes of Health Training Grant HG000044. M.T.W.'s current address is StemCells Inc., Palo Alto, CA.