Author contributions: J.B.: conception and design, manuscript writing; J.C.I.B.: conception and design, financial support, manuscript writing.
Disclosure of potential conflicts of interest is found at the end of this article.
First published online in STEM CELLS EXPRESS December 23, 2011; available online without subscription through the open access option.
The state of a cell is defined by the genes it transcribes and the epigenetic landscape that regulates their expression. Pluripotent cells have markedly different epigenetic signatures when compared with differentiated cells. Permissive chromatin, high occurrence of bivalent domains, and low levels of heterochromatin allow pluripotent cells to react to distinctive stimuli and undergo changes of cell state by differentiating into various tissues. Differentiated cells can be reprogrammed by a set of transcription factors to induced pluripotent stem cells (iPSC) that convert their transcriptional and epigenetic state to pluripotency and thus closely resemble embryonic stem cells (ESC). However, questions remain on whether the epigenetic reprogramming is complete or whether there are recurring iPSC-specific aberrations that impede their full pluripotency potential. For this reason, iPSC need to be closely compared with ESC, which are used as the gold standard for in vitro pluripotency. Transcribed genes, epigenetic landscape, differentiation potential, and mutational load show small but distinctive dissimilarities between these two cell types. STEM CELLS 2012;30:33–41
In 2006, Takahashi and Yamanaka opened a completely new avenue in stem cell research by showing that the forced expression of only four transcription factors (Oct4, Sox2, Klf4, and c-Myc) was sufficient to convert fibroblasts into embryonic stem cell (ESC)-like cells, which were named induced pluripotent stem cells (iPSC). Many subsequent articles have confirmed that the timed expression of master regulator factors can change differentiated cells into iPSC, a process called reprogramming. By now, a variety of starting cell types, different combinations of main transcription factors, and different techniques for delivering these factors into the cells have been used successfully for this purpose.
Reprogramming starts with the binding of a few master pluripotency transcription factors to regulatory elements of many genes, thereby affecting their expression. The epigenetic landscape of somatic cells is refractory to the total control of these transcription factors, but their prolonged expression and positive-feedback regulatory loops slowly modify the epigenetic landscape and establish new pluripotency circuits, changing the cell state. When looking at the total population of cells expressing reprogramming factors in a limited time frame, the reprogramming process is highly inefficient. It depends on poorly understood stochastic events in the cells and requires cell division [3, 4]. The final result is an iPSC colony with newly restructured epigenetic marks driving expression of the endogenous transcription factors and chromatin regulators that further sustain and balance the achieved pluripotent state. However, questions remain as to whether this drastic cell change leaves epigenetic “scars.” A closer inspection of the epigenetic state, stability, mutational load, and full developmental potential of iPSC is still in progress.
Pluripotent stem cells allow the study of embryonic development and cell differentiation and offer much hope for regenerative medicine. Besides once again proving cell plasticity, iPSC have drawn special attention from both the public and the scientific community because they avoid the handling of embryonic material and can be tailored to the patient. But how close are they really to the gold standard ESC and, relatedly, how “safe” are they for future clinical use? Avoiding the integration of reprogramming factors into the genome during iPSC generation has already been achieved using different methodologies [5–9]. This is a first important step toward safer cells. The goal of this review is to provide an up-to-date overview of the epigenomic, transcriptional, and genomic states of iPSC, together with their differentiation potential, by comparing iPSC with ESC and with the somatic cells from which they are derived.
SETTING THE STANDARDS FOR iPSC PLURIPOTENCY
Thoroughly erasing the differentiation-specific epigenetic marks in iPSC and returning to the ESC pluripotency “ground state” is believed to give the highest chance of successful subsequent differentiation. This is corroborated by the fact that iPSC are generally less successful than ESC in generating high-percentage chimeras and even less efficient in generating live mice by tetraploid complementation.
During reprogramming, a number of colonies often appear, including ones that are highly proliferative but not pluripotent. The first selection for a “good quality” iPSC colony is usually based on morphological criteria. The morphological appearance, proliferation rate, the reactivation of endogenous pluripotency genes followed by silencing of the transgenes used for reprogramming, and the ability to form teratomas are some of the basic criteria a cell line has to meet to be considered true iPSC. Furthermore, when injected into blastocysts, the cells should contribute to the embryonic tissues, including the germ line. Ultimately, the ability of iPSC to form a whole animal via tetraploid complementation is a clear indication of iPSC pluripotency and of a state nearly identical to that of ESC [12, 13] (Fig. 1). The problem is that such rigorous pluripotency tests are difficult to perform routinely on many lines.
In humans, the most rigorous tests for pluripotency cannot be performed for obvious ethical reasons. This lowers the standard for pluripotency and increases the heterogeneity of the iPSC lines obtained. Even differences in culture conditions between laboratories can contribute to the heterogeneity of the lines [14, 15]. One example is X chromosome inactivation in human iPSC lines of female origin. There are reports of “ground state” lines, in which both X chromosomes are again active [16, 17], while others show persistence of X chromosome inactivation. There is, however, a general ambiguity associated with the human pluripotent lines isolated so far. Namely, human ESC/iPSC share several important features with mouse stem cells isolated from postimplantation embryo epiblasts, called epiblast embryonic stem cells (EpiESC). Epiblast stem cells represent the next stage in development and therefore have a more limited developmental potential. They show poor success in generating chimeras and can express early lineage commitment markers [20, 21]. Thus, the similarity of mouse EpiESC to human ESC/iPSC, together with the limited pluripotency tests available for human lines, raises questions about whether those cells are capable of producing whole embryos and about their general level of pluripotency. The possibility of directly converting human pluripotent cells into mouse-like ESC is tempting. So far, it has been achieved by constitutive expression of transcription factors, producing either metastable cells without proper epigenetic activation of major pluripotency regulators, or cells stable for only several passages. Possibly, the optimal isolation and culture conditions required for human ESC culture have not yet been met. Alternatively, the observed differences between human and mouse ESC may simply reflect intrinsic species differences.
TRANSCRIPTIONAL COMPARISON OF ESC AND iPSC
The transcription profiles of good quality iPSC and ESC are nearly identical. Chin et al. showed that a small group of genes is consistently differentially expressed between several iPSC and ESC lines. Even though those genes could not be assigned by gene ontology analysis to a single functional group, they could point to iPSC as being a distinct subtype of pluripotent cells. In contrast to this finding, two other groups compared iPSC lines using slightly different statistical algorithms and found that some differences between iPSC and ESC expression profiles do exist, but that they are not consistent across all lines and rather reflect different laboratory culture conditions [14, 15]. Also, profiling focused only on miRNA expression does not segregate iPSC from ESC. Therefore, it seems that iPSC do not form a new class of pluripotent stem cells distinct from ESC in their gene expression signature. Or, if they do, the difference cannot be pinpointed by transcriptome analysis because of the high noise in existing gene expression data and the possible heterogeneity in the quality of the tested iPSC lines.
However, when one looks at individual reprogramming experiments instead of focusing on all differentially expressed genes between multiple iPSC lines and ESC, a statistically significant and logical pattern of differences can be seen. Namely, the common features of the deviant transcription come from iPSC (a) not efficiently silencing the expression pattern of the somatic cell from which they are derived and (b) failing to induce some ESC-specific genes to the level of expression seen in ESC, akin to epigenetic memory [24, 27, 28] (Fig. 2).
By using ESC and iPSC with an identical genetic background and reprogramming factors integrated into the same genetic locus, it is possible to minimize the “noise” from genetics and reprogramming methodology and to concentrate exclusively on the intrinsic differences between the two pluripotent cell lines. Surprisingly, mouse iPSC and ESC obtained in this way have only two differentially expressed transcripts: the non-coding RNA Gtl2 and the small nucleolar RNA Rian. Both localize to the imprinted Dlk1-Dio3 gene cluster on mouse chromosome 12 and are maternally expressed. Aberrant regulation of this cluster is implicated in impaired murine development. Epigenetically, the locus is fully methylated in many iPSC lines, while some lines have only one allele silenced, as is the case in ESC. Functionally, it seems that iPSC with the Dlk1-Dio3 locus fully silenced cannot generate animals by tetraploid complementation, and their chimerism after blastocyst injection is also significantly lower than that of ESC [10, 30].
In human iPSC, the Dlk1-Dio3 locus is not silenced, suggesting a different iPSC state or reprogramming process. It would be interesting and useful to find a similar marker in human cells. The search for such a marker is complicated by the possibility that epigenetic memory or aberrations during reprogramming may affect genes that are not expressed in the pluripotent state but whose expression would be relevant during differentiation.
Hence, reprogramming experiments have produced a wide range of iPSC lines of differing quality. Clearly, the correlation between Dlk1-Dio3 imprinting and a high degree of pluripotency needs more research. If confirmed, the strong advantage of the Dlk1-Dio3 test lies in the fact that, in place of a diverse panel of pluripotency tests, this one is rather simple and technically manageable in most laboratories. Thus, even though reprogramming seems to be stochastic, there are some defined milestone steps that need to be taken sequentially, and directly analyzing the final step(s) allows for a simpler and more focused analysis to select true iPSC (Fig. 1).
DIFFERENTIATION POTENTIAL OF iPSC
Another way to test the pluripotency of iPSC is by controlled in vitro differentiation. This is particularly relevant in the case of human iPSC, whose contribution to embryo formation cannot be tested. Despite the recent report of a potential immunological response to iPSC in mice, directed differentiation with relatively high efficiency and production of functionally adequate cells are the crucial preliminary steps necessary for their future clinical use. There is a plethora of articles describing the potential of iPSC to differentiate into a particular cell type, including cardiomyocytes, neurons, hematopoietic progenitors, endothelia, osteoclasts, hepatocyte-like cells, islet-like cells, and retina. iPSC have passed these tests of differentiation and again defended their pluripotency status. But are they equivalent to ESC?
The direct comparison of the differentiation potential of various cell lines can be difficult. As different laboratories use different culture conditions and/or differentiation protocols, lines can be compared only within the same study. If there is significant variation from experiment to experiment, the best comparisons are made with all the lines differentiated in parallel. This poses a problem when working with a large number of lines. Finally, evaluating the outcome of differentiation can be approached in different ways. One way is to score the efficiency of the differentiation, that is, the quantity of cells obtained that carry a particular differentiation marker. Another important parameter is the quality or identity of the final differentiated cell. This requires detailed tests for as many cell-specific functions as possible. One example is the differentiation into neuronal cells, where the full characterization of the neurons obtained is still poorly addressed. In the end, it needs to be stressed that optimized differentiation protocols are still lacking overall. Although more than a decade has passed since human ESC (hESC) were first established, there are few reproducible protocols that give functionally transplantable cells and that could be used as standards to compare the differentiation potential of pluripotent lines.
However, using available protocols, a side-by-side comparison of iPSC with their ESC counterparts shows certain variations (Table 1). iPSC show either equal performance to ESC or, in some cases, inferior performance, especially when comparing the efficiency of their conversion into differentiated cells. Surprisingly, the degree of characterization of the iPSC (that is, the measures taken to work with “good quality” iPSC or, although here the data are much scarcer, with transgene-free cells) does not seem to correlate with the differentiation efficiency or the quality of the final cells. The occasional low performance of iPSC can perhaps be explained by the fact that the differentiation protocols were mainly established with ESC. Additionally, epigenetic memory and aberrations might make some iPSC more refractory to external differentiation signals. It also has to be taken into account that adapting cells to in vitro culture can itself elicit certain aberrations in the cell state. The in vitro derived ESC used as a pluripotency standard are thus somewhat artificial and also show significant variation in differentiation potential among themselves [53, 54].
Table 1. In vitro differentiation potential of various human and mouse induced pluripotent stem cell lines compared with the embryonic stem cell as a standard
The degree of differentiation deviance of some iPSC underscores the need for robust and relatively simple tests to screen iPSC. Recently, such an attempt has been made by comparing the DNA methylation, transcriptome, and spontaneous in vitro differentiation potential of a panel of human ESC and iPSC. By doing so, the authors developed so-called “scorecards” against which any pluripotent cell line can be checked to measure its potential to differentiate toward a particular lineage. Tests like this can, in a reasonable experimental setting, select among various pluripotent lines the one most receptive to a particular use.
EPIGENETIC COMPARISON BETWEEN iPSC AND ESC
Detailed insight into epigenetic differences between iPSC and ESC was made possible by the development of high-throughput sequencing technologies and by the generation of single-nucleotide genome-wide maps of DNA methylation.
The DNA methylation pattern is very similar between iPSC and ESC when compared with nonpluripotent lines, such as fibroblasts. However, hierarchical clustering performed on the methylation level of approximately 66,000 CpG sites, besides clearly separating fibroblasts from ESC/iPSC, also distinguishes iPSC from ESC. One analysis on the whole-genome scale found 71 differentially methylated regions (DMR) between three iPSC lines and three ESC lines (and 2,179 between fibroblasts and iPSC). Almost half of the DMR show incomplete epigenetic reprogramming of the differentiated cell-of-origin genome, which is in agreement with the gene expression data and with epigenetic memory. However, not all the DMR belong to the cell-of-origin memory, indicating that iPSC also accumulate novel aberrant epigenetic states [57, 59].
Compared with the ESC standard, both hypermethylated and hypomethylated CpG sites are found in iPSC, but the balance is tipped toward hypomethylated CpGs. This indicates that the cause is inefficient methylation (or instruction of methylation) during reprogramming rather than the absence of an appropriate DNA demethylase (for example, an oocyte-enriched one). These CpG methylation aberrations are not transient, because they are observed in high passage number iPSC and are transmitted with high frequency through differentiation to trophoblasts.
During reprogramming, iPSC regain non-CpG methylation, which is specific for pluripotent ESC. Here, too, several regions with aberrant methylation can be found (again mainly in the form of absence of the methylation mark when compared with ESC). Curiously, the non-CpG aberrations are rather large, around 1 Mb, and are proximal to centromeres and telomeres.
Regarding histone methylation, a few reports found no significant genome-wide differences between iPSC and ESC lines [14, 24, 61]. Using ChIP-on-chip, Chin et al. analyzed H3K27me3 and H3K4me3 around the promoters (-5.5 to +2.5 kb from the transcription start site) of 17,000 genes. Guenther et al. did a more comprehensive ChIP-Seq analysis covering the whole genome of six iPSC and six ESC lines and did not observe significant variation that would discriminate ESC from iPSC. Both types of cell lines showed characteristic pluripotent epigenetic landscapes with decreased H3K27me3, H3K4me3 enriched at promoters of actively transcribed genes, and both marks on many bivalent domains. The variation in gene occupancy of those histone marks between iPSC and ESC lines was the same as the variation within each cell group.
Another comparative analysis of histone H3 lysine methylation was done by ChIP-Seq, with an improved computational method that detects not only peaks but also long stretches of these marks. This analysis highlighted the difference between human fibroblasts and pluripotent cells by showing that fibroblasts have more repressive chromatin due to the large expansion of the H3K27me3 and H3K9me3 repressive marks. The study also demonstrated further differences between human iPSC and ESC. Namely, even though iPSC are more similar to ESC, they have longer H3K27me3 and H3K9me3 repressive domains, similar to fibroblasts. In addition, differentially marked genomic regions are mainly in nongenic areas (88% for H3K9me3 and 79% for H3K27me3), raising the question of the functional significance of the main body of those epigenetic marks. The H3K9me3 modification is less represented throughout the genome, but it shows more differences between iPSC and ESC, mostly being present in “unique” regions in iPSC, that is, regions not characteristic of ESC or fibroblasts. The same group also associated the differences found in the gene expression analysis by Chin et al. with the H3K9me3 mark rather than with the H3K27me3 mark.
EPIGENETIC MEMORY OF iPSC
By now, iPSC have been derived from many different somatic cell types, including fibroblasts, keratinocytes, B-lymphocytes, stomach cells, and hepatocytes [5, 63–65]. Are there differences between lines based on their cell of origin? The analysis of gene expression profiles of various human iPSC lines supports this hypothesis by showing significant and persistent donor-cell gene expression in iPSC [24, 27, 28]. So far, all experiments point to three major characteristics that distinguish iPSC from ESC. One is the inefficient silencing of somatic genes in cells undergoing reprogramming, another is the weak activation of ESC-specific pluripotency genes, and the third is the presence of nonspecific aberrations distinct from both the cell of origin and ESC (Fig. 2). The first two led to the belief that an epigenetic memory is present in iPSC.
Furthermore, mouse iPSC derived from distinct tissues showed marked differences in the frequency of teratoma formation when differentiated into neurospheres and transplanted into the brain. An explanation for this observation is not yet clear, but it could lie in aberrant epigenetic memories of iPSC that reflect different epigenetic states of the donor cells.
Recently, two articles analyzed the epigenetic memory of mouse iPSC in more detail [58, 67]. iPSC derived from the same cell of origin can be clustered together on the basis of their gene expression and DNA methylation. More interestingly, the donor cell gene expression has functional significance: iPSC differentiate back into their cell of origin more readily than into an unrelated lineage [58, 67]. The epigenetic basis of this memory is linked either to DNA methylation or to histone modifications.
Importantly, all the analyses in the two articles were performed on iPSC with low passage numbers (p4–p6). It seems that reprogramming takes longer than previously thought and continues for several passages even after the appearance of ESC morphological features and the expression of pluripotency markers. Possibly, it occurs through a more passive, cell division-dependent resetting of the epigenetic cell state. This is corroborated by several observations. Higher passage number iPSC (p16 in mice) lose the differences in gene expression and can no longer be clustered together by their cells of origin [24, 67]. In addition, the preferential differentiation of iPSC toward their tissue of origin is lost after longer passaging, after chemical treatment that influences the epigenetic machinery (5-azacytidine, Trichostatin A), or after sequential differentiation-reprogramming cycles in the desired direction.
One important point arising from those two articles is the possibility of using epigenetic memory to obtain cells for which differentiation protocols are not yet optimized, as is the case for blood differentiation in humans. However, care must be taken, because ESC still differentiated more efficiently to blood precursors than early passage blood-derived iPSC in mice. Thus, at the moment, the reported epigenetic memory represents a disadvantage for differentiation into lineages other than the lineage of origin rather than a true advantage. Early-passage iPSC may not have acquired ESC-like responsiveness to differentiation cues. Nevertheless, it remains to be shown whether partially reprogrammed iPSC could be stabilized in a state that retains an epigenetic memory of origin and, additionally, whether they harbor a certain plasticity that, in combination with this memory, will give an advantage in differentiation capabilities compared with ESC. Supporting this idea, a recent article reported epigenetic memory in human iPSC lines derived from retinal pigmented epithelial cells. Several iPSC lines differentiated back into their cell of origin with 5- to 10-fold higher efficiencies compared with ESC. Early-passage iPSC were not required to observe this memory.
MUTATIONAL LOAD OF iPSC
Besides epigenetic aberrations, iPSC are also reported to bear genomic mutations [69–72]. These could arise from the reprogramming itself and from the subsequent in vitro expansion of the cells. So far, iPSC reprogramming is a very inefficient process in which, in the end, only a few cells get reprogrammed. Because mutations could confer certain advantages for the change of cell fate, reprogramming itself may act as a strong mutagenic factor. Subsequent proliferation and adaptation to in vitro culture conditions is another important cause of mutations, although one common to other cell lines too, including ESC, in which gross mutations have already been noted [73–75].
Observed iPSC mutations range from chromosomal aneuploidy and subchromosomal deletions or duplications to single base mutations. Among the iPSC lines analyzed, as many as 20% had gross chromosomal aberrations, including complete trisomies (9% of total). Another study focused on copy number variation (CNV) (approximately 0.6–12,000 kb stretches of genomic DNA) in a large number of pluripotent and somatic samples. iPSC had on average 17 CNV per sample. By comparison, ESC also had 17 CNV and nonpluripotent samples had 12. Focusing only on the exome, an iPSC line has on average about six mutations, most of which are predicted to alter protein function. Surprisingly, in all the studies focusing on genetic aberrations so far, there was no correlation between the extent of genetic aberrations and the reprogramming method (combination of transcription factors used, inclusion of the Myc oncogene, genomic integration vs. nonintegrative methods) or the iPSC propagation method (mechanical or by trypsin).
Importantly, some of the aforementioned mutations were shown to be already present in a (small) fraction of the somatic cells from which the iPSC were derived [49, 70, 72]. Another interesting study showed that early passage iPSC bear a significant number of CNV, which decreases during subsequent passaging of intermediate length and finally descends to the average number of CNV found in ESC lines or fibroblasts. The elimination of CNV from the iPSC population is possible because many are present in a mosaic fashion (i.e., only a certain part of the cell population has the mutation). Three conclusions can be drawn from that observation. First, the low efficiency and long duration of reprogramming increase the mutational load, which could help in breaching some roadblocks along the way to pluripotency. Second, some of these changes seem to be deleterious for the final homeostatic state of the cell and are therefore eliminated from the population. Finally, the observation stresses the importance of cell-cell communication and signaling during reprogramming. As the mutations are mosaic in the iPSC, they bring advantages to the whole population of cells, including the mutation-free sister cells that reach the pluripotent state together with the mutation-bearing ones.
Categorizing the genes affected by mutations in iPSC highlights, if anything, genes implicated in cell cycle regulation and tumors [46, 70]. Although the mutations are spread rather widely across the iPSC genome, some of the more frequent mutations and chromosomal aberrations are also common in late passage human ESC. In particular, chromosomes 12 and 20 and parts of chromosome 1 are affected in both cell types, as are isolated genomic regions close to DNMT3B, NANOG, and GDF3. Overall, even though iPSC obviously need to pass through one additional selection process, reprogramming, iPSC and ESC are not drastically different in terms of mutational load. In addition, there is some correlation between the mutations found in late passage human ESC, iPSC, and cancer cells [75–77]. Thus, pluripotent cells cultured in vitro are by definition proliferating and can acquire specific aberrations that confer a growth advantage, similar to tumor progression, and that eventually take over the population. Currently, much research effort is being put into better understanding reprogramming processes and increasing their efficiency, which could further diminish genomic instability in iPSC. For example, as many as 10% of fibroblasts get reprogrammed by the miRNA302/367 cluster. It will be interesting to see whether these iPSC accumulate fewer mutations than conventional iPSC (for which around 0.1% of the starting cell population is reprogrammed). In addition, the mutational and epigenetic aberrations have to be profiled to distinguish those that pose a clinical risk from those that are actually harmless.
iPSC must have convinced even the most skeptical minds of their developmental potential and pluripotency when tetraploid complementation resulted in viable adult mice [12, 13]. It is a definite proof of principle that four transcription factors are able to modify a differentiated cell all the way to the pluripotent state of ESC. The current problem with iPSC lies in their low efficiency of derivation and the heterogeneity of the obtained colonies.
Not all mouse iPSC lines are able to successfully complement 4n blastocysts. In the reprogramming process, only a fraction of the colonies that appear are considered “good quality” iPSC. The first important step is therefore to select only the iPSC that have reached this fully reprogrammed state. As they are morphologically and transcriptionally very similar to the lesser quality iPSC, a detailed analysis is currently required to select the good ones. Alternatively, a good marker is needed. One such marker might be the Dlk1-Dio3 locus [10, 30].
The majority of epigenetic aberrations in iPSC are present only at early passage numbers and can therefore be considered transient epigenetic memory [58, 67]. However, it has also been noted that some transcripts and chromatin marks persist in later passage number iPSC and even in their differentiated progeny. In addition to this failure to erase all the features of the differentiated cell state, iPSC also seem to accumulate ESC-dissimilar transcripts and chromatin marks that are not related to the cell of origin. These cannot be grouped under a common denominator, corroborating their stochastic nature. Additionally, genomic mutations can arise during reprogramming. All of these aberrations likely appear because of the imperfection of the reprogramming procedure, as illustrated by the low efficiency of reprogramming. Efforts have to be made to improve the culture conditions and factors available to the cells during reprogramming, which could lower the number of stochastic steps the cell needs to overcome in order to achieve pluripotency. This would likely increase the ratio of true to poor quality iPSC colonies and reduce the aberrations present in the cells.
In the end, iPSC continue to offer much promise for both clinical applications with personalized medicine, and for basic research in developmental and cell biology. The iPSC research field is still unfolding, and with the current attention it holds in the scientific community, the iPSC safety issues discussed here should be addressed in the near future.
This work was partially supported by Juan de la Cierva (to J.B.). Work in the laboratory of J.C.I.B. was supported by Ministerio de Ciencia e Innovacion, The Leona M. and Harry B. Helmsley Charitable Trust, the G. Harold and Leila Y. Mathers Charitable Foundation, Fundacion Cellex, Sanofi-Aventis, and the California Institute for Regenerative Medicine.
DISCLOSURE OF POTENTIAL CONFLICTS OF INTEREST
The authors indicate no potential conflicts of interest.