New associations of serum β‐carotene, lycopene, and zeaxanthin concentrations with NR1H3, APOB, RDH12, and CYP genes

Abstract Variation in carotenoid bioavailability at individual and population levels might depend on host‐related factors where genetic variation has a part to play. It manifests itself through the proteins involved in carotenoid intestinal absorption and metabolism, blood lipoprotein transport, or tissue uptake. This study aims to identify novel SNPs which could be associated with carotenoid serum concentrations. A total of 265 self‐reported healthy individuals of Lithuanian origin were genotyped (Illumina HumanOmniExpress‐12v1.0 or v1.1 and Infinium OmniExpress‐24v1.2 arrays) and fasting blood serum concentrations of β‐ and α‐carotene, β‐cryptoxanthin, lycopene, lutein, and zeaxanthin were measured (Shimadzu Prominence HPLC system). According to the individual carotenoid concentrations, the cohort was subdivided into quartiles. Q1 and Q4 were used for the following association analysis. The set of 2883 SNPs in 109 potential candidate genes (assumed for a direct or indirect role in carotenoid bioavailability) was analyzed. Liver X receptor alpha (NR1H3) “transport” polymorphisms rs2279238 (p = 2.129 × 10−5) and rs11039155 (p = 2.984 × 10−5), and apolipoprotein B (APOB) “transport” polymorphism rs550619 (p = 4.844 × 10−5) were associated with higher zeaxanthin concentration. Retinol dehydrogenase 12 (RDH12) “functional partner” polymorphism rs756473 (p = 7.422 × 10−5) was associated with higher lycopene concentration. Twenty‐one cytochrome P450 (CYP2C9, CYP2C18, and CYP2C19) “metabolism” polymorphisms in locus 10q23.33 were significantly associated with higher β‐carotene concentration. To conclude, four novel genomic loci were found to be associated with carotenoid serum levels. Zeaxanthin, lycopene, and β‐carotene serum concentrations might depend on genetic variation in NR1H3, APOB, RDH12 and CYP2C9, CYP2C18, and CYP2C19 genes.


| INTRODUCTION
Carotenoids found at the highest concentrations in human blood are the same as those commonly found in food products: β-carotene, lycopene, lutein, β-cryptoxanthin, α-carotene, and zeaxanthin (Khachik et al., 1992).
Total and individual consumption of carotenoids varies greatly between and within populations, and this is mainly reflected in the consumption of fruits and vegetables. Bioavailability is also a subject of variation. It depends on the processes of bioaccessibility, absorption, tissue distribution, turnover, and excretion or the possible interactions of individual carotenoids. However, the underlying mechanisms are still poorly understood (Bohn et al., 2017).
All those processes may be influenced by many factors including diet (e.g., food matrix, fat); host-related factors such as gender, age, medical conditions (e.g., HIV, hyperthyroidism); lifestyle (e.g., physical activity, smoking), and genetic factors. Studies have shown that interindividual variability in the bioavailability of carotenoids can be modulated by single-nucleotide polymorphisms (SNPs) (Borel et al., 2011, 2014, 2015a, 2015b; Lietz et al., 2012; Merle et al., 2013). These belong to the genes involved in the intestinal uptake or efflux of carotenoids as well as carotenoid metabolism and transport.
The general metabolism of carotenoids involves the following: (1) release from the food matrix, (2) solubilization into mixed micelles, (3) uptake by intestinal cells, (4) incorporation into chylomicrons or high-density lipoprotein (HDL), (5) secretion into the lymph and circulation, and (6) tissue uptake and retention (Shmarakov et al., 2013). As nicely summarized by Bohn et al. (2017), the pro-  (Bohn et al., 2017; Moran et al., 2018). Various proteins participate in carotenoid metabolic pathways but not all of them are known yet.
Despite the overwhelming success of association studies in general, only some studies have analyzed carotenoid concentrations (Buniello et al., 2019). Most of them were performed in populations of mixed ancestry (admixed populations) and there is a growing demand for such studies in distinct or ethnic populations (Sirugo et al., 2019) to replenish current knowledge of associated genetic factors. This study aimed to search for new genetic loci associated with serum carotenoid concentrations in the Lithuanian population cohort.

| Study cohort
A total of 265 self-reported healthy genetically unrelated individuals (131 men aged 50 ± 10 years and 134 women aged 49 ± 9 years) from the Lithuanian population (six ethnolinguistic regions) with at least three generations living in Lithuania were recruited for this study. As it was part of the LITGEN project (Lithuanian Population Genetic Diversity and Structure Variations Associated with the Evolution and Most Common, Prevalent Diseases), which aimed to analyze the genomic structure of the Lithuanian population, this study had several inclusion criteria: (1) study participants had to be of Lithuanian origin; (2) families of all participants must have been living in their ethnolinguistic region for at least three generations. We did not include individuals who were not of Lithuanian origin. The study was approved by the Vilnius Regional Biomedical Research Ethics Committee (No. 158200-05-329-79). Informed written consent from all participants was obtained.

| Collection of blood, sample preparation, storage, and transportation
Study participants were asked to arrive at the primary healthcare center in the city or district where they lived between 7:30 and 10:00 a.m., after fasting for at least 12 h and abstaining from smoking, alcohol, and medications. At the facility, venous blood samples were taken for serum carotenoid analysis (5 ml BD Vacutainer® SST II Advance tubes; Becton Dickinson) and DNA extraction (10 ml BD Vacutainer® K2EDTA tubes; Becton Dickinson).
For serum sample preparation, after 40 min, the tubes with blood samples were centrifuged for 10 min at 1150 g (centrifuge LMC-3000, Biosan). Samples were stored at +2 to +8°C and transported in thermostable containers (polystyrene foam boxes) to avoid temperature variations. Within 3-6 h, serum samples were transported to the Centre of Laboratory Medicine of the Santaros Clinics, Vilnius University Hospital, and deep-frozen at −80°C. Carotenoid concentration was analyzed within 6-12 months after sample collection. Sample collection, preparation, transportation, and all further stages of the analysis were carried out in such a way as to avoid direct sunlight or intense artificial light in order to prevent the dissociation or isomerization of carotenoids.

| Genotyping
The genotyping was performed on the Illumina HiScan™SQ instrument (Illumina Inc.) using the Illumina Infinium® HD SNP assays
Chromatographic separation was performed using an HPLC system (Shimadzu Prominence) with C30 (250 mm × 4.6 mm; particle size 5 μm) HPLC column (Dr. Maisch GmbH) and guard precolumns (C30, 20 mm × 4.6 mm). The column temperature was 23°C ± 1°C and the injection volume was 30 μl. Carotenoids were separated with a two-component mobile phase of MeOH (Eluent A) and MTBE (Eluent B). Flow rate was set at 1.3 ± 1 ml/min and the gradient elution was as follows: 10% Eluent B (initial), 10%-45% α-carotene, β-carotene, β-cryptoxanthin, lutein, zeaxanthin, and lycopene were not lower than 0.995. The amounts of carotenoids were calculated from the regression equations. Duplicate analyses were carried out and the data were expressed as mean ± standard deviation.

| Set of genes
Genes for the analysis were compiled into three groups based on a shared biological or functional property related to carotenoid bioavailability. The set of genes and SNPs in detail is provided in Table S1.
The first group consisted of 43 genes (1040 SNP markers) encoding proteins related (or possibly related) to carotenoid uptake, distribution, metabolism, and excretion, for instance, digestion enzymes fostering micellization (PNLIP), uptake/efflux transporters (SCARB1, CD36, NPC1L1), intracellular transporters (FABP2), those participating in the processes of secretion into chylomicrons (APOB, MTTP), carotenoid metabolism in blood and liver (LPL, APOE, LDLR), and distribution to target tissues such as adipose tissue or the mac-
The second group consisted of 37 genes (1073 SNP markers) encoding proteins responsible for intracellular carotenoid cleavage (BCO1, BCO2) and their functional partners or transcription factors.
The third group consisted of 29 genes (770 SNP markers) encoding cytochrome P450 enzymes that are related to retinol metabolism (according to The Human Protein Atlas (Uhlen et al., 2015)).
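The three group sizes can be checked against the totals quoted in the abstract (109 candidate genes, 2883 SNPs). The short sketch below only restates numbers given in the text. The dictionary keys are shorthand labels chosen for this example, not names used by the authors.

```python
# Gene groups as described in the text: (number of genes, number of SNP markers).
# The keys are illustrative labels, not the authors' terminology.
gene_groups = {
    "I: uptake, distribution, metabolism, excretion": (43, 1040),
    "II: carotenoid cleavage and functional partners": (37, 1073),
    "III: cytochrome P450 (retinol metabolism)": (29, 770),
}

total_genes = sum(genes for genes, _snps in gene_groups.values())
total_snps = sum(snps for _genes, snps in gene_groups.values())

print(total_genes, total_snps)
```

Both sums agree with the totals stated in the abstract.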

| Statistics
Descriptive statistics for serum carotenoid concentrations were calculated by open-source software RStudio (RStudio Team, 2020).
According to the carotenoid concentrations, individuals were subdivided into quartiles. Quartiles Q1 and Q4 were used for the following association analysis.
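The quartile split can be illustrated with a minimal sketch. The text does not state how quartile boundaries or ties were handled, so the "inclusive" quantile method and the boundary rule below are assumptions. The function name `split_q1_q4` and the sample values are hypothetical and are not study data.

```python
import statistics

def split_q1_q4(concentrations):
    """Return (q1_ids, q4_ids) for a dict of sample id -> concentration.

    Assumption: samples at or below the first cut point form Q1 and
    samples at or above the third cut point form Q4.
    """
    values = list(concentrations.values())
    # Three cut points divide the values into four quartiles.
    lower, _median, upper = statistics.quantiles(values, n=4, method="inclusive")
    q1 = sorted(s for s, v in concentrations.items() if v <= lower)
    q4 = sorted(s for s, v in concentrations.items() if v >= upper)
    return q1, q4

# Hypothetical zeaxanthin concentrations (μmol/L), for illustration only.
example = {"S1": 0.01, "S2": 0.02, "S3": 0.03, "S4": 0.04,
           "S5": 0.05, "S6": 0.06, "S7": 0.07, "S8": 0.08}
q1_ids, q4_ids = split_q1_q4(example)
print(q1_ids, q4_ids)
```

Only the two extreme groups are carried forward, so the middle half of the cohort does not enter the comparison for that carotenoid.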
Genotyping data quality control (QC) and association analysis were performed using the PLINK whole-genome association analysis toolset (PLINK v1.90b (Chang et al., 2015)). SNPs included in the association analysis met the following genotyping data quality control criteria (Anderson et al., 2010): minor allele frequency (MAF) > 0.05; missingness per marker (GENO) < 0.01; the Hardy-Weinberg equilibrium test's p-value > 0.001 (chi-squared test); missingness per individual (MIND) < 0.05. The chi-squared statistic was used for association analysis to evaluate differences in allele frequencies in each SNP between the Q1 and Q4 quartiles. Odds ratios (OR) and 95% confidence intervals (95% CI) were calculated for significant SNPs. The Bonferroni adjustment (α = 0.05/N) and adaptive permutation procedure were performed for multiple comparisons. The significance level for the analysis was set according to the adaptive permutation recommendations (Che et al., 2014).
The post hoc statistical power, calculated for the different sample-size groups with the G*Power 3.1.9.4 tool (designed by Franz Faul, University of Kiel, Germany; Faul et al., 2007), was 0.77-0.79.
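As an illustration of the association step, the sketch below computes an allelic chi-squared test (1 degree of freedom, no continuity correction), an odds ratio, and a 95% CI for a single SNP from a 2 × 2 table of allele counts. It is not PLINK's code. The Woolf (log) method for the CI is an assumption, since the text does not name the method. The function names and the allele counts are hypothetical, and N in the Bonferroni formula is left as a parameter because the text does not define it.

```python
import math

def allelic_test(a, b, c, d):
    """Allelic 2x2 test for one SNP.

    a, b: minor/major allele counts in the Q4 group
    c, d: minor/major allele counts in the Q1 group
    Returns (chi2, p_value, odds_ratio, ci_low, ci_high).
    All four counts must be non-zero.
    """
    n = a + b + c + d
    # Pearson chi-squared for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    # Woolf (log) method for the 95% CI: an assumption of this sketch.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se)
    return chi2, p_value, odds_ratio, ci_low, ci_high

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold, alpha / N."""
    return alpha / n_tests

# Hypothetical allele counts (66 individuals, i.e. 132 alleles, per group).
chi2, p_value, odds_ratio, ci_low, ci_high = allelic_test(40, 92, 15, 117)
print(round(chi2, 2), p_value, round(odds_ratio, 2))
```

An odds ratio above 1 with a CI that excludes 1 indicates that the minor allele is more frequent in the Q4 (high-concentration) group than in Q1.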
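For readers who want to see what a post hoc power calculation for a 1-degree-of-freedom chi-squared test involves, here is a minimal sketch. It uses the noncentrality parameter λ = N·w² (Cohen's effect size w), which for 1 degree of freedom reduces to a shifted standard normal. The effect size, sample size, and α used by the authors are not reported in the text, so the inputs below are purely hypothetical, and the function name is chosen for this example.

```python
from statistics import NormalDist

def chi2_power_1df(effect_size_w, n, alpha=0.05):
    """Power of a chi-squared test with 1 degree of freedom.

    With 1 df, the noncentral chi-squared variable equals (Z + sqrt(lambda))**2,
    where Z is standard normal and lambda = n * w**2.
    """
    std_normal = NormalDist()
    # The chi-squared critical value at alpha equals z_crit squared.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (n * effect_size_w ** 2) ** 0.5  # square root of the noncentrality
    # P(|Z + shift| > z_crit), summed over both tails.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical inputs, not taken from the study.
power_example = chi2_power_1df(effect_size_w=0.3, n=80, alpha=0.05)
print(round(power_example, 3))
```

With an effect size of zero the function returns α, and power increases with sample size, as expected.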

| RESULTS
The characteristics (age, gender, body mass index (BMI)) of the study cohort of 265 individuals are shown in Table 1. Mean values of age and BMI were similar in both gender groups. According to the BMI values, the study cohort could be classified as overweight.
Based on the sociodemographic questionnaire data, the majority of studied individuals resided in an urban setting (61%); 85% of the study cohort reported having special secondary or higher education, and the majority of participants worked as employees (31%) or clerks (46%). The study cohort was not characterized by high physical activity: 43% of individuals had physically inactive lifestyles and 47% had lifestyles involving light physical activity, including walking and cycling to and from work, easy gardening, fishing, etc.
The predominant blood serum carotenoid in the study cohort as well as among the group of men was found to be lycopene (33% and 36%, respectively) while among women, it was β-carotene (34%).
Zeaxanthin concentration was the lowest in all groups (1%). Mean and median values of the analyzed carotenoids in the study cohort are reported in Table 2. To perform association analysis, carotenoid concentrations were subdivided into quartiles to form groups of samples to be compared (Table 3).
Before association analysis, data quality control (procedure described in Section 2.6) of 265 genotyped samples and 2883 SNPs was performed. All samples passed the 5% threshold of the missing genotypes rate. One hundred and fifty-six SNPs did not pass the 1% missingness rate, one SNP deviated from the Hardy-Weinberg equilibrium, and 973 SNPs with rare alleles of frequency <5% were excluded from further analysis. Finally, 1753 SNPs (gene group I had 710, group II 661, and group III 382 SNPs) and 265 samples were set for subsequent association analysis. After association analysis of sets of SNP markers in the two groups of individuals subdivided according to the carotenoid concentrations (quartiles Q1 and Q4), we found new genetic loci associated with the serum carotenoid concentrations. Significant SNPs along with the chi-squared, p-values, and OR with 95% CI are presented in Table S2. Significant associations are depicted in Figure 1.
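The SNP counts reported for quality control can be tallied as follows. The sketch assumes the three exclusion categories do not overlap (each SNP is counted in one category only), which the text implies but does not state. The function `passes_snp_qc` is a hypothetical helper that restates the per-marker thresholds from the Statistics section; it is not a PLINK call.

```python
def passes_snp_qc(maf, missing_rate, hwe_p):
    """Per-marker QC thresholds as listed in the Statistics section."""
    return maf > 0.05 and missing_rate < 0.01 and hwe_p > 0.001

initial_snps = 2883
# Exclusion counts reported in the text; assumed to be non-overlapping.
excluded = {"missingness": 156, "hwe": 1, "maf": 973}
retained = initial_snps - sum(excluded.values())

# SNPs retained per gene group, as reported.
retained_by_group = {"I": 710, "II": 661, "III": 382}

print(retained, sum(retained_by_group.values()))
```

Both routes give 1753 retained SNPs, so the reported exclusions and per-group counts are mutually consistent.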
From analyzing the association of the serum carotenoid concentrations and SNPs of the first gene group, we found SNP rs550619 of the APOB gene to be significantly associated (p = 4.844 × 10−5) with blood serum zeaxanthin concentration. Minor allele G is related to the higher blood serum zeaxanthin concentration. The SNP is located in intron 5 (out of 28) of the APOB gene encoding apolipoprotein B.
Analyzing the second gene group, we found three SNPs to be significantly associated with blood serum carotenoid concentration. SNPs rs11039155 (p = 2.984 × 10−5) and rs2279238 (p = 2.129 × 10−5) of the NR1H3 gene were associated with higher zeaxanthin concentration, while SNP rs756473 (p = 7.422 × 10−5) of the RDH12 gene was associated with higher lycopene concentration.
Intronic variant rs11039155 and synonymous variant rs2279238 are both located in the NR1H3 gene encoding nuclear receptor subfamily 1 group H member 3. SNP rs756473 is located in intron 3 (out of 8) of the RDH12 gene encoding retinol dehydrogenase 12.
Analyzing the third gene group, we found 21 SNPs of the locus 10q23.33 encompassing genes CYP2C9, CYP2C18, and CYP2C19 to be significantly associated with the higher β-carotene concentration.

TABLE 1 Main characteristics of the study cohort
The serum α-carotene, β-cryptoxanthin, and lutein concentrations were not associated with any of the studied genetic markers (SNPs).

| DISCUSSION
The Lithuanians are higher than in healthy American, Chinese, and Korean adults (Yeum et al., 1999). The predominant blood serum carotenoid in our study cohort was lycopene, and zeaxanthin concentration was the lowest. These findings are in line with the results presented in the study mentioned above of other European countries (Al-Delaimy et al., 2004) where lycopene was quantitatively a predominant carotenoid, followed by β-carotene, while α-carotene and zeaxanthin levels were the lowest. Lycopene and β-carotene were major carotenoids in American individuals, whereas lutein was the predominant carotenoid in the Chinese (Yeum et al., 1999). Moreover, serum carotenoid concentrations differed across ethnic groups (Sanchez et al., 2021). These findings suggest that population-specific variability might significantly predict total and individual blood carotenoid levels in different geographical regions.
This study aimed to identify new SNPs and genes related to
Zeaxanthin was associated with the APOB and NR1H3 genes.
Apolipoprotein B, encoded by APOB, occurs in plasma in two isoforms: apoB-100 and apoB-48. ApoB-48 is the main apolipoprotein constituent of chylomicrons, while apoB-100 is a component of LDL and VLDL. The majority (55%) of circulating carotenoids are transported in LDL, while zeaxanthin is carried by both LDL and HDL (Shmarakov et al., 2013). Studies have shown significant associations of SNPs in the APOB gene with β-carotene (Borel et al., 2007, 2015a), lycopene (Borel et al., 2007, 2015b), and lutein (Borel et al., 2014) bioavailability and concentration in the blood. We report for the first time that the APOB SNP rs550619 is significantly associated with zeaxanthin serum concentration. The minor allele (G) of the SNP is related to the higher blood serum zeaxanthin concentration (OR 7.3) in the studied Lithuanian population.
Our findings correspond with others claiming that SNPs in the APOB gene may modulate carotenoid blood concentration, indicating that chylomicron assembly and lipoprotein clearance are important factors in determining carotenoid blood status.
The other gene associated with zeaxanthin concentration was NR1H3. Our study showed that minor alleles of the SNPs rs11039155 and rs2279238 were related to the higher zeaxanthin blood serum concentration (ORs 3.5 and 3.4, respectively).
The NR1H3 gene has not previously been associated with carotenoid blood levels. NR1H3 codes for nuclear receptor subfamily 1 group H member 3, also known as oxysterols receptor LXR-alpha. The protein forms a heterodimer with retinoid X receptor (RXR) and regulates the expression of target genes containing retinoid response elements. Alternatively spliced NR1H3 transcripts, translated into different isoforms of the protein, have been found. Studies suggest that NR1H3 plays an important role in the regulation of cholesterol homeostasis (Edwards et al., 2002) and increases the expression of the ATP binding cassette transporters ABCA1 and ABCG1 and of apolipoprotein E (apoE), all of which participate in the transfer of intracellular and plasma membrane cholesterol to HDL (Costet et al., 2000; Laffitte et al., 2001). Our findings suggest that NR1H3 may indirectly participate in carotenoid metabolism, which is closely

TABLE 2 Analyzed blood serum carotenoid concentrations (μmol/L)

(Belyaeva et al., 2005; Haeseleer et al., 2002). Retinal is a form of vitamin A and may be produced from provitamin A carotenoids (α-carotene, β-carotene, or β-cryptoxanthin) but not from non-provitamin A carotenoids such as lycopene. Even though lycopene is not accumulated in the retina, lycopene isomers were found in the eye structures (Khachik et al., 2002).

(Marill et al., 2000; Nadin & Murray, 1999). β-Carotene is a provitamin A carotenoid and can be metabolized into RA; thus, our finding suggests that genetic variation in the genes coding for CYP2C subfamily enzymes may affect the β-carotene serum level.
It should not be forgotten that even though SNPs represent more than 96% of variation (Genomes Project C et al., 2015), other genetic variations such as copy number variants, deletions and/or insertions of several nucleotides, as well as epigenetic modifications are also present and might have an effect. We should therefore consider all of the genetic variations that can have a significant impact on carotenoid concentrations. It is also crucial to perform association studies in different populations to find out population-specific variation. population is unique and why some genetic associations found in other studies do not reproduce.
As for the study limitations, we note that the study cohort was drawn from the representative cohort of the Lithuanian population subdivided into six ethnolinguistic regions of Lithuania (Jakaitienė & Kučinskas, 2013). Despite potential interest in the precursors of carotenes and their biological effects, the main goal of the study was to broaden the knowledge of genetic variation related to interindividual variability in the bioavailability and metabolism of carotenoids, which are found at the highest concentrations in human blood, are the most abundant in food products, and are linked to health benefits in epidemiological studies. We also state that any generalization of the present findings should take into account that lycopene intake was found to be among the lowest in the Lithuanian population compared with other European countries (Mažeikienė et al., 2015), and that an assessment of total carotenoid consumption among Lithuanians is lacking. Additional studies should be performed to differentiate the importance of nutritional habits and genetic variability for serum carotenoid concentrations.

| CONCLUSIONS
In conclusion, this study identified four new loci associated with zeaxanthin, lycopene, or β-carotene serum concentrations. Our findings support the idea that carotenoid bioavailability depends on genetic variation. We also note that some associations might be population specific, as the study was performed in the ethnic Lithuanian population. The role of the associated genes in carotenoid metabolism is not well characterized. However, our significant results contribute new insights into interindividual variation in carotenoid bioavailability and raise further questions to be answered.

ACKNOWLEDGEMENTS
We are grateful to all study participants who donated blood samples. We acknowledge that we presented our initial results as a short thesis and e-poster at the European Society of Human Genetics Conference 2020 (published in Abstracts from the 53rd European Society of Human Genetics (ESHG) Conference: Interactive e-Posters. Eur J Hum Genet 28, 141-797 (2020)). We additionally complemented our initial results for this manuscript.

CONFLICT OF INTEREST
All authors declare no conflicting interests.

DATA AVAILABILITY STATEMENT
All necessary data are provided in the article. Nevertheless, the authors agree to share raw data upon request.