Bioconversion of lignocellulosic ‘waste’ to high‐value food proteins: Recombinant production of bovine and human αS1‐casein based on wheat straw lignocellulose

Lignocellulosic biomass is the most abundant bio‐resource on earth, mainly composed of D‐glucose, D‐xylose and L‐arabinose. It is widely considered to be a promising alternative feedstock for biotechnological processes. Here we evaluated its potential as the carbon source for growth of broadly distributed and well‐established Escherichia coli laboratory and protein expression strains as well as a classic probiotic E. coli strain. E. coli DH5α, E. coli K12‐MG1655, E. coli K12‐W3110, E. coli BL21(DE3) and E. coli Nissle 1917 were cultivated in mineral media containing single lignocellulosic sugar components. Sugar consumption in these cultures and growth parameters of the different strains were characterized. Enhanced green fluorescent protein (eGFP) was chosen as a first, easily measured and prominent model recombinant target protein to demonstrate lignocellulose‐dependent recombinant protein production in E. coli. To open new production routes for high‐value food proteins based on lignocellulose, structural genes encoding bovine αS1‐casein and human αS1‐casein were synthesized, cloned and then expressed in an E. coli T7 expression system in different media based on single sugars and a synthetic wheat straw mixture. Successful recombinant production of both bovine and human αS1‐caseins in E. coli under these experimental conditions was demonstrated and quantified by densitometric analysis after protein separation in polyacrylamide gels. Finally, efficient casein production in E. coli based on a real hydrolysate obtained by steam explosion of wheat straw lignocellulose was demonstrated in a bioreactor‐based batch production process. We believe that the proof‐of‐concept presented here is a promising starting point to open new routes for the production of food or feed proteins with high nutritional and economic value.
As such, a valorization of bulk residual biomass like lignocellulose is envisioned as a key support of a growing and truly sustainable bioeconomy.


| INTRODUCTION
The economic and ecological opportunities offered by renewable energies and resources have attracted attention worldwide, driven by the global challenges posed by climate change, the decline of environmental integrity and diversity, energy prices, long-term energy supply problems and the impact of these challenges on human health (Hansen et al., 2006; McMichael et al., 2006; Schröter et al., 2005). A renewable source of considerable importance is lignocellulosic 'waste', which accumulates in large quantities every year as residues from agriculture, forestry and energy crops, as well as from the paper-pulp, timber and many agro-industries (Saini et al., 2015). Among these basic resources, straw residues (from corn, wheat and rice) dominate in terms of tonnage and can be expected to serve as globally available feedstocks, although the available quantities of straw vary substantially between regions of the world. According to statistical data on biomass production (e.g. crop production, yields and harvest areas) provided by the Food and Agriculture Organization of the United Nations (FAO), the USA was on average the largest producer of corn from 1994 to 2018, accounting for 52% of the currently 1.05 billion tons produced globally, followed by the People's Republic of China with 20% of global production (FAOSTAT, www.fao.org/faostat). Of the annual global rice production of about 750 million tons, over 90% comes from Asia, which has the world's largest harvest areas. For the global production of wheat (about 750 million tons), Asia (43%) and Europe (32%) are the primary production regions. Statistical data on the annually available quantity of biomass suggest that wheat straw is potentially the most favourable lignocellulosic resource in Europe.
Lignocellulosic feedstocks are mainly composed of cellulose, hemicellulose and lignin (McKendry, 2002). Cellulose is an unbranched homo-polysaccharide consisting of D-glucopyranosyl units. In contrast, hemicelluloses are branched hetero-polysaccharides consisting of both hexose sugars (D-glucose, D-mannose and D-galactose) and, predominantly, pentose sugars (D-xylose and L-arabinose). As a result, lignocellulosic biomass contains approximately 75% polysaccharide sugars (Bayer et al., 2007). Different technical processes such as steam explosion or organosolv focus on the conversion and release of these lignocellulosic carbohydrates via pretreatment and subsequent enzymatic hydrolysis to yield so-called lignocellulose hydrolysates (Alvira et al., 2010; Domínguez de María et al., 2015; Mosier et al., 2005; Taherzadeh & Karimi, 2008). During these processes, which aim to obtain fermentable sugars, several molecules that potentially inhibit microbial growth are formed as by-products. These include cellulose- or hemicellulose-derived furan aldehydes and aliphatic acids as well as lignin-derived phenolic compounds. The complex profile of pretreatment products, including potentially inhibitory substances, has been reviewed on several occasions (Chandel et al., 2011; Horlamus, Wang, et al., 2019; Sun & Cheng, 2002). Therefore, lignocellulosic biomass has been suggested to be the most suitable feedstock to provide monosaccharides, which can then serve as carbon sources for biotechnological processes based on (optimized) microbial biocatalysts in novel fermentative routes (Mussatto & Teixeira, 2010; Peters, 2006; Wang et al., 2019).
Most bacterial organisms can utilize only glucose because they lack the enzymes needed for the degradation and oxidation of other sugars. Fewer bacteria can naturally utilize more "exotic" monosaccharides such as xylose, arabinose and mannose, complex carbohydrates such as disaccharides (lactose, cellobiose and xylobiose) or even polysaccharides (starch, xylan and cellulose). Escherichia coli strains, the most common laboratory strains, can utilize hexoses as well as pentoses, which makes this microorganism an interesting host not only for traditional applications but also in the context of novel metabolic pathways and the non-traditional carbon sources mentioned above (Calero & Nikel, 2019; Idalia & Bernardo, 2017; Singh & Mishra, 1995). More importantly, E. coli has been the workhorse of microbiology for decades, including an exceptional role as a cell factory, and it has become the most popular expression platform because it grows quickly and easily on inexpensive substrates and can readily be modified with a variety of molecular tools (Rosano & Ceccarelli, 2014; Sharma & Chaudhuri, 2017). The ability to express desired recombinant proteins in large quantities allows it to be used in the development of industrial enzymes and biopharmaceuticals, with more than 150 recombinant pharmaceutical proteins licensed to date by the Food and Drug Administration (Ferrer-Miralles et al., 2009). Insulin is probably the most prominent example (Leader et al., 2008; Rosenfeld, 2002), but interferons (IFN-α, -β and -γ), growth hormones and antibodies can also be considered breakthrough products of E. coli-based processes. Another field in E. coli biotechnology is the production of industrial enzymes such as proteases, amylases, lipases, cellulases and pectinases, which are of high relevance in a specialized but expanding market (Sanchez & Demain, 2011; Sarmiento et al., 2015).
KEYWORDS: bioeconomy, D-xylose, Escherichia coli, hydrolysate, L-arabinose, lignocellulose, wood sugar, αS1-casein
Milk is the first and basic food for mammals, including humans, and can still be regarded as the best nutritional option for new-born infants. Among the milk proteins, casein is the major component in almost all mammalian species, accounting for up to 80% of the total protein in bovine milk (Kim et al., 1997). In general, the milk casein fraction consists of αS1-, αS2-, β- and κ-casein (Wal, 1998). The two α-type caseins and the single β-type casein are characterized by the formation of micellar aggregates in solution and by their ability to sequester up to 5% of their dry weight as Ca2+ (Koczan et al., 1991). One of the biological functions of casein for the young is to serve as a main and extremely well-balanced source of essential amino acids, with a biological value (BV) higher than that of chicken egg and soya (Chanat et al., 1999; Hoffmann & Falvo, 2004). The BV is a measure of how efficiently an absorbed food protein can be used for protein biosynthesis by the organism, with protein from chicken egg as the reference value. In bovine milk, αS1-casein is the most abundant protein, accounting for 34% of the total milk proteins. Although α-casein was initially discussed as being absent in humans, it was later reported that human casein also contains αS1-casein, albeit in very small amounts of only 0.06% of the total protein content in human milk (Cavaletto et al., 1994). With an increasing population preferring a vegetarian lifestyle in the western world, casein proteins are not only considered ideal nutrients from milk but also have promising potential as healthy food additives in the food industry owing to their biological functions and, especially, their particular amino acid composition. In previous studies, αS1-casein-like milk protein was isolated and purified from bovine and human milk (Rasmussen et al., 1995).
However, casein isolated from milk is not acceptable for those who prefer vegan nutrition. In the 1990s, the successful production of casein proteins by recombinant DNA technology, based on isolating and cloning genes encoding bovine and human α- and β-caseins from cDNA libraries, was reported (Kim et al., 1997).
In this study, four prominent E. coli laboratory strains and a probiotic strain were chosen as examples to examine their capability to grow on single sugars representing the main components of lignocellulosic biomass. E. coli DH5α is one of the most commonly used strains for cloning experiments and plasmid maintenance owing to its high transformation efficiency and its recA mutation, which prevents unwanted recombination of cloned DNA (Chan et al., 2013). The E. coli K12 strains MG1655 and W3110 are among the oldest E. coli laboratory strains and were cured of the F plasmid and phage lambda (Bachmann, 1972; Jensen, 1993). MG1655 was also chosen for the first published genome sequence of E. coli K12 (Blattner et al., 1997). For highly efficient protein production using the T7 expression system, the strain BL21(DE3) was generated, notably through the knock-out of two key proteases, and it is the most widely used protein overexpression strain in laboratories worldwide (Studier & Moffatt, 1986). In contrast, the probiotic strain E. coli Nissle 1917 is a well-established medical product available as Mutaflor® (Ardeypharm). Moreover, the fluorescent protein eGFP was used as a first model to test the ability of each strain to produce proteins based on these sugars. As the main objective of this study, we also demonstrated the recombinant production of bovine and human αS1-casein proteins on sugar mixtures emulating lignocellulosic hydrolysates from wheat straw residues as sole carbon sources. Finally, efficient casein production in E. coli based on a real hydrolysate of wheat straw lignocellulose was demonstrated in a bioreactor-based batch production process. The animal-free decoupling of food production from traditional agriculture and the introduction of biotechnological food proteins represent a novel food technology that will contribute to global health, food security and sustainability.
In particular, zoonoses such as the current coronavirus disease 2019, but also salmonellosis and others, underline the need for animal-free alternatives for high-value protein supply combined with climate-neutral production. We believe that the proof-of-concept presented here may be a promising starting point for opening new general routes to the production of food or feed proteins with high nutritional and economic value based on bulk residual biomass such as lignocellulose, in support of a growing and truly sustainable bioeconomy.

| Bacterial strains and plasmids
The strains and plasmids used in this study are listed in Table 1. E. coli DH5α was also used for cloning procedures and plasmid maintenance.

| DNA manipulations
DNA manipulations were carried out using established methods as described in Sambrook and Russell (2001). Restriction enzymes and T4 DNA ligase were obtained from Thermo Fisher Scientific and used as recommended. Plasmid DNA was isolated with the QIAprep spin miniprep kit (Qiagen). DNA concentrations were measured with a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific). DNA fragments were recovered from agarose gels using a QIAEX II gel extraction kit (Qiagen). E. coli cells were transformed with the resulting recombinant plasmids (Table 1) using a standard protocol (Hanahan, 1983).

| Media and growth conditions
Escherichia coli strains were grown at 37°C in either lysogenic broth (LB) medium (10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl; pH 7.0), Wilms-KPi medium or wheat straw hydrolysate medium. The adapted Wilms-KPi medium (Wilms et al., 2001) was composed of a phosphate buffer system (6.58 g/L K2HPO4, 1.64 g/L KH2PO4, 5 g/L (NH4)2SO4, 0.5 g/L NH4Cl, 2 g/L Na2SO4, 25 g/L MgSO4·7H2O; pH 7.4) supplemented with 3 ml/L of a trace element solution (0.18 g/L ZnSO4·7H2O, 0.16 g/L CuSO4·5H2O, 0.1 g/L MnSO4·H2O, 13.92 g/L FeCl3·6H2O, 10.05 g/L ethylenediaminetetraacetic acid (Titriplex III), 0.18 g/L CoCl2·6H2O, 0.662 g/L CaCl2·2H2O) and 0.01 g/L thiamine HCl. All components were sterilized using a 0.2 µm membrane filter, except for the phosphate buffer, which was sterilized by autoclaving. A quantity of 10 g/L D-glucose, D-xylose or L-arabinose was used as sole carbon source. In artificial wheat straw hydrolysate medium, a total sugar amount of 10 g/L with the same composition as in typical wheat straw hydrolysates was added: 6.29 g/L D-glucose, 3.30 g/L D-xylose and 0.41 g/L L-arabinose (Schläfle et al., 2017). In bioreactor cultivations, dried and milled wheat straw was used as substrate. Using a steam explosion process, the hemicellulose and lignocellulose components were disrupted and then further degraded into monosaccharides by enzymatic hydrolysis over 5 days (Schläfle et al., 2017). Particles were removed by centrifugation and the supernatant was sterilized by filtration (Nalgene Rapid Flow 0.2 µm, Thermo Fisher Scientific). After quantification of the carbohydrate monomers, the hydrolysates were used as carbon source for the bioreactor cultivations. For the selection of recombinant strains, 100 µg/ml ampicillin or 10 µg/ml tetracycline was added to the media. For induction of gene expression, 0.5 mM isopropyl β-D-thiogalactoside (IPTG) was added when the cultures had grown to an OD600 of approximately 0.5-0.8.
For shaking flask experiments, precultures of 15 ml LB medium in 100-ml Erlenmeyer flasks were inoculated with a single colony from an LB agar plate and incubated overnight in an incubator shaker (New Brunswick Scientific) at 37°C and 150 rpm. Main cultures with 15 ml of defined medium in 100-ml Erlenmeyer flasks were inoculated from the precultures to an OD600 of 0.1.
For bioreactor cultivations, first precultures were cultivated for 8 h in 10 ml LB medium (100-ml flasks) before 100 µl of cell suspension was used to inoculate the seed culture of 25 ml Wilms-KPi medium (250-ml flasks) containing 10 g/L glucose, which was cultivated for a further 8 h. The bioreactor cultivation was carried out in duplicate in a 2 L bioreactor (Labfors 4; Infors AG) using 600 ml Wilms-KPi medium including 380 ml of pretreated lignocellulose hydrolysate, resulting in concentrations of 7.5 g/L glucose and 5.5 g/L xylose. The aeration rate was set to 0.2 vvm, with pO2 held at 20% by regulation of the stirring speed. The temperature was set to 37°C and the pH was maintained at 7.4 using 1 M H2SO4 and 1 M NaOH. After inoculation with seed culture to an OD600 of 0.1, cultivation was conducted for 18 h. Heterologous gene expression was induced at an OD600 of 0.6 using a final concentration of 0.5 mM IPTG.
TABLE 1 Strains and plasmids used in this study (columns: Strain or plasmid, Characteristics, Source/Reference)
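The bench arithmetic implied by these conditions can be sketched as follows. This is a minimal illustration, not part of the original protocol: the sugar concentrations, the 0.2 vvm aeration rate and the 600 ml working volume are taken from the text, whereas the function names and the preculture OD600 of 4.0 are hypothetical, and OD600 is assumed to scale linearly with cell density.

```python
# Sketch of the bench arithmetic behind the media and inoculation steps.
# Sugar concentrations come from the text; helper names are hypothetical.

ARTIFICIAL_HYDROLYSATE_G_PER_L = {
    "D-glucose": 6.29,
    "D-xylose": 3.30,
    "L-arabinose": 0.41,
}


def sugar_fractions(composition):
    """Return each sugar's mass fraction of the total sugar."""
    total = sum(composition.values())
    return {name: conc / total for name, conc in composition.items()}


def inoculum_volume_ml(preculture_od, target_od, final_volume_ml):
    """Preculture volume so that the final culture starts at target_od.

    Uses C1*V1 = C2*V2 and assumes OD600 is proportional to cell density,
    which only holds for sufficiently diluted samples.
    """
    if preculture_od <= target_od:
        raise ValueError("preculture must be denser than the target OD")
    return target_od * final_volume_ml / preculture_od


def gas_flow_l_per_min(vvm, working_volume_l):
    """Convert an aeration rate in vvm to an absolute gas flow."""
    return vvm * working_volume_l


if __name__ == "__main__":
    total = sum(ARTIFICIAL_HYDROLYSATE_G_PER_L.values())
    print(f"total sugar: {total:.2f} g/L")
    for name, frac in sugar_fractions(ARTIFICIAL_HYDROLYSATE_G_PER_L).items():
        print(f"{name}: {frac:.1%}")
    # Hypothetical overnight preculture at OD600 = 4.0, 15 ml main culture.
    print(f"inoculum: {inoculum_volume_ml(4.0, 0.1, 15.0):.3f} ml")
    # 0.2 vvm at a 0.6 L working volume, as stated for the bioreactor.
    print(f"gas flow: {gas_flow_l_per_min(0.2, 0.6):.2f} L/min")
```

In practice the preculture OD600 would be measured on a diluted sample before calculating the inoculum volume.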

| Analytical methods
Cell growth was determined photometrically by measuring the optical density at 600 nm (OD600). The culture samples were centrifuged at 4°C and 15,000 g for 5 min. Cell pellets and supernatants were stored at −20°C for later analyses.
For the shaking flask experiments, the culture supernatants were analysed for residual sugars using the D-Glucose assay kit, D-Xylose assay kit and L-Arabinose/D-Galactose assay kit (Megazyme). Carbohydrate analysis of the bioreactor samples was conducted by separation with an HPTLC system (CAMAG), followed by staining with diphenylamine-aniline (DPA) reagent, heating for 20 min at 120°C and detection at 620 nm. The DPA reagent was prepared by dissolving 2.4 g diphenylamine and 2.4 g aniline in 200 ml methanol and acidifying with 20 ml phosphoric acid (85%). The eluent used for chromatography was an 85:15 (v/v) mixture of acetonitrile/water on silica gel glass plates (Silica Gel 60; Merck).
For the verification of protein production, 200 µl of Bugbuster Mastermix (Merck) was added to the pellets, which were incubated for 15 min on a shaking platform for cell disruption. Samples were used either as whole-cell extracts, or cell debris was separated from the supernatant by centrifugation at 21,000 g. For quantification of the αS1-casein, Laemmli buffer (Bio-Rad) was added to the samples in proportion to the OD measured during cultivation, and the samples were heated at 95°C for 15 min. For sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE), samples were diluted 1:20 and applied to conventional 12% polyacrylamide gels or to Mini Protean TGX Gels (Bio-Rad). Gels were stained with Coomassie blue staining solution (Thermo Fisher Scientific) and photographed using a gel documentation system (Bio-Rad).
For the measurement of fluorescence, three aliquots of 100 µl of each suspension were added to 96-well flat-bottom polystyrene microplates (Sarstedt), and the fluorescence was measured with a Tecan Infinite F200 (Tecan) at an excitation wavelength of 485 nm and an emission wavelength of 535 nm. The background fluorescence was determined using the corresponding E. coli strain culture carrying the empty vector. The relative fluorescence (%) was calculated relative to the highest measurement.
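One way to carry out this normalization is sketched below: triplicate readings are averaged, the empty-vector background of the same strain is subtracted, and the highest corrected value is set to 100%. The function name and all readings are invented for illustration, and clipping negative corrected values to zero is an assumption rather than a step described in the text.

```python
def relative_fluorescence(raw, background):
    """Background-correct mean readings and scale the highest to 100 %.

    raw / background map a strain name to its replicate readings for the
    eGFP-expressing culture and the empty-vector culture, respectively.
    """
    corrected = {}
    for strain, readings in raw.items():
        mean_signal = sum(readings) / len(readings)
        blanks = background[strain]
        mean_blank = sum(blanks) / len(blanks)
        # Assumption: negative corrected values are treated as zero signal.
        corrected[strain] = max(mean_signal - mean_blank, 0.0)
    top = max(corrected.values())
    if top == 0:
        raise ValueError("no strain shows signal above background")
    return {strain: 100.0 * value / top for strain, value in corrected.items()}


# Invented triplicate readings (arbitrary fluorescence units).
raw = {"strain A": [1100, 1000, 1200], "strain B": [700, 700, 700]}
blank = {"strain A": [100, 100, 100], "strain B": [100, 100, 100]}

for strain, percent in relative_fluorescence(raw, blank).items():
    print(f"{strain}: {percent:.0f} %")
```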
Protein expression was quantified by densitometric scanning. The concentrations of the αS1-caseins were determined by comparing the densities of the bands with those of a casein standard (αS1-casein from bovine milk, C6780; Merck) using the ImageJ software (www.imagej.net/ImageJ1; Schneider et al., 2012). The background from the empty vector was subtracted in the calculation. Error bars for the quantitative casein data from the bioreactor cultivation were obtained from the largest relative measurement error observed across all independent biological experiments and applied to all individual measurements as relative error bars.
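A standard-curve evaluation of this kind could be sketched as follows. The linear relation between band density and loaded amount, the function names and all numbers are assumptions made for illustration; the text states only that band densities were compared with a casein standard and that the empty-vector background was subtracted.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard amounts must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def casein_amount(band_density, empty_vector_density, slope, intercept):
    """Convert a background-corrected band density to an amount."""
    net = band_density - empty_vector_density
    return max((net - intercept) / slope, 0.0)


# Invented standard lane data: loaded amount (µg) vs. integrated density.
standard_ug = [0.5, 1.0, 2.0]
standard_density = [1000.0, 2000.0, 4000.0]
slope, intercept = fit_line(standard_ug, standard_density)

# Invented sample band and matching empty-vector band from the same gel.
amount = casein_amount(3500.0, 500.0, slope, intercept)
print(f"estimated casein in lane: {amount:.2f} µg")
```

Converting the amount per lane back to a culture concentration additionally requires the loaded volume and the dilution factor, which are not modelled here.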
Graphical and regression analysis was performed using scientific graphing and data analysis software (SigmaPlot 13.0; Systat Software Inc.).

| Growth of E. coli strains on glucose, xylose and arabinose
Traditionally, E. coli is cultivated on D-glucose in laboratories worldwide. However, most if not all typical E. coli strains are known to be capable of growing on other sugars such as D-xylose and L-arabinose as sole carbon sources. These are predominant sugars of lignocellulose, especially in grasses, and thus do not compete directly with food production. To verify this as a basis for potential applications using lignocellulosic biomass as a sustainable and cost-efficient carbon source, the growth behaviour of various plasmidless and recombinant E. coli strains on different sugars was examined. Prominent E. coli laboratory strains, namely DH5α as a standard laboratory strain for cloning experiments, K12-MG1655 and K12-W3110 as two of the oldest laboratory strains and BL21(DE3) as the typical overexpression strain, as well as the probiotic strain E. coli Nissle 1917, were cultivated in minimal medium containing 10 g/L glucose, xylose or arabinose as sole carbon source. Samples for measuring growth performance and sugar consumption were taken every 3 h (Figure 1). Additionally, the final optical densities after 37 h of cultivation and the time points at which each sugar was completely consumed are listed in Table 2. As expected, the commonly used laboratory strains E. coli DH5α and BL21(DE3) reached their highest cell densities on glucose, which were about 19% and 12% higher, respectively, than their growth on xylose and arabinose. Surprisingly, the growth of the E. coli strains MG1655 and W3110, two more 'wild typical' laboratory strains with fewer genomic modifications, and of the probiotic wild-type strain E. coli Nissle 1917 was 18%, 14% and 19% higher, respectively, on xylose than the averaged growth on glucose and arabinose. In all cases, consumption was fastest for glucose, followed by xylose and arabinose, but there were also differences in the time a specific E. coli strain needed to completely deplete the different sugars. Once more, the laboratory strains DH5α and E. coli BL21(DE3) behaved conspicuously differently from the other three strains and needed a longer time of about 20-25 h to consume all the sugars. In comparison, the other E. coli strains depleted all the sugars 5.5 h faster on average.
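The depletion time points discussed here can be read from a sampled sugar time course as sketched below. The function name, the 0.05 g/L threshold used to call a sugar "completely consumed" and the example series are all assumptions made for illustration; they are not the measured data.

```python
def depletion_time(times_h, residual_g_per_l, threshold=0.05):
    """Return the first sampling time at which the sugar is depleted.

    'Depleted' is taken to mean a residual concentration at or below
    `threshold` (g/L); the threshold value is an assumption.
    Returns None if the sugar never falls to the threshold.
    """
    if len(times_h) != len(residual_g_per_l):
        raise ValueError("time and concentration series differ in length")
    for t, conc in zip(times_h, residual_g_per_l):
        if conc <= threshold:
            return t
    return None


# Invented example series, sampled every 3 h from a 10 g/L start.
times = [0, 3, 6, 9, 12, 15, 18, 21, 24]
glucose = [10.0, 9.5, 8.0, 5.0, 2.0, 0.5, 0.0, 0.0, 0.0]
arabinose = [10.0, 9.8, 9.0, 7.5, 5.5, 3.0, 1.2, 0.3, 0.0]

print("glucose depleted at", depletion_time(times, glucose), "h")
print("arabinose depleted at", depletion_time(times, arabinose), "h")
```

With samples taken every 3 h, a depletion time obtained this way can only be resolved to the nearest sampling interval.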

| Heterologous protein production on glucose, xylose and arabinose using enhanced green fluorescent protein as an example
To demonstrate the ability to produce recombinant proteins based on the different lignocellulosic sugars, the enhanced green fluorescent protein (eGFP) was chosen as a model protein. This well-established reporter protein, derived from the green fluorescent protein originally isolated from the jellyfish Aequorea victoria (Shimomura et al., 1962), is commonly used to provide reliable information about the effectiveness of gene expression in various host organisms. The plasmid pJOE4056 was digested with NdeI and HindIII to obtain the egfp gene for subcloning into the pET22b vector, which was digested with the same enzymes. Subsequently, the resulting pET22b-egfp, containing egfp under the control of a T7 promoter, was digested with XbaI and HindIII for subcloning of egfp, including the ribosomal binding site (RBS) of pET22b, into the pVLT31 shuttle vector. In the resulting plasmid pVLT31-egfp, transcription of egfp is controlled by an IPTG-inducible Ptac promoter for expression in E. coli strains that do not harbour a T7 RNA polymerase gene in their genome. The resulting recombinant strains E. coli DH5α+pVLT31-egfp, E. coli K12-MG1655+pVLT31-egfp, E. coli K12-W3110+pVLT31-egfp, E. coli BL21(DE3)+pET22b-egfp and E. coli Nissle 1917+pVLT31-egfp, and the respective strains containing the corresponding empty vectors, were cultivated in minimal medium containing 10 g/L glucose, xylose or arabinose as sole carbon source. Samples were again taken every 3 h for measurement of cell densities and sugar consumption. Additionally, samples were taken before induction as well as 12 and 24 h after induction to verify the successful expression of egfp by SDS-PAGE and fluorescence measurements (Figure 2).
In general, the growth and sugar consumption curves of the recombinant E. coli strains were quite similar to those of the plasmidless strains in the earlier experiment, which is why the corresponding figure is provided in the Supporting Information section (Figure S1). However, the final OD600 values attained and the time points at which each sugar was completely consumed are listed alongside those of the other cultivation in Table 2. Surprisingly, all recombinant strains reached considerably higher optical densities, while the time points at which all sugars were completely consumed were almost identical to those of the plasmidless strains. The final optical density of strains harbouring the empty vectors was on average 40% higher, and that of strains expressing the egfp gene was even 61% higher. The greatest increase was recorded for E. coli Nissle 1917 expressing egfp, whose growth on arabinose more than doubled in comparison to the wild-type strain. Astonishingly, in this experiment the strain E. coli BL21(DE3) reached slightly higher optical densities on xylose than on glucose, but in general all strains grew very effectively on all three sugars. The successful expression of egfp under both promoters (PT7 in pET22b and Ptac in pVLT31) in all the E. coli strains on each single sugar was verified using SDS-PAGE. As shown by way of example in Figure 2, distinct eGFP bands are already visible in the samples taken 12 h after induction from every strain, independent of the sugar used, indicating a sufficient amount of protein production. Moreover, there were nearly no differences in band intensity for each single strain cultivated on the different sugars. However, the corresponding fluorescence measurements revealed differences between the strains (Figure S2). In particular, the strain E. coli DH5α showed the highest relative fluorescence, which was set to 100%, followed by the other E. coli strains with 70%-90%. E. coli Nissle 1917 showed the lowest relative fluorescence of about 60%, but egfp was successfully expressed on typical sugars from lignocellulose in all cases.

| Production of bovine and human α S1 -casein on single sugars and artificial wheat straw hydrolysate
To realize a potential casein biosynthesis based on lignocellulosic hydrolysates, expression vectors harbouring the respective genes were constructed. For the heterologous production of the αS1-casein variants, the efficient T7 expression system was chosen, with the pET22b expression vector and E. coli BL21(DE3) as host organism, which had also shown highly effective growth on all single sugars in the earlier experiments. The genes for αS1-casein from bovine and human sources were reverse translated from their amino acid sequences (http://www.uniprot.org/uniprot/P02662 and http://www.uniprot.org/uniprot/P47710) and optimized for the codon usage of E. coli. The gene sequences for bovine αS1-casein and human αS1-casein have sizes of 645 and 558 bp, respectively. As additional elements, a His-affinity tag (six consecutive histidine residues) and an enterokinase recognition site were added so as to be present at the N-terminus of the final protein. Both synthetic genes, designated bc for bovine and hc for human αS1-casein, were individually synthesized and provided in pET100/D-TOPO vectors by GeneArt (Regensburg, Germany). Their sequences are given in the Supporting Information section (Figure S3). The received plasmids were digested with NdeI and HindIII to obtain fragments including the coding region and the additional functional sites, which were subsequently ligated into the pET22b expression vector digested with the same enzymes. The resulting recombinant expression plasmids were designated pET22b-bc and pET22b-hc, with the αS1-casein genes under the control of a T7 promoter.
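The reverse-translation step can be illustrated with a minimal sketch that maps each residue to one fixed codon. The codon choices, the stop codon and the DDDDK motif used for the enterokinase site are assumptions for illustration only; the sequences and the codon optimization actually used for the synthetic genes are those referred to in Figure S3 and are not reproduced here.

```python
# One codon per amino acid. The choices below are illustrative assumptions,
# not the codon table used for the synthetic genes in the study.
ILLUSTRATIVE_CODONS = {
    "M": "ATG", "H": "CAT", "D": "GAT", "K": "AAA",
    "R": "CGT", "P": "CCG", "E": "GAA", "L": "CTG",
}
STOP = "TAA"


def reverse_translate(protein, codons=ILLUSTRATIVE_CODONS, stop=STOP):
    """Return a coding sequence for `protein`, followed by a stop codon."""
    try:
        return "".join(codons[aa] for aa in protein) + stop
    except KeyError as err:
        raise ValueError(f"no codon defined for residue {err.args[0]!r}") from None


def translate(dna, codons=ILLUSTRATIVE_CODONS):
    """Translate a sequence built by reverse_translate back to protein."""
    lookup = {codon: aa for aa, codon in codons.items()}
    # Stop before the final triplet, which is the stop codon.
    triplets = [dna[i:i + 3] for i in range(0, len(dna) - 3, 3)]
    return "".join(lookup[t] for t in triplets)


# Hypothetical N-terminal tag: start Met, six His, assumed DDDDK site.
tag = "M" + "H" * 6 + "DDDDK"
dna = reverse_translate(tag)
print(dna, len(dna), "bp")

# The reported gene sizes are whole numbers of codons.
for size_bp in (645, 558):
    print(size_bp, "bp =", size_bp // 3, "codons; remainder", size_bp % 3)
```

A real codon optimization would weight synonymous codons by host usage and avoid unwanted restriction sites, which this one-codon-per-residue mapping does not attempt.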
The recombinant strains E. coli BL21(DE3)+pET22b, E. coli BL21(DE3)+pET22b-bc and E. coli BL21(DE3)+pET22b-hc were cultivated on 10 g/L glucose, xylose and arabinose as sole carbon sources as described before. All the recombinant strains grew successfully on the three single sugars, and the growth curves are comparable with those of the earlier experiments with the BL21(DE3) strain (Table 3); the corresponding graphs are therefore provided in the Supporting Information section (Figure S4).
Specifically, the E. coli BL21(DE3) strain harbouring the empty vector reached nearly identical optical densities as before. Remarkably, in this experiment too, the recombinant strains expressing the target protein, in this case the two αS1-casein variants, grew better on all three sugars and reached on average a 19% higher optical density than the strain containing the empty vector.
In addition, samples for analysing the production of αS1-caseins in the recombinant strains on the three sugars were taken before induction and 6, 12 and 24 h after induction. As shown in Figure 3, the bovine and human αS1-caseins were both successfully produced on glucose, xylose and arabinose, and the bands from the samples on the various sugars at specific time points showed comparable intensities. Both proteins are already visible as dominant bands in SDS-PAGE 6 h after induction, indicating their efficient biosynthesis. Samples taken before induction from all the strains and 24 h after induction from the empty-vector strain served as controls.
To investigate the possible utilization of lignocellulosic hydrolysates by the recombinant E. coli BL21(DE3) strains, we first applied an artificial hydrolysate with a sugar composition typical of wheat straw hydrolysate (6.29 g/L glucose, 3.30 g/L xylose and 0.41 g/L arabinose). Owing to its high annual tonnage, especially in Europe, wheat straw could become an excellent feedstock, made available through hydrolysis, for biotechnological applications.
All recombinant E. coli strains were able to grow to high cell densities on these artificial straw hydrolysates (Figure 4; Figure S5; Table 3). The low amounts of arabinose were always consumed first and, surprisingly, already in parallel with the utilization of glucose as the presumed preferred carbon source. However, the utilization of xylose did not start until glucose was completely consumed. Again, the E. coli strains expressing the αS1-casein variants showed better growth performance and reached an optical density about 13% higher than that of the strain carrying the empty vector.
The successful production of bovine αS1-casein and human αS1-casein proteins was again verified using SDS-PAGE (Figure 5). Samples were taken prior to the induction of gene expression and 6, 12 and 24 h after induction. The dominant bands of both casein variants are visible and had higher intensities than the corresponding bands in the empty-vector samples, indicating the efficient production of casein based on the artificial hydrolysates.
Figures 3 and 5 display the successful production of bovine αS1-casein and human αS1-casein proteins on each single sugar and on artificial wheat straw hydrolysate. To determine the casein concentrations for both variants at each time point, additional SDS-PAGE was carried out with casein standard samples of known concentration (Table 4). In this way, the amounts of the expressed proteins under the different cultivation conditions were calculated densitometrically using the ImageJ software (Schneider et al., 2012). The densities of the bands from the empty vector were subtracted in the calculations as background.
For the production of both αS1-casein variants, there are some characteristics in common. Almost all cultures achieved the highest protein concentration already 12 h after induction of gene expression, with at most marginal further increases within the next 12 h. The achieved amounts of bovine casein were nearly identical using the different single sugars as carbon source, but increased by about 58% when this strain was cultivated on artificial wheat straw hydrolysates. In contrast, the biosynthesis of human casein is quite similar on glucose as single sugar and on the sugar mixture, but the concentrations were about 44% higher than on xylose and arabinose. In general, these data indicate that mixtures of sugars as typical for lignocellulosic hydrolysates are preferable carbon sources for the successful biosynthesis of target proteins, in this case αS1-caseins, in recombinant E. coli strains.
Optical density after 37 h (table row, column assignment not recoverable): 5.5 ± 0.07; 6.8 ± 0.27; 4.6 ± 0.09; 6.8 ± 0.15; 6.5 ± 0.08; 7.6 ± 0.07; 5.3 ± 0.1; 7.7 ± 0.06; 6.4 ± 0.12; 7.5 ± 0.08; 6.4 ± 0.08

| Application of real wheat straw hydrolysates for the production of human α S1 -casein
As the final proof that lignocellulosic hydrolysates serve as an efficient and sustainable carbon source for the production of α S1 -casein, a bioreactor cultivation of E. coli BL21(DE3) expressing the gene for human α S1 -casein was carried out using real wheat straw hydrolysate obtained by steam explosion followed by enzymatic hydrolysis (Figure 6). The recombinant strain exhibited exponential growth after a short adaptation phase in the lignocellulose hydrolysate-based medium, reaching maximum specific growth rates of up to 0.87 1/h after 6 h. The IPTG-based induction of heterologous gene expression was performed after 5 h of cultivation. About 2 h post induction, that is, after 7 h of cultivation, a sharp decrease in specific growth rate was observed, which correlates with the onset of protein formation. Casein was detected in the insoluble protein aggregate fraction after cell disruption, rising to a maximum of 1.45 g/L after 9 h. With the depletion of glucose after approximately 10 h, xylose metabolization started. Cellular dry weight increased to a maximum of 4.8 g/L, while the casein concentration decreased to approximately 0.7 g/L after complete consumption of all metabolizable sugars in the medium.
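Specific growth rates such as those reported above are commonly derived from successive biomass measurements as μ = ln(X2/X1)/(t2 − t1). The following minimal sketch shows that calculation; the function name and the biomass readings are hypothetical and are not data from this study.

```python
import math


def specific_growth_rates(times_h, biomass_g_per_l):
    """Return (interval midpoint, mu) pairs with mu = ln(X2/X1) / (t2 - t1)."""
    rates = []
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        mu = math.log(biomass_g_per_l[i] / biomass_g_per_l[i - 1]) / dt
        rates.append(((times_h[i] + times_h[i - 1]) / 2, mu))
    return rates


# Hypothetical cell dry weight readings (g/L), chosen only for illustration.
times_h = [4.0, 5.0, 6.0, 7.0, 8.0]
biomass = [0.20, 0.45, 1.05, 1.80, 2.30]

for t_mid, mu in specific_growth_rates(times_h, biomass):
    print(f"t = {t_mid:.1f} h: mu = {mu:.2f} 1/h")
```

A drop in μ between consecutive intervals, as seen after induction in the text, shows up directly as a smaller value for the later interval.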
FIGURE 3 Sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) of bovine and human α S1 -casein production in different recombinant Escherichia coli strains. The production of recombinant bovine α S1 -casein (a) and human α S1 -casein (b) expressed in E. coli BL21(DE3) from the cultivation in Wilms-KPi medium containing 10 g/L of different monosaccharides. Samples of bovine α S1 -casein and human α S1 -casein are shown from different time points: before induction and 6, 12 and 24 h after induction. Samples of empty vector 24 h after induction are shown as a control
FIGURE 4 Growth performance of α S1 -casein expressing recombinant Escherichia coli BL21(DE3) strains on artificial wheat straw hydrolysate. The strains were cultivated in Wilms-KPi medium containing artificial wheat straw hydrolysate with 10 g/L as total amount of sugars (6.29 g/L glucose, 3.30 g/L xylose and 0.41 g/L arabinose). The results for E. coli BL21(DE3)+pET22b are shown as an example; other data are deposited as Supporting Information. Growth curves are shown as filled diamonds. The consumption of glucose (empty squares), xylose (empty circles) and arabinose (empty triangles) is presented. The symbols indicate the averages of the results of triplicate measurements. Error bars represent the standard deviations
FIGURE 5 Sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) of recombinant bovine α S1 -casein and human α S1 -casein expressed in recombinant Escherichia coli BL21(DE3) strains. The strains were cultivated in Wilms-KPi medium containing artificial wheat straw hydrolysate with a total sugar amount of 10 g/L and a composition of 6.29 g/L glucose, 3.30 g/L xylose and 0.41 g/L arabinose. Samples of empty vector, bovine α S1 -casein and human α S1 -casein are shown from different time points: before induction and 6, 12 and 24 h after induction

| DISCUSSION
Its broad metabolic versatility enables E. coli to grow on many different sugars, including various hexoses and pentoses. These sugars constitute the main components of lignocellulosic biomass, a sustainable carbon source available worldwide in the form of wood, food and agricultural residues (Anwar et al., 2014; Jørgensen et al., 2007; Lange, 2007; Naik et al., 2010; Van Dyk et al., 2013). Other bacteria typically used in biotechnological applications are often incapable of utilizing pentoses and require prior genomic modifications to grow on lignocellulosic sugars (Chen et al., 2013; Kawaguchi et al., 2008; Le Meur et al., 2012; Wang et al., 2019), especially if the target product is difficult to produce in E. coli, e.g. biosurfactants (Bator et al., 2020; Cabrera-Valladares et al., 2006; Wittgens & Rosenau, 2018). On the other hand, some more 'exotic' organisms, e.g. Cellvibrio japonicus, are even able to metabolize complex lignocellulosic polymers, but they are not sufficiently developed and thus represent less convenient hosts for biotechnological applications (Gardner, 2016;. In this study, different E. coli strains grew effectively, with only minor differences, on glucose, xylose and arabinose as single sugars, which are the main components of lignocellulosic hydrolysates from grasses. Surprisingly, the traditional E. coli strains K12-MG1655 and W3110 and the probiotic Nissle 1917 grew better on xylose than on glucose and arabinose and consumed all sugars faster than the currently common laboratory strains DH5α and BL21(DE3). Presumably, these modern strains acquired undocumented modifications, probably as side effects of an evolutionary process over generations of laboratory handling, which led to the different growth behaviour.
Adaptive laboratory evolution is an efficient tool to force organisms and strains to improve particular desired properties like pH or solvent tolerance, acceptance of exceptional carbon sources and altered growth rates (Du et al., 2020;LaCroix et al., 2015;Meijnen et al., 2008;Mohamed et al., 2019). Our observations for the laboratory strains could be explained by these processes, since glucose is the most common carbon source used in biotechnology (Wendisch et al., 2016) and slower bacterial growth can be beneficial for the effective biosynthesis of recombinant proteins (Papaneophytou & Kontopidis, 2014;Yee & Blanch, 1992).
The utilization of different sugars is hierarchically organized in E. coli, which is visible during cultivation on sugar mixtures and real lignocellulosic hydrolysates. Owing to the native regulatory mechanisms, glucose is the preferred carbon source, and the utilization of xylose did not begin until glucose was completely consumed. The reason for this is the transcriptional regulation of the xylAB and araBAD operons, which are responsible for the metabolization of xylose and arabinose, respectively, by transcription factors of the AraC/XylS family and the cAMP-CRP interaction (Gallegos et al., 1997). Surprisingly, the small amounts of arabinose already decreased in parallel with the utilization of glucose. Possibly, arabinose is not metabolized simultaneously with glucose but is merely taken up by the rapidly increasing number of cells in this growth phase. E. coli possesses specific transporters for the uptake of xylose and arabinose encoded by xylE and araE (xylose/arabinose:H+ symporters) as well as xylFGH and araFGH (xylose/arabinose ABC transporters), whose expression is not repressed by AraC/XylS, in contrast to the operons for metabolizing these sugars (Gallegos et al., 1997).
The highly effective production of recombinant proteins based on lignocellulosic sugars was initially demonstrated using the fluorescence reporter eGFP as a prominent model. Astonishingly, the recombinant E. coli strains expressing eGFP reached significantly higher cell densities during the cultivations than those harbouring the corresponding empty vector. Usually, the selection pressure by the added antibiotics and the constitutive expression of antibiotic resistance genes in response as well as the general plasmid maintenance induce a stress response, especially when an additional target protein is highly expressed (Hoffmann & Rinas, 2004; Hoffmann et al., 2002). Such improved growth performance of recombinant strains has been described frequently, but the reasons remain speculative (Yee & Blanch, 1992). However, the response to the high expression of recombinant protein includes an extensive reprogramming of gene expression patterns and downregulation of several housekeeping genes (Hoffmann et al., 2002), which probably frees energy resources for other cellular functions.
TABLE 4 Concentration of bovine and human α S1 -casein proteins in the medium at different time points during cultivation on glucose, xylose and arabinose as single sugars as well as artificial wheat straw hydrolysate
Casein formation is indicated by grey bars and specific growth rates (filled circles) along with vertical lines indicate changes in growth behaviour and change in sugar consumption respectively
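The hierarchical sugar utilization discussed above (glucose first, xylose only after glucose depletion) can be illustrated with a toy simulation. This is a sketch under strong assumptions and not a model fitted to the data of this study: Monod kinetics, a hard switch between sugars as a stand-in for catabolite repression, and all kinetic parameters are hypothetical. Only the sugar concentrations are taken from the artificial hydrolysate composition, and arabinose is omitted for simplicity.

```python
def simulate_diauxie(hours=20.0, dt=0.01):
    """Euler simulation of sequential glucose/xylose consumption (toy model)."""
    # All kinetic parameters are illustrative assumptions.
    mu_max = {"glucose": 0.8, "xylose": 0.4}   # maximum growth rates, 1/h
    k_s = 0.05                                  # Monod constant, g/L
    yield_xs = 0.45                             # g biomass per g sugar
    sugar = {"glucose": 6.29, "xylose": 3.30}   # g/L, artificial hydrolysate
    biomass = 0.05                              # g/L inoculum
    xylose_start = None
    t = 0.0
    while t < hours:
        # Hard switch: xylose is used only once glucose is nearly exhausted.
        active = "glucose" if sugar["glucose"] > 0.01 else "xylose"
        if active == "xylose" and xylose_start is None:
            xylose_start = t
        s = sugar[active]
        mu = mu_max[active] * s / (k_s + s)
        # Sugar consumed in this step, capped at what is still available.
        consumed = min(s, mu * biomass * dt / yield_xs)
        sugar[active] = s - consumed
        biomass += consumed * yield_xs
        t += dt
    return biomass, sugar, xylose_start


final_biomass, final_sugar, xylose_start = simulate_diauxie()
print(f"xylose use starts at about {xylose_start:.1f} h")
print(f"final biomass: {final_biomass:.2f} g/L")
```

The hard switch is the simplest way to reproduce a diauxic pattern; it does not capture the parallel decrease of arabinose observed in the cultivations.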
This improved growth behaviour was further observed during the heterologous expression of bovine and human α S1 -casein in E. coli BL21(DE3). In addition to its highly efficient protein biosynthesis based on the T7 expression system, this host was chosen because it lacks the Lon and the OmpT proteases (Studier & Moffatt, 1986). Furthermore, the successful production of other recombinant milk proteins has been reported for this strain (García-Montoya et al., 2013; Goda et al., 2000; Hansson et al., 1993; Ponniah et al., 2010; Simons et al., 1993; Wang et al., 1989). However, this strain still possesses further proteases such as the ATP-dependent Clp proteases, which are known to degrade a variety of proteins through multiple site cleavage (Thompson et al., 1994). The α S1 -caseins are known to be inherently sensitive to proteases in general, since they have little secondary structure and lack disulphide bonds (Kumosinski et al., 1991), which explains the reduction in casein concentration after the initial peak during the fermentation process. With a pKa of 4.6, α S1 -casein is highly soluble in alkaline or strongly acidic environments, but less soluble under the applied cultivation conditions (Post et al., 2012), and its high proline content increases the hydrophobicity of casein (Gordon et al., 1950). Therefore, the synthesized casein mainly ended up in inclusion bodies, which have also been reported for other recombinant mammalian proteins such as insulin and interleukin in E. coli (Chrunyk et al., 1993; Williams et al., 1982). Furthermore, in its natural environment, α S1 -casein folding is typically aided by other structural milk proteins, inorganic ions and hydrophobic substances, which results in a micellar structure (Dalgleish, 1998). Therefore, a deviation from its natural folding pattern in a high-efficiency heterologous expression system was expected.
Despite the reported masses of 24.53 kDa for bovine and 21.67 kDa for human casein, the corresponding bands were detected at approximately 35 kDa on the SDS gel. This effect is caused by the additional N-terminal elements, which increase the masses by 3.90 kDa, and by local high negative charges, which lead to an expanded structure in the presence of SDS (Creamer & Richardson, 1984).
The factors affecting the capacity and cost associated with the production of a recombinant protein using E. coli were investigated in recent studies (Cardoso et al., 2020; Ferreira et al., 2018). The authors report prices between $35 and $350 per kg of protein, with the nitrogen source being the dominant part of the total cost breakdown. Sugar production costs for hydrolysates obtained from steam-exploded straw can be estimated at $0.43 per kg of sugar (Baral & Shah, 2017). It should be considered, however, that these studies assume optimized steps of the process chain, especially downstream processing, which on its own has a major influence on the final product price. Furthermore, it should be noted that for human casein or other products that are difficult to obtain, competitive pricing of heterologous production is less of a focus, as biotechnological production is the only reasonable way of obtaining them.
In conclusion, this approach successfully demonstrated how valuable (food) proteins can be synthesized from bulk residual biomass in cost-effective and economically efficient bioprocesses to achieve a truly sustainable bioeconomy in the future.