Augmented generation of protein fragments during wakefulness as the molecular cause of sleep: a hypothesis



Despite extensive understanding of sleep regulation, the molecular-level cause and function of sleep are unknown. I suggest that they originate in individual neurons and stem from increased production of protein fragments during wakefulness. These fragments are transient parts of protein complexes in which the fragments were generated. Neuronal Ca2+ fluxes are higher during wakefulness than during sleep. Subunits of transmembrane channels and other proteins are cleaved by Ca2+-activated calpains and by other nonprocessive proteases, including caspases and secretases. In the proposed concept, termed the fragment generation (FG) hypothesis, sleep is a state during which the production of fragments is decreased (owing to lower Ca2+ transients) while fragment-destroying pathways are upregulated. These changes facilitate the elimination of fragments and the remodeling of protein complexes in which the fragments resided. The FG hypothesis posits that a proteolytic cleavage, which produces two fragments, can have both deleterious effects and fitness-increasing functions. This (previously not considered) dichotomy can explain both the conservation of cleavage sites in proteins and the evolutionary persistence of sleep, because sleep would counteract deleterious aspects of protein fragments. The FG hypothesis leads to new explanations of sleep phenomena, including a longer sleep after sleep deprivation. Studies in the 1970s showed that ethanol-induced sleep in mice can be strikingly prolonged by intracerebroventricular injections of either Ca2+ alone or Ca2+ and its ionophore (Erickson et al., Science 1978;199:1219–1221; Harris, Pharmacol Biochem Behav 1979;10:527–534; Erickson et al., Pharmacol Biochem Behav 1980;12:651–656). These results, which were never interpreted in connection to protein fragments or the function of sleep, may be accounted for by the FG hypothesis about molecular causation of sleep.


The modern era of sleep research began with the 1949 discovery of the reticular activating system in the brainstem that controls wakefulness and sleep.1 Over the last six decades, great strides have been made in studies of sleep regulation. However, the molecular-level function of sleep remains unknown. This lack of understanding is particularly remarkable vis-à-vis the broad knowledge about neuronal circuits that control sleep in animals, including humans. Sleep appears to be universal among vertebrates, from fishes to mammals. Sleep-like states are also present in invertebrates such as the fly Drosophila melanogaster (∼10^5 neurons) and even the nematode Caenorhabditis elegans (∼300 neurons). Although it is likely that all vertebrates sleep and that sleep in Drosophila and other invertebrates is fundamentally similar to mammalian sleep, these issues remain to be settled definitively. Neuronal pathways that regulate sleep are linked to circadian circuits that control daily rhythms. However, circadian aspects of sleep do not define it entirely, because sleep has, in addition, specific homeostatic properties, described below. The evident complexity of sleep, including its distinct stages in mammals, such as rapid eye movement (REM) and nonrapid eye movement (NREM) sleep, suggests that sleep may have several causes and functions. A new hypothesis described below considers, in the spirit of Occam's razor, one molecular function of sleep.

In mammals, sleep can be initiated in small networks of neurons in the cortex and other regions of the brain.2–6 Despite profound anatomical differences and size disparities between, for example, the human and Drosophila brains, the molecular designs of neurons are highly similar in these divergent organisms. Given this experimental background, I suggest that the fundamental cause and function of sleep may reside in individual neurons, as distinguished from neuronal networks. There is no strong evidence either for or against this conjecture, in part because sleep has been defined, thus far, either behaviorally or at the level of neuronal assemblies. The probability of awake-to-sleep transition in a neuronal network (including, possibly, single neurons of that network) is increased by the preceding electrochemical activity of these neurons. In this view, sleep that is evident behaviorally is an emergent state that coalesces from local sleep foci that might be, initially, single neurons.

Sleep is either counteracted or facilitated by specific neuronal circuits that employ wakefulness-enhancing neurotransmitters such as orexins and somnogenic (sleep-promoting) compounds such as extracellular adenosine. Viewing sleep as a process initiated in small neuronal networks and/or individual neurons suggests that sleep-regulating circuits have been sculpted, on evolutionary timescales, by selection pressures to mitigate cognition-perturbing cellular stress, i.e., the molecular cause of sleep. Discovery of this cause is expected to account, among other things, for sleep's homeostatic character. Animals deprived of sleep exhibit impaired cognition and an increased propensity to sleep (sleep drive), and they respond to the incurred sleep “debt” through a longer and deeper subsequent sleep. For example, humans not only sleep longer after sleep deprivation but also spend more time in NREM sleep, particularly in a part of it called deep sleep. In comparison with other stages of sleep and with wakefulness, deep sleep is characterized by slow-wave activity in the electroencephalogram (EEG), by the lowest mean frequency of action potentials, and by the slowest brain metabolism. For reviews of sleep, see Refs. 3, 4, 7–40.

Previously suggested functions of sleep include the null hypothesis,41, 42 i.e., the conjecture that sleep is a peculiar form of indolence that has little adaptive value. Cogent arguments against this possibility were discussed previously.18, 43, 44 Other ideas about the functions of sleep are of two kinds: (i) molecular-level and metabolic conjectures; (ii) hypotheses with key assumptions at higher than molecular levels, for example, specific alterations in the strength of synaptic connections or other modifications of neuronal circuits.

One molecular-level hypothesis about the function of sleep is that during wakefulness the energy charge (a measure of ATP concentration) in the brain becomes sufficiently low to require a functionally distinct period, sleep, that restores the energy charge.45 This conjecture is extant in the field, in the absence of definitive positive or negative evidence. For example, glucose metabolism in the brain cortex (but not, for example, in the hippocampus or hypothalamus) is decreased during NREM sleep.46 On the other hand, oxidative phosphorylation is upregulated during sleep deprivation, suggesting an increase in ATP production.47 In addition, there is no convincing evidence that neuronal energy gains during sleep are significant enough to be functionally relevant (for a recent discussion, see Refs. 48–51).

It was also suggested that extracellular adenosine is a natural somnogenic compound that contributes to the initiation and maintenance of sleep.11, 21, 45, 52, 53 Extracellular adenosine is derived in part from intracellular adenosine via equilibrative adenosine transporters. Adenosine is also secreted through regulated vesicular exocytosis.54 Another source of extracellular adenosine is ATP, which can be released from cells either by vesicular exocytosis or through specific transmembrane channels, followed by conversion of ATP to adenosine in the extracellular space.54–59 Adenosine facilitates sleep in part through binding to the neuronal G protein-coupled adenosine receptors. Recent evidence strongly suggests a role for extracellular adenosine as an endogenous somnogen.53, 57, 60–64

ATP can act as a somnogen not only through its conversion to adenosine but also through the binding of extracellular ATP to P2-type transmembrane receptors, a step that leads to the release, largely from microglial cells (brain phagocytes), of cytokines that include interleukin-1β (IL1β) and tumor necrosis factor (TNF)-α.56, 65, 66 As described below, these and other inflammatory cytokines can act as somnogens, indicating a major but incompletely understood connection between the immune system and sleep.4, 16, 29, 31, 36, 67–79 Another potent endogenous somnogen is the PGD2 prostaglandin, which preferentially induces NREM sleep.11, 28, 80, 81 The somnogenic activity of PGD2 is mediated at least in part by the ability of brain-produced PGD2 to increase the concentration of extracellular adenosine.11, 28, 81

Small compounds such as adenosine, ATP, and PGD2, proteins such as inflammatory cytokines, and several other endogenous effectors that act as somnogens are a part of the growing understanding of specific circuits that initiate, maintain, and regulate sleep. However, none of this understanding addresses, by itself, the underlying molecular cause and molecular-level function of sleep, which remain unknown.

Nonmolecular conjectures about the functions of sleep have in common the concept of sleep as a means of modifying synaptic connections in ways that optimize their activities during wakefulness. One example of these hypotheses is the suggestion that the function of REM sleep is to minimize deleterious modes of interaction among neurons in the cerebral cortex.82, 83 It was also proposed that sleep is influenced by prior synaptic use, and functions to alter synaptic connections, in part through the local production of somnogenic compounds that include cytokines.84, 85 Another conjecture is that sleep acts to downscale the strengths of synaptic connections, so that increases in synaptic strength during wakefulness stay below mechanistically infeasible limits.8, 26, 86–89 This hypothesis is related to the earlier demonstration that the firing by neurons and the strengths of their synapses are homeostatically increased or decreased to maintain firing rates within certain boundaries.90 The synaptic downscaling hypothesis posits, in effect, that the previously characterized recalibration of synaptic strengths90 does not suffice during wakefulness, and that sleep is required for a further downscaling.26 Studies on a role of sleep in memory and the evidence for such a role91–95 are also based on nonmolecular models, in part because we still do not know what exactly a long-term memory trace is. In sum, even if these plausible nonmolecular conjectures about sleep are eventually found to be correct in their domains, they would not, by themselves, identify the fundamental molecular problem that the function of sleep may have to address. 
“May” in the preceding sentence refers to the still extant possibility that the cause of sleep is not reducible to molecular perturbations in individual neurons and may be, instead, an evolved response of neuronal assemblies to “high-level” parameters of such networks, as distinguished from changes in individual neurons that result from molecular stress during wakefulness.

In contrast, the (unproven) premise of the present paper is that the fundamental cause of sleep originates in individual neurons. Described below are a molecular-level hypothesis about the causation and function of sleep, specific predictions of this hypothesis, and new explanations of the properties of sleep, including its evolutionary persistence and homeostatic character.


In considering causes of sleep, it might be helpful to view it as a metacellular (organismal) adaptation to a specific cellular stress during wakefulness. Responses to sleep deprivation suggest that sleep functions to counteract a cognition-perturbing molecular process that occurs in neurons, “builds up” during wakefulness, causes the initiation of sleep, can prolong its duration and intensity, and is downregulated during sleep. If so, and if sleep is an adaptive response to a cellular stress that impairs brain circuits, what is the molecular nature of a key perturbation that renders sleep necessary? Furthermore, how does sleep deal with this perturbation? There is also an evolutionary problem: given the maladaptive dimension of sleep, why was it impossible to eliminate or bypass the (unknown) molecular cause of sleep in the course of evolution? Deleterious aspects of sleep include a prolonged and fitness-reducing neglect of mating opportunities, and also the combination of immobility, eye closure, and higher arousal thresholds that results in a decreased vigilance. Save for a minority of animals such as adult top predators, a diminished vigilance of a sleeping animal usually signifies its increased vulnerability to predation. Because suppression of consciousness during sleep contributes to decreased vigilance, this suppression, too, is maladaptive at least behaviorally. Given these downsides of sleep, why not, for example, a quiet wakefulness, instead of sleep?

I suggest that the molecular cause and function of sleep originate in individual neurons and stem from increased production of protein fragments during wakefulness. These fragments, a multitude of them, are produced in all cells, including neurons, and are transient parts of protein complexes that initially contain full-length precursors of fragments (Fig. 1). Specific cleavage sites in proteins that give rise to fragments are maintained by positive selection during evolution, because many fragments have adaptive (fitness-increasing) functions. At the same time, at least some fragments can also perturb brain circuits. As described below, this (previously not considered) dichotomy between adaptive and maladaptive aspects of protein fragments can account for the evolutionary persistence of sleep.
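The buildup-and-clearance logic of this proposal can be illustrated with a deliberately minimal kinetic sketch. It assumes (this is a simplification, not a part of the hypothesis as stated) that the aggregate level of fragments, F, obeys dF/dt = p − d·F, in which the production rate p and the first-order degradation constant d each take one value during wakefulness and another during sleep. All numerical values, the threshold at which sleep ends, and the function names below are hypothetical and were chosen only for illustration.

```python
import math

# Hypothetical, illustrative parameters (arbitrary units; not measured values).
WAKE = {"production": 1.0, "degradation": 0.05}   # higher Ca2+ transients, slower clearance
SLEEP = {"production": 0.2, "degradation": 0.4}   # lower Ca2+ transients, faster clearance
THRESHOLD = 1.0  # assumed fragment level at which sleep ends


def fragment_level(f0, hours, production, degradation):
    """Closed-form solution of dF/dt = production - degradation * F."""
    steady = production / degradation
    return steady + (f0 - steady) * math.exp(-degradation * hours)


def sleep_needed(f_start, threshold=THRESHOLD):
    """Hours of sleep required for F to decay from f_start to threshold."""
    steady = SLEEP["production"] / SLEEP["degradation"]
    if threshold <= steady:
        raise ValueError("threshold must exceed the sleep steady-state level")
    if f_start <= threshold:
        return 0.0
    ratio = (f_start - steady) / (threshold - steady)
    return math.log(ratio) / SLEEP["degradation"]


if __name__ == "__main__":
    # Compare a normal waking period with a prolonged one (sleep deprivation).
    for hours_awake in (16, 40):
        f_end = fragment_level(THRESHOLD, hours_awake, **WAKE)
        print(f"{hours_awake} h awake: F = {f_end:.2f}, "
              f"sleep needed = {sleep_needed(f_end):.2f} h")
```

Under these assumed values, the longer waking period ends at a higher level of fragments and requires a longer sleep to return to the threshold, which is the qualitative behavior expected of a homeostatic process. The specific numbers carry no biological meaning.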

Figure 1.

Generation, targeting and degradation of protein fragments, and remodeling of cleaved protein complexes. This diagram illustrates a multistage process that is expected to occur in a great variety of structural contexts and is a part of reactions encompassed by the fragment generation (FG) hypothesis. The process begins with a proteolytic cleavage (lightning arrow) of an oligomeric (in this example, heterodimeric) protein by a nonprocessive protease such as, for example, calpain or caspase. The process continues with the subunit-selective degradation of a resulting C-terminal fragment either by the Arg/N-end rule pathway (in this example) or by another proteolytic pathway, and ends with reconstitution of the initial heterodimer. Such processes can involve both “soluble” and transmembrane proteins (Figs. 3–6). The arrangement of two subunits in the heterodimer is (arbitrarily) antiparallel, with N-termini and C-termini denoted by “N” and “C,” respectively. The rate constants k1–k6 illustrate the possibility of independent regulation of specific reaction steps. In this example, the N-terminal residue of the C-terminal fragment (it remains associated with the other subunit of heterodimer) is Gln (Q), a tertiary destabilizing residue [Fig. 2(A)]. The C-terminal fragment is sequentially modified by the Ntaq1 NtQ-amidase and the Ate1 R-transferase (specific enzymes of the Arg/N-end rule pathway) [Fig. 2(A)], followed by polyubiquitylation of the fragment by N-recognins (ubiquitin ligases of the Arg/N-end rule pathway) such as the UBR1 E3 (only the E3 component of the holoenzyme ligase is shown) and the proteasome-mediated degradation of the targeted C-terminal fragment. For illustrative purposes, each of two subunits is shown to consist of two domains. The N-terminal fragment of the cleaved subunit may also be targeted for degradation, either by the Ac/N-end rule pathway (via an Ac/N-degron of the fragment) in this example [Fig. 2(B)], or by another proteolytic pathway. 
See Fig. 2 and the main text for additional details.

Figure 2.

The mammalian N-end rule pathway.122, 124, 125, 126 N-terminal residues are indicated by single-letter abbreviations for amino acids. A yellow oval denotes the rest of a protein substrate. E3 ubiquitin ligases of the N-end rule pathway are called N-recognins. A: The Arg/N-end rule pathway. “Primary,” “secondary,” and “tertiary” denote mechanistically distinct subsets of destabilizing N-terminal residues. C* denotes oxidized N-terminal Cys, either Cys-sulfinate or Cys-sulfonate, produced in vivo by reactions that require nitric oxide (NO) and oxygen. Oxidized N-terminal Cys is arginylated by ATE1-encoded isoforms of arginyl-tRNA-protein transferase (R-transferase), which also arginylates N-terminal Asp (D) and Glu (E). N-terminal Asn (N) and Gln (Q) are deamidated by the NTAN1-encoded NtN-amidase and the NTAQ1-encoded NtQ-amidase, respectively. In addition to the binding sites that recognize primary destabilizing N-terminal residues, the UBR1, UBR2, UBR4 and UBR5 (EDD) N-recognins contain binding sites for substrates (denoted by a larger oval) that lack N-degrons and have internal (non-N-terminal) degradation signals.122, 126 Polyubiquitylated Arg/N-end rule substrates are degraded to short peptides by the 26S proteasome. 
Hemin (Fe3+-heme) binds to R-transferase, inhibits its arginylation activity and accelerates its in vivo degradation.122, 126, 127 Hemin also binds to UBR-type N-recognins.127 Regulated degradation of specific proteins by the Arg/N-end rule pathway mediates the sensing of heme, NO, and oxygen; the elimination of misfolded proteins; the regulation of DNA repair; the fidelity of chromosome cohesion/segregation; the signaling by G proteins; the control of peptide import; the regulation of apoptosis, meiosis, viral infections, fat metabolism, cell migration, actin filaments, cardiovascular development, spermatogenesis, neurogenesis and memory; the functioning of adult organs, including the brain, muscle, testis and pancreas; and many functions in plants (Refs. 122–124, 126, 128, 129, 130, 131, 132, and references therein). B: The Ac/N-end rule pathway. Although it is clear that this pathway is present in all or most eukaryotes,122 it has been characterized, thus far, only in yeast.125 This diagram illustrates the mammalian Ac/N-end rule pathway through an extrapolation from its S. cerevisiae version. Red arrow on the left indicates the removal of N-terminal Met by Met-aminopeptidases (MetAPs). This Met residue is retained if a residue at position 2 is nonpermissive (too large) for MetAPs. If the retained N-terminal Met or N-terminal Ala, Val, Ser, Thr, and Cys are followed by acetylation-permissive residues, the above N-terminal residues are Nt-acetylated by ribosome-associated Nt-acetylases.133 The resulting N-degrons are called Ac/N-degrons.125 The term “secondary” refers to the necessity of modification (Nt-acetylation) of a destabilizing N-terminal residue before a protein can be recognized by a cognate N-recognin. Although the second-position Gly or Pro residues can be made N-terminal by MetAPs, few proteins with N-terminal Gly or Pro are Nt-acetylated.133

Figure 3.

Calpain-generated C-terminal fragments of mammalian proteins that are either identified or predicted substrates of the Arg/N-end rule pathway. Each entry cites, on the left, a C-terminal (Ct) fragment and its N-terminal (Nt) residue (in red, using three-letter abbreviations for amino acids), followed by a brief description of the full-length (uncleaved) precursor protein. The right side of each entry shows the cleavage site of a full-length protein, using single-letter abbreviations for amino acids. An enlarged residue name, in red (preceded by an arrowhead denoting the cleavage site), indicates the P1′ residue, that is, the residue that becomes Nt upon the cleavage. Unless stated otherwise, the residue numbers are of human proteins. Two indicated residue numbers are the number of the first shown residue of a full-length protein and the number of its last residue, respectively. Most proteins on the list are present at least in neurons. These 34 proteins encompass the bulk of the previously characterized calpain substrates whose cleavage sites have been mapped and whose Ct fragments bear Nt residues that can be recognized by the Arg/N-end rule pathway [Fig. 2(A)]. Not shown are ∼20 calpain-generated, previously mapped Ct fragments bearing Nt residues that are not recognized by the Arg/N-end rule pathway. The Nt residues of this class, including Ala, Ser, and Thr, can be recognized by the Ac/N-end rule pathway [Fig. 2(B)] if these residues can be Nt-acetylated after having become Nt through a calpain-mediated cleavage. Whether these post-translationally generated N-terminal residues can be efficaciously Nt-acetylated in vivo remains to be determined. In addition, there are ∼40 other identified mammalian calpain substrates in which the exact locations of cleavage sites are unknown. 
Two calpain-generated Ct fragments, Asp-BCLXL (#16) and Arg-BID (#17), have been shown to be short-lived substrates of the Arg/N-end rule pathway.123 Other Ct fragments on this list are predicted Arg/N-end rule substrates. The recent finding that 10 out of 10 Ct fragments (generated by caspases or calpains) that were predicted to be Arg/N-end rule substrates were actually found to be such123 suggests that most Ct fragments on the present list are also degraded by the Arg/N-end rule pathway. Calpain-generated fragments: #1. Tyr-mGluR1α is the Ct fragment of the mGluR1α subunit of the transmembrane metabotropic glutamate receptor.134 Receptors containing the calpain-truncated mGluR1α subunit could elevate cytosolic Ca2+ but could not activate PI3K-Akt signaling pathways, in contrast to uncleaved receptors.134, 135 #2. Leu-NR2A is the Ct-fragment of the NR2A subunit of the transmembrane ionotropic glutamate receptor (NMDAR).136 The NR2B subunit of NMDAR can also be cleaved by calpains.137 Ct fragments of NR2A and NR2B contain domains required for the association of these subunits with synaptic proteins. NMDARs lacking the Ct region of NR2A could function as glutamate-gated Ca2+ channels but the intracellular traffic of cleaved receptors and their electrophysiological properties were altered.138 #3. Lys-ATP2B2 is the Ct fragment of the transmembrane ATP2B2 plasma membrane Ca2+ pump (PMCA) that ejects Ca2+ from cells. This pump is activated either by the binding of Ca2+/calmodulin or by the calpain-mediated truncation of ATP2B2 that generates the Lys-ATP2B2 fragment and thereby activates the pump.139 #4. Gln-RYR1 is the Ct fragment of the RYR1 ryanodine receptor, a Ca2+ channel in the ER140 that mediates the efflux of Ca2+ from the ER into the cytosol. Calpain-mediated cleavage of RYR1 increases Ca2+ efflux.141 #5. 
Gln-EGFR is one of the calpain-generated Ct fragments of the transmembrane epidermal growth factor (EGF) receptor protein kinase.142 Remarkably, all seven calpain cleavage sites in the cytosol-exposed domain of the 170-kDa EGFR contain P1′ residues (which become Nt upon a cleavage)142 that are destabilizing in the Arg/N-end rule [Fig. 2(A)]. #6. Asn-Cav1.1 is the Ct fragment of the voltage-gated transmembrane Ca2+ channel. This (apparently) calpain-generated fragment is noncovalently associated with the rest of the channel and can inhibit its activity. Upon dissociation from the channel, the Asn-Cav1.1 fragment migrates to the nucleus and functions as a transcriptional regulator.97, 143, 144, 145 #7. Arg-GlyT1A is the Ct fragment of the transmembrane GlyT1A glycine transporter.146 Another Gly transporter, GlyT1B, is also cleaved by calpains, yielding the Arg-GlyT1B fragment.146 These Ct fragments are still active as transporters but are impaired in their ability to remove Gly (an inhibitory neurotransmitter) from synaptic clefts.146 #8. Leu-RAD21 is the Ct-fragment of the SCC1/RAD21 subunit of the chromosome-associated cohesin complex.147 Calpain-mediated generation of Leu-RAD21 contributes to the control of chromosome cohesion/segregation, together with processes that include the separase-mediated cleavage of the same RAD21 subunit147, 148, 149, 150 [see also Fig. 6(F)]. #9. Lys-cortactin is the Ct fragment of cortactin, an actin-binding protein that regulates actin polymerization.151 #10. Leu-vimentin is the Ct fragment of vimentin, a component of intermediate filaments.152 #11. Arg-dystrophin is the Ct fragment of a major cytoskeletal protein in the skeletal muscle.153 #12. Gln-talin is the Ct fragment of talin, an adaptor protein that interacts with the integrin family of cell adhesion transmembrane proteins.139, 154, 155 #13. Leu-NF2 is the Ct fragment of NF2 (merlin), a tumor suppressor and cytoskeletal protein. 
Loss-of-function NF2 mutants result in autosomal-dominant neurofibromatosis, a predisposition to specific kinds of brain tumors156 [see also Fig. 6(E)]. #14. Leu-troponin T2 is the Ct fragment of the cardiac troponin T that is produced by calpain-1 from the troponin-containing cardiac myofibril complex.157 #15. Glu-BAK is the Ct fragment of the proapoptotic regulator BAK. Glu-BAK is generated by calpain-1 in vitro and may be formed in vivo as well.158 #16. Asp-BCLXL is the Ct fragment of the BCLXL antiapoptotic protein. In contrast to its full-length precursor, the Asp-BCLXL fragment is proapoptotic, and has been shown to be a short-lived substrate of the Arg/N-end rule pathway.156 #17. Arg-BID is the Ct fragment of the proapoptotic BID regulator. The Arg-BID fragment is also proapoptotic, and in addition a short-lived Arg/N-end rule substrate.156 #18. Asn-DSCR1 (RCAN1) is the Ct fragment of the Down syndrome critical region 1 protein DSCR1, which binds to Raf1, inhibits the phosphatase activity of calcineurin, and enhances its degradation. Calpain-generated Asn-DSCR1 does not bind to the Raf1 kinase.159 #19. Arg-c-FOS is the Ct fragment of the c-FOS transcriptional regulator. c-FOS is targeted for degradation through more than one degron, including the path that includes the cleavage by calpains160 and predicted degradation of Arg-c-FOS by the Arg/N-end rule pathway. #20. Arg-MEF2D is the Ct fragment of the MEF2D myocyte enhancer factor 2D, a transcriptional regulator that contributes to neuronal survival, development, and synaptic plasticity.161 #21. Leu-STEP33 is the Ct fragment of the striatal-enriched STEP61 phosphatase, a brain-specific Tyr-phosphatase whose substrates include the MAPK-family kinases ERK1/2 and p38. Calpain-generated Leu-STEP33 fragment lacks phosphatase activity.162 #22. Leu-β-catenin is the Ct-fragment of β-catenin, a conditionally short-lived cytoskeletal protein and transcriptional regulator. 
The Leu-β-catenin fragment is a nuclear protein that activates specific genes in conjunction with other transcription factors.163 #23. Arg-IGFBP2 is the Ct fragment of the insulin-like growth factor binding protein-2 that is cleaved by calpain-2 at least in vitro.164 #24. Glu-Iκ-Bα is the Ct fragment of the Iκ-Bα subunit of the autoinhibited NF-κB–Iκ-Bα complex in which the NF-κB transcriptional regulator is inhibited by Iκ-Bα. The Iκ-Bα subunit is targeted for degradation either through a conditional phosphodegron or through the calpain-mediated cleavage165 which produces Glu-Iκ-Bα, predicted to be an Arg/N-end rule substrate. #25. Phe-PKCγ is the Ct fragment of PKCγ, a Ser/Thr kinase of the PKC family.166 The Phe-PKCγ fragment is constitutively active as a kinase, because it lacks the regulatory Nt domain of the full-length PKCγ kinase.166 #26. Leu-CAMK-IV is the Ct fragment of the Ca2+/calmodulin-dependent kinase-IV. This fragment lacks kinase activity.167 #27. Lys-PKCα is the Ct fragment of PKCα, a broadly expressed Ser/Thr kinase of the PKC family.166 Being catalytically active but no longer controlled by the regulatory Nt domain of the full-length PKCα, the Lys-PKCα fragment can be toxic, for example, upon its formation in an ischemic heart.168 #28. Arg-p39 is the Ct fragment of the p39 activator of the Cdk5 protein kinase.169 The indicated cleavage site is located immediately downstream of two other closely spaced (and strongly conserved) calpain cleavage sites in p39. A cleavage at any one of these sites yields a predicted Arg/N-end rule substrate [see Fig. 6(H)]. #29. Lys-GAD65-2 is the Ct fragment of the glutamic acid decarboxylase-65-2 (GAD65-2), which is bound to membranes of synaptic vesicles and mediates the synthesis of the inhibitory neurotransmitter γ-aminobutyric acid (GABA) from the excitatory neurotransmitter glutamate. 
Calpain-generated Lys-GAD65-2 retains the enzymatic activity of uncleaved GAD65-2 but is no longer associated with synaptic vesicles170–172 [see also Fig. 6(G)]. #30. Arg-caspase-9 is the Ct fragment of caspase-9, which can be inactivated by calpains,173 followed by the (predicted) degradation of the Arg-caspase-9 fragment by the Arg/N-end rule pathway. #31. Leu-calpain-1 is the Ct fragment of human calpain-1. The Leu-calpain-1 fragment is an activated form of this calpain.174, 175 #32. Asp-calpain (reg. subunit) is the Ct fragment of the calpain regulatory subunit that is cleaved by activated calpains.116, 120 #33. Lys-calpain-2 is the Ct fragment of human calpain-2, an activated form of this calpain.116–120 #34. Asn-calpain-B is the calpain-generated Ct fragment of one of two major D. melanogaster calpains and an activated form of this calpain.176

Although this article considers largely intracellular protein fragments, essentially the same logic and analogous physiological ramifications also apply to extracellular protein fragments that are secreted from cells or produced from extracellular proteins (including extracellular domains of transmembrane proteins) by proteases in the extracellular space.

In comparison with sleep, the state of wakefulness entails, on average, significantly higher Ca2+ influxes into neurons. These transient fluxes (Ca2+ transients) are mediated by Ca2+ channels.96–98 Activation of specific transmembrane channels increases the levels of Ca2+ in the nucleus and the cytosol, including the cytosol's dendritic and axonal spaces.99–104 Ca2+ channels reside in the neuron's plasma membrane, particularly its synaptic regions, and also in intracellular membranes of the endoplasmic reticulum (ER), which is present in axons and dendrites as well.96, 97 The ER lumen is one site of Ca2+ storage, in addition to extracellular space. The plumes (microdomains105, 106) of Ca2+ ions entering specific regions of the cytosol are transient under normal conditions, owing to the activity of transmembrane pumps/exchangers that return Ca2+ to the extracellular space and the ER. Despite the transiency of Ca2+ microdomains and despite the buffering of Ca2+ by Ca2+-binding proteins such as calmodulin,107 increased concentrations of Ca2+ inside microdomains suffice for a local activation of calpains. While the cleavages of proteins by calpains are the chief proteolytic consequence of Ca2+ transients, these fluxes are also capable of activating some secretases,108–110 and caspases as well, in part through activation of the apoptosome and inflammasome.111, 112 Many, possibly most, neurons contain activated caspases such as caspase-3 (an effector caspase) under conditions in which caspase activation and caspase-mediated protein cuts are relatively constrained and do not cause cell death. The resulting protein fragments can be involved in neuronal remodeling, including changes that underlie memory.113–115 As described below, these fragments may also contribute to the causation of sleep.

An individual mammalian genome encodes ∼15 distinct calpains, including two major ones, calpain-1 (μ-calpain) and calpain-2 (m-calpain). Both neurons and glial cells contain these and other calpains. To simplify discussion, only Ca2+ transients and the resulting local increases of Ca2+ are cited below as the cause of calpain activation. Note, however, that other calpain regulators, such as phospholipids, protein kinases, calpain-binding proteins, and the ability of some calpains to activate themselves and other calpains through specific cuts, are also involved in controlling calpains,116–120 thereby shaping the in vivo repertoires of protein fragments generated by these proteases.

Production of protein fragments takes place on both presynaptic and postsynaptic sides of active synapses, throughout axons and dendrites, in the neuronal soma, in glial cells, and in most other cells of an organism. Moreover, the maintenance of brain anatomy, including the suppression of sprouting in a stable neurite shaft, has been shown to be an active process mediated by the proteolytic activity of calpains (particularly calpain-2) throughout dendrites and axons, even in the absence of action potentials and accompanying Ca2+ transients.121 In other words, the preservation of normal morphologies and connections throughout the nervous system involves a constant generation of protein fragments, above and beyond their production during Ca2+ transients at active synapses.

Subunits of transmembrane channels and many other proteins are cleaved by Ca2+-activated calpains and by other nonprocessive proteases. The resulting C-terminal fragments often have N-terminal residues that are “destabilizing” in that they can be recognized by the Arg/N-end rule pathway, a processive proteolytic system that targets proteins bearing specific N-terminal residues122, 123 [Figs. 1 and 2(A)]. (Ramifications of this fact for the in vivo dynamics of protein fragments are discussed below.) Activated calpains have been shown to cleave approximately 100 different proteins, most of which are present in neurons (Fig. 3). Because calpain substrates were identified by nonsystematic means, the number of different proteins in mammalian neurons that are cleaved by calpains is certain to be significantly larger than 100. The known substrates of activated mammalian caspases are nearly 1000 different proteins177–180 (Fig. 4). Other proteases, including secretases, separases, and aminopeptidases, also produce intracellular protein fragments, including fragments that bear destabilizing N-terminal residues (Fig. 5). In addition, proteins that reside in or move through secretory compartments such as the ER can be cleaved by compartment-specific proteases. Most fragments generated by nonprocessive proteases are protein-size molecules (Figs. 3–6). They are degraded processively to short peptides at varying rates, often after delays, by specific branches of the N-end rule pathway (Figs. 1 and 2), by other pathways of the ubiquitin system, and through the autophagy-mediated lysosomal proteolysis as well.

Figure 4.

Caspase-generated C-terminal fragments of mammalian proteins that are either identified or predicted substrates of the Arg/N-end rule pathway. The 30 caspase substrates cited in this list are a small fraction of ∼1000 known and mapped mammalian caspase substrates (see the main text). For designations, including residue numbering, see the legend of Fig. 3. The first eight caspase-generated, proapoptotic Ct fragments are the recently examined and confirmed short-lived substrates of the Arg/N-end rule pathway.123 #1. Cys-RIPK1 is the proapoptotic Ct fragment of the RIPK1 kinase, a regulator of apoptosis, necroptosis, and other processes, including antiviral responses that do not involve cell death178, 181–183 [see also Fig. 6(A)]. #2. Cys-TRAF1 is the proapoptotic Ct fragment of TRAF1, which functions to minimize the activation of caspase-8 and other proapoptotic reactions, in part by contributing to upregulation of the antiapoptotic NF-κB regulon.178, 184 #3. Asp-BRCA1 is the proapoptotic Ct fragment of BRCA1, a RING-type E3 ubiquitin ligase that functions as a tumor suppressor and participates in DNA repair, cell-cycle regulation, transcriptional control, and other processes185, 186 [see also Fig. 6(B)]. #4. Leu-LIMK1 is the proapoptotic Ct fragment of LIMK1, a Ser/Thr kinase that functions, in particular, as a downstream effector of Rho signaling pathways and regulator of actin dynamics.187 #5. Tyr-NEDD9 is the proapoptotic Ct fragment of NEDD9, a scaffolding protein whose functions include cell attachment, migration, and mitotic control.188 #6. Arg-BIMEL is the proapoptotic Ct fragment of BIMEL, a regulator of apoptosis.178 #7. Asp-EPHA4 is the proapoptotic Ct fragment of EPHA4, a member of the family of more than 10 mammalian “dependence” receptors (DpRs). These structurally distinct receptors are functionally analogous because of their ability to mediate two opposite physiological outcomes.
In the presence of its cognate ligand, a DpR receptor activates signaling pathways that mediate cell survival, migration, proliferation, or differentiation. In the absence of its ligand, a dependence receptor produces a proapoptotic signal, often through the formation, by caspases or other nonprocessive proteases, of a C-terminal proapoptotic fragment that functions in the cytosol and/or the nucleus.189, 190 #8. Tyr-MET is the proapoptotic Ct fragment of MET, another dependence receptor, with functions in embryonic development and organ formation.189, 190 The next six caspase-generated proapoptotic Ct fragments are predicted substrates of the Arg/N-end rule pathway.123 #9. Asn-PKCδ is the proapoptotic Ct fragment of the PKCδ protein kinase.191 #10. Lys-PKCδ is the proapoptotic Ct fragment of the PKCδ protein kinase.192 #11. Trp-ETK is the proapoptotic Ct fragment of the ETK/BMX tyrosine kinase, a member of the Btk/Tec kinase family.193 #12. Gln-SLK is the proapoptotic Ct fragment of SLK, a STE20-related protein kinase that plays a role in regulation of actin fibers.194 #13. Ile-HPK1 is the proapoptotic Ct fragment of HPK1, a STE20-related protein kinase whose functions include stimulation of the stress-activated protein kinases SAPKs/JNKs and the NF-κB transcriptional regulon.195 #14. Ile-MLH1 is the proapoptotic Ct fragment of the MLH1 DNA mismatch repair protein.196 The next 16 caspase-generated Ct fragments are predicted substrates of the Arg/N-end rule pathway that are not necessarily proapoptotic. These likely Arg/N-end rule substrates are cited to illustrate the remarkable diversity of their precursor proteins. #15. Tyr-CYLD is the Ct fragment of a deubiquitylase that regulates apoptosis and necroptosis197 [see also Fig. 6(D)]. #16. Leu-p21Cip1/Waf1 is the Ct fragment of p21Cip1/Waf1, an inhibitor of cell division.198 #17. Arg-IP3R is the Ct fragment of the inositol 1,4,5-trisphosphate receptor.199 #18.
Asn-LMN1 is the Ct fragment of lamin-A, a component of nuclear lamina.200 #19. Arg-ETS-1 is the Ct fragment of a transcription factor.201 #20. Tyr-TOP1 is the Ct fragment of type I DNA topoisomerase.202 #21. Leu-MEFD2 is the Ct fragment of a transcription factor.203 #22. Asn-DNA-PK is the Ct fragment of the DNA-dependent protein kinase.204 #23. Asn-CAD1 is the Ct fragment of E-cadherin, an adhesion receptor.205 #24. Gln-synphilin-1 is the Ct fragment of synphilin-1, a ligand of α-synuclein206 [see also Fig. 6(C)]. #25. Tyr-ACINUS is the Ct fragment of a mediator of apoptotic chromatin condensation.207 #26. Lys-PLECTIN is the Ct fragment of a cytoskeletal protein.208 #27. Cys-CCNE1 is the Ct fragment of a specific G1/S cyclin.209 #28. His-PMCA4b is the Ct fragment of a Ca2+ extrusion pump.210 #29. Asp-CDC42 is the Ct fragment of CDC42, a RAS superfamily member.211 #30. Tyr-iPLA2 is the Ct fragment of the phospholipase A2 (Ref. 212).

Figure 5.

Examples of predicted substrates of the Arg/N-end rule pathway that are produced by nonprocessive proteases other than Met-aminopeptidases, calpains, or caspases. For designations, see the legend of Fig. 3. Although some of the proteases mentioned below (e.g., furin) are usually localized outside the nucleus and cytosol, the cited articles indicate that these proteases can also generate cytosolic or nuclear Ct protein fragments. #1. Furin-mediated cleavage of the bacterial (Bordetella) DNT toxin produces the Glu-DNT fragment, which translocates into the cytosol.213 #2. Cleavage by γ-secretase generates the Arg-CAD1 fragment of E-cadherin.205 #3. Another predicted Arg/N-end rule substrate that can be produced by γ-secretase is Gln-ROBO1, the Ct fragment of the transmembrane ROBO1 receptor.214 #4. Proteinase-3 (myeloblastin), a cytosolic/nuclear protease, produces Arg-p21, the Ct fragment of p21, a cell division inhibitor.215 #5. The Omi/Htr2 protease can cleave the cIAP1 protein (an antiapoptotic regulator and inhibitor of caspases), yielding the Asn-cIAP1 Ct fragment.216 #6. An intracellular form of elastase cleaves the PML-RARα oncoprotein fusion, yielding the Tyr-PML-RARα Ct fragment.217 #7. The aminopeptidase PILSAP removes the first nine residues of the kinase PDK1, yielding the enzymatically active Asp-PDK1 Ct fragment.218

Figure 6.

Evolutionary conservation of caspase (A–D) and calpain (E–H) cleavage sites, and the conservation of destabilizing activity of P1′ residues in these sites. The caspase cleavage sites in A–D (positions P4–P1) are framed by gray rectangles. The indicated residue numbers, including those of P1′ residues (in color), are of human versions of the full-length proteins. In many caspase and calpain cleavage sites (including those shown in Figs. 3 and 4), P1′ residues are completely conserved at least among vertebrates. By contrast, in some cleavage sites, for example those of BRCA1 in B (Fig. 4 (#3)), synphilin-1 in C (Fig. 4(#24)), CYLD in D (Fig. 4(#15)), and p39 in H (Fig. 3(#28)), the P1′ residues (they are shown in different colors, with predominant identities in red) are not conserved among vertebrates. Remarkably, however, their destabilizing activity in the Arg/N-end rule pathway [Fig. 2(A)] is invariably conserved (see the main text). For brief descriptions of the other cited caspase- or calpain-cleaved proteins RIPK1, NF2, RAD21, and GAD65, see Fig. 4(#1), Fig. 3(#13), Fig. 3(#8), and Fig. 3(#29), respectively. See also the Protein Fragments, Their Generation In Spite of Deleterious Effects, and the Evolutionary Persistence of Sleep section.

In the proposed concept, termed the fragment generation (FG) hypothesis, sleep is a state that originates in individual neurons and involves a decreased production of protein fragments (owing in part to lower Ca2+ transients) as well as increased activities of fragment-destroying proteolytic pathways. These changes would facilitate the elimination of fragments and the remodeling of protein complexes in which the fragments were produced and transiently resided. The FG hypothesis leads to mechanistic explanations of specific properties of sleep, including its evolutionary persistence and homeostatic character (a longer sleep after sleep deprivation), as well as a broad variation of sleep duration and other features among different species.
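
The homeostatic aspect of this argument can be illustrated with a deliberately minimal kinetic sketch. It assumes, purely for illustration, that a single pooled fragment level F obeys dF/dt = k_gen − k_deg·F, with a higher generation rate during wakefulness and a higher degradation rate during sleep. The rate constants, the "rested" threshold, and the function names are hypothetical and are not taken from any measurement or from the original article.

```python
import math

# Arbitrary illustrative rate constants (assumptions, not measured values).
# Wakefulness: faster fragment generation, slower fragment destruction.
# Sleep: slower generation, upregulated destruction.
RATES = {
    "wake": {"k_gen": 1.0, "k_deg": 0.02},
    "sleep": {"k_gen": 0.3, "k_deg": 0.4},
}


def fragment_level(f0, state, hours):
    """Closed-form solution of dF/dt = k_gen - k_deg * F after `hours` in `state`."""
    k_gen = RATES[state]["k_gen"]
    k_deg = RATES[state]["k_deg"]
    f_ss = k_gen / k_deg  # steady-state level for this state
    return f_ss + (f0 - f_ss) * math.exp(-k_deg * hours)


def sleep_needed(f_start, threshold):
    """Hours of sleep required for F to fall from f_start to threshold."""
    k_gen = RATES["sleep"]["k_gen"]
    k_deg = RATES["sleep"]["k_deg"]
    f_ss = k_gen / k_deg
    if f_start <= threshold:
        return 0.0
    if threshold <= f_ss:
        raise ValueError("threshold must exceed the sleep steady-state level")
    return math.log((f_start - f_ss) / (threshold - f_ss)) / k_deg


if __name__ == "__main__":
    rested = 2.0  # hypothetical fragment level at the end of a full sleep
    for hours_awake in (16, 40):
        f = fragment_level(rested, "wake", hours_awake)
        print(hours_awake, round(f, 2), round(sleep_needed(f, rested), 2))
```

In this toy model, a longer period of wakefulness leaves a higher fragment level and therefore requires a longer sleep to return to the same baseline, which is the qualitative behavior the FG hypothesis invokes for sleep after sleep deprivation. The model ignores the remodeling of protein complexes and the diversity of fragments.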


If the increased generation of protein fragments during wakefulness and specific cognition-perturbing effects of this process underlie the causation of sleep, the individual neurons and circuits that regulate sleep must be able to “sense” either changes in the concentrations of fragments or consequences of these changes. One idea is that some proteins have acquired, during evolution, specific cleavage sites for yielding protease-generated fragments that function as somnogenic proteins. Such sensor-effector proteins and their fragments would allow systems that regulate sleep to gauge the increased production of many different fragments during wakefulness and to react by increasing the levels of somnogenic compounds, for example, extracellular adenosine, ATP or PGD2. These compounds, acting autocrinally and paracrinally (on cells that release them and on nearby cells), would promote a switch to a distinct state, sleep, whose salient features, according to the FG hypothesis, include lower Ca2+ transients, a slower production of fragments, and upregulated pathways that destroy fragments and remodel protein complexes in which the fragments resided.

PANX1, a transmembrane protein and a conditional ATP release channel, is one plausible example of what the postulated sensor-effector protein may be, in the framework of the FG hypothesis. The level of extracellular ATP (a somnogenic compound; see Introduction) could be upregulated, for example, through the activity of the caspase-generated fragment of PANX1 (Fig. 7). Specifically, the ability of PANX1 to function as an ATP release channel has been shown to be greatly increased by its caspase-mediated truncation.219, 220 As mentioned above, caspases, including effector caspases, can be active in nonapoptotic neurons.113–115, 221, 222 It is unknown, at present, whether activated calpains can also cleave and activate PANX1 and/or several other, related transmembrane proteins that function as ATP release channels. For a further discussion of connections between ATP/cytokine pathways and protein fragments, see the FG Hypothesis and the Somnogenic Activity of Cytokines section below.

Figure 7.

Monitoring protein fragments and reacting to their accumulation. This diagram mentions PANX1, a transmembrane protein and conditional ATP release channel (light blue oval near the top on the left) as a possible example of a sensor-effector protein. A cleavage of PANX1 is depicted as inducing its activity as an ATP release channel (red semi-oval near the top on the right). By linking fragment formation to upregulation of somnogenic pathways that involve extracellular ATP and cytokines (see the FG Hypothesis and the Somnogenic Activity of Cytokines section), this arrangement would make it possible for sleep-regulating pathways to gauge an increased production of many different fragments (green ovals and rectangles) during wakefulness, and to react by increasing the levels of somnogenic compounds, for example, extracellular adenosine, ATP or PGD2. The PANX1 channel was used to illustrate this regulatory arrangement because PANX1 has been shown to be strongly activated upon its cleavage by caspases.219, 220 See also the Monitoring Protein Fragments and Reacting to Their Accumulation section.

Another possible mechanism for coupling a rising concentration of fragments to the activity of sleep-regulating circuits involves E3 ubiquitin ligases of the Arg/N-end rule pathway, called N-recognins. They bind to destabilizing N-terminal residues of protein fragments [Fig. 2(A)]. A fragment of a cleaved subunit of a protein complex may remain associated with the rest of a complex, owing to partially retained interactions within the complex. These interactions may include an association between the N-terminal and C-terminal fragments of a cleaved, complex-bound protein subunit. Examples of nondissociating or slowly dissociating cleaved protein complexes, including transmembrane channels, specific enzymes, and transcriptional regulators, are described in Refs. 143, 223–226. The ability of the Arg/N-end rule pathway to destroy a C-terminal fragment that is embedded in a protein complex would be constrained by the fragment's noncovalent interactions within the complex. If so, a pool of slowly dissociating fragments that bear sterically accessible destabilizing N-terminal residues may act as a physical sink for N-recognins, thereby decreasing the ability of the Arg/N-end rule pathway to target other Arg/N-end rule substrates.

Now suppose that a positive regulator of somnogens is a naturally short-lived protein substrate of the Arg/N-end rule pathway. If so, the rise of protease-generated fragments that can interact with N-recognins but cannot be eliminated by them efficaciously enough would titrate available N-recognins. As a result, the normally short-lived regulator would be partially stabilized, increasing in steady-state levels and thereby upregulating the levels of a cognate somnogen as well. Given large distances between the soma and distal regions of axons and dendrites, a transient inhibition of the Arg/N-end rule pathway by this mechanism may occur locally rather than cell-wide. NTAN1 mRNA, which encodes NtN-amidase, a component of the Arg/N-end rule pathway [Fig. 2(A)], has been detected in axons,127 suggesting that both NtN-amidase and other components of the pathway can be produced not only in the neuronal soma but in axons and dendrites as well, through spatially localized translation of specific mRNAs.
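
The titration argument above can be made explicit with a simple steady-state calculation. The following is a sketch under stated assumptions, not a model from the cited literature: the regulator R is synthesized at a constant rate s and is degraded at a rate proportional to the concentration of free N-recognin. The symbols s, k, E_tot, E_bound, and E_free are introduced here for illustration only.

```latex
\begin{align}
\frac{dR}{dt} &= s - k\,E_{\mathrm{free}}\,R,
\qquad E_{\mathrm{free}} = E_{\mathrm{tot}} - E_{\mathrm{bound}},\\
R^{*} &= \frac{s}{k\,E_{\mathrm{free}}}
\quad\Longrightarrow\quad
\frac{R^{*}}{R^{*}_{0}} = \frac{E_{\mathrm{tot}}}{E_{\mathrm{tot}} - E_{\mathrm{bound}}},
\end{align}
```

Here R*_0 denotes the steady-state level of the regulator when no N-recognin is sequestered (E_bound = 0). Under these assumptions, if complex-embedded fragments sequester half of the available N-recognin (E_bound = 0.5 E_tot), the steady-state level of the regulator doubles.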

A transient trapping/inhibition of specific ubiquitin ligases by protein fragments that are embedded in protein complexes may involve other proteolytic pathways as well, for instance those that can target N-terminal (as distinguished from C-terminal) protein fragments. One example of such systems is the Ac/N-end rule pathway, which targets Nα-terminally acetylated (Nt-acetylated) proteins [Fig. 2(B)]. To address these already expected complexities experimentally, it would be essential to develop methods for measuring the in vivo activity of the Arg/N-end rule pathway and analogous proteolytic systems specifically in axons and dendrites.

There is also another, non-alternative and mechanistically feasible link through which an increase, during wakefulness, in the levels of either neuronal protein fragments or un-remodeled protein complexes (in which the fragments transiently reside) may upregulate the sleep drive. Either the known (e.g., Fig. 3) or still to be identified protease-mediated cleavages of specific transmembrane ion channels may alter properties of these channels in ways that would favor the signaling by inhibitory interneurons, as distinguished from excitatory neurons. Because small numbers of inhibitory interneurons tend to control significantly larger numbers of excitatory neurons, the resulting alteration (mediated by cleaved ion channels) of neuronal firing patterns in sleep-regulating circuits would favor the transition to sleep.


The FG hypothesis posits that proteolytic cleavages and the resultant protein fragments have both deleterious effects and adaptive (fitness-increasing) functions. For example, a specific cleavage-generated protein fragment can be deleterious while the other fragment, produced by the same cut, can have an adaptive function. Thus, remarkably, a cleavage of a protein may have both adaptive and harmful consequences at the same time. This is an important but previously not considered feature of in vivo proteolytic cuts. This property of cleavages, in addition to other dichotomies that involve protein fragments, can account for the observed evolutionary stability of specific cleavage sites in proteins (Fig. 6), because at least one of two fragments formed by a cleavage would have an adaptive function.

For example, calpain-mediated cleavage of a protein may act as a timing device that “tags” a transient event such as a Ca2+ microdomain by generating a cleaved protein complex that may be spatially confined to a specific dendritic spine. Such events were proposed to involve spectrin as a cytoskeletal protein whose cleavage by calpains at active synapses might contribute to memory processes.228 Many other proteins can also be cleaved by activated calpains (Fig. 3). Some of these cleavages and the resulting protein fragments may be a part of memory processes. According to the FG hypothesis, deleterious aspects of these and other protein fragments underlie the causation of sleep.

The necessity of apoptosis and other forms of programmed cell death in multicellular eukaryotes is one reason for the acquisition and retention of caspase cleavage sites (and of some calpain cleavage sites) in proteins during evolution. An example of fitness costs imposed by calpain cleavage sites is the excessive calpain-mediated cleavage of neuronal proteins under conditions of excitotoxicity, for example, upon sustained increases of extracellular glutamate (which induces Ca2+ transients) in brain regions affected by stroke. Note, however, that similar cleavages of neuronal and glial intracellular proteins take place during normal wakefulness as well (Fig. 3), but at frequencies that are too low for irreversible cytotoxic effects. Nevertheless, these lower frequencies of cleavages are still high enough, according to the FG hypothesis, for a gradual perturbation of cognition and for eliciting responses by circuits that regulate sleep. With most calpain substrates cited in Figure 3, a rough estimate is that less than 20% of a substrate is cleaved at steady state during normal wakefulness. Given low percentages of cleaved protein complexes, their (postulated) ability to perturb cognition may stem from gain-of-function effects, either by an initially cleaved complex that retains both products of a cut, or by an unremodeled complex, after the elimination of at least one fragment.

A protein fragment may be deleterious by not dissociating rapidly enough from a protein complex in which the fragment had been produced, or by acting as a perturbant in its free (dissociated) state. For example, a free fragment may form toxic oligomers or larger aggregates. A deleterious effect of a fragment may be accompanied by an independently present adaptive function of the same fragment, for example, by its necessity for a role that cannot be performed by the full-length precursor protein. Another possibility, mentioned above, is that one of two cleavage-generated protein fragments is deleterious while the other one has an adaptive function.

If a maladaptive aspect of a protein cleavage is not accompanied by a different, fitness-increasing effect of this event, there would be a selection pressure to eliminate the cleavage site. Calpain cleavage sites encompass ∼ 10 residues, involve both conformational and sequence determinants, and can be impaired or inactivated by missense mutations.116–120, 229 Caspase cleavage sites are simpler and can be readily inactivated by missense mutations.179, 230 Given the ease of elimination, on evolutionary timescales, of deleterious cleavage sites in neuronal and other proteins, it is striking, and most telling, that both calpain and caspase cleavage sites are highly conserved at least among vertebrate proteins (Fig. 6).

Also telling, in a different and independent way, is the evolutionary pattern of P1′ residues in these cleavage sites (P1′ residues become N-terminal after a cleavage). Specifically, the exact identity of a P1′ residue is not always conserved during evolution. Remarkably, however, the destabilizing nature of a P1′ residue, i.e., its ability, once it becomes N-terminal, to be recognized by the Arg/N-end rule pathway [Fig. 2(A)], is virtually always conserved (Fig. 6). This evolutionary pattern of P1′ residues in the precursors of natural protein fragments strongly suggests that the ability of C-terminal fragments to be targeted as substrates of the Arg/N-end rule pathway is a positively selected property of these fragments.

A caveat to this view of evolution of the P1′ residues is that their conservation in a specific cleavage site often occurs in a sequence context that is also conserved, at least among vertebrates [Fig. 6(A,E,F), and data not shown]. An alternative interpretation, in such cases, is that the observed conservation of a P1′ residue stems from a low tolerance for sequence alterations throughout the cleavage site (e.g., because this site is a critical part of a structure required for a protein's function), as distinguished from selection that acts to retain destabilizing activity of a P1′ residue. In such cases (and also, strictly speaking, in all cases that involve P1′ residues), the hypothesis of positive selection for destabilizing P1′ residues must be verified by causative (genetic) tests that examine fitness changes of organisms in which a wild-type destabilizing P1′ residue in a precursor of C-terminal fragment has been replaced by a residue that is not recognized by the Arg/N-end rule pathway [Fig. 2(A)].

Based on this experimental background (Figs. 1–6), the FG hypothesis posits a specific explanation for the observed evolutionary persistence of sleep, despite its maladaptive properties cited above. Specifically, the retention, through positive selection, of cleavage sites in proteins and the resulting evolutionary persistence of protease-generated fragments compel the existence of sleep as well, because sleep counteracts deleterious aspects of protein fragments. As discussed below, the FG hypothesis also leads to new explanations of the documented malleability of sleep, including its striking variations among different animals. In the framework of this hypothesis, the rates of generation of specific protein fragments by nonprocessive proteases, the rates of subsequent degradation of fragments to short peptides, and the rates of remodeling of protein complexes in which these fragments initially resided (Fig. 1) are independently tunable parameters that can be altered in individual neurons and other cells.

Consequently, variations in the rates of initial protein cleavages, in the rates of fragments' processive degradation, and in the rates of subsequent protein remodeling that would be favored by the ecology of an organismal lineage can be selected and optimized on evolutionary timescales. The results of such adjustments may underlie and explain many differences in homeostatic aspects and other properties of sleep among different animal species. For example, the duration of sleep in mammals varies from ∼4 hr in mountain goats to ∼16 hr in hedgehogs. The fact that sleep is universal at least among vertebrates and takes hours rather than minutes per day even among the shortest-sleeping animals implies that a massive decrease or elimination of sleep during evolution was prohibitively costly. According to the FG hypothesis, the cost of sleep reduction stems from the necessity of counteracting, through sleep, specific deleterious aspects of (positively selected) protein fragments. If so, one task at hand is to identify protein fragments that are particularly relevant to sleep causation, as well as specific mechanisms of fragment-mediated deleterious effects that occur in the absence of sleep.


Proapoptotic protein fragments

A recent study by our laboratory described the discovery that the Arg/N-end rule pathway is a repressor of apoptosis, through the ability of this pathway to destroy caspase- or calpain-generated proapoptotic protein fragments.123 Such fragments are defined, operationally, as those that increase the probability of apoptosis compared with full-length precursors of fragments. Metabolic stabilization of the proapoptotic, normally short-lived Cys-RIPK1 fragment of the RIPK1 kinase [Fig. 4(#1) and Fig. 6(A)] through the mutational conversion of its destabilizing N-terminal Cys to Val (which is not recognized by the Arg/N-end rule pathway) was found to elevate the fragment's in vivo levels and thereby to greatly increase its (intrinsic) proapoptotic activity.123 Abnormally high proapoptotic activity of a naturally produced proapoptotic fragment such as Cys-RIPK1 would be expected to be maladaptive, strongly suggesting (though not proving) that the presence and evolutionary conservation of a destabilizing N-terminal residue in Cys-RIPK1 [Fig. 4(#1) and Fig. 6(A)] stems from a positive selection for such a residue.

Ten proapoptotic fragments characterized as short-lived N-end rule substrates in the cited study123 [Fig. 3(#16, #17) and Fig. 4(#1–8)] are likely to be produced, at low but nonzero yields, in nonapoptotic cells as well. It remains to be determined whether any one among these C-terminal fragments (and/or their N-terminal counterparts) can be beneficial under conditions distinct from apoptosis. These 10 fragments (at least 20 with N-terminal fragments as well) are the tip of the iceberg vis-à-vis the total number of different fragments (possibly hundreds of them) that are likely to be relevant to the FG hypothesis (Figs. 3–6).

Effects of inflammatory cytokines such as TNF-α include the activation of caspases in target cells (Ref. 123 and references therein). I suggest that the somnogenic activity of TNF-α and analogous cytokines (see Introduction) is mediated in part by activation of caspases in neurons. This activation would yield a variety of protein fragments, including the previously characterized proapoptotic fragments that are targeted for degradation by the Arg/N-end rule pathway123 [Fig. 4(#1–8)]. In this view, increasing concentrations of extracellular TNF-α and other somnogenic cytokines in the brain during wakefulness generate activated caspases and specific intracellular protein fragments at levels that do not cause apoptosis but result, instead, in higher probabilities of awake-to-sleep transitions in individual neurons and neuronal networks (see the FG Hypothesis and the Somnogenic Activity of Cytokines section below).

A fragment of cohesin

Another example of functional dichotomies in proteolytic cuts is a subunit of cohesin.148 Cohesin rings encircle two replicated daughter chromatids and thereby maintain, through a topological confinement, the cohesion of chromatids until mitosis.231–233 An endoprotease called separase is activated during mitosis and cleaves the Scc1/Rad21 (kleisin) subunit, thereby opening the ring of cohesin and making chromatid separation possible. In the yeast Saccharomyces cerevisiae, the C-terminal fragment of separase-cleaved Scc1/Rad21 bears N-terminal Arg, a destabilizing residue [Fig. 2(A)]. This fragment, which remains associated with the cohesin complex, is dislodged and degraded by the Arg/N-end rule pathway.148 The rest of the cohesin complex is apparently spared, owing to the subunit selectivity of the Arg/N-end rule pathway.122, 234–236

The identity of a residue at the N-terminus of a C-terminal fragment of Scc1/Rad21 varies in a tellingly constrained manner. Specifically, this residue is Arg in the budding yeast S. cerevisiae, Asn in the fission yeast Schizosaccharomyces pombe, Cys in D. melanogaster, and Glu in mammals. All these N-terminal residues are destabilizing in the Arg/N-end rule pathway148, 149 [Fig. 2(A)]. This evolutionary constraint indicates the necessity of “proactive” elimination of the C-terminal fragment of Scc1/Rad21. The removal of this fragment leads to the eventual reconstitution (remodeling) of cohesin through incorporation of the uncleaved Scc1/Rad21 subunit.148

In ubr1Δ S. cerevisiae, which lack the Arg/N-end rule pathway, the normally short-lived C-terminal fragment of Scc1/Rad21 is a long-lived protein.148 In this setting, the C-terminal fragment can apparently still dissociate (or is displaced) from the rest of cleaved oligomeric cohesin in vivo at a rate that is compatible with the functioning of cohesin pathways. However, this degradation-independent route is inefficient enough to cause a striking (∼1000-fold) increase in the frequency of chromosome loss in ubr1Δ cells.148 Similar genomic instability is observed in ate1Δ mouse cells, which lack R-transferase and therefore lack the arginylation branch of the Arg/N-end rule pathway [Fig. 2(A)]. R-transferase is required for the degradation of the C-terminal fragment of the mammalian Scc1/Rad21 subunit, as this fragment bears N-terminal Glu, a secondary destabilizing residue122, 149 [Fig. 2(A)] (Zhou et al., unpublished data).

Viewed together, the separase-mediated cleavage of Scc1/Rad21, the necessity of this cleavage for chromosome segregation, and the overtly deleterious effect of the resulting (cohesin-bound) C-terminal Scc1/Rad21 fragment in the absence of the Arg/N-end rule pathway122, 148 are an illustration of some concepts that underlie the FG hypothesis.

The amyloid-β (Aβ) protein fragment

Remarkably, an example of a deleterious protein fragment that accumulates in the brain, is down-regulated during sleep, and is augmented during sleep deprivation, is already known.237 The 42-residue amyloid-β (Aβ) protein fragment is produced through cleavages, by proteases of the secretase family, from the transmembrane amyloid precursor protein (APP). Upon its formation at the cell's plasma membrane, Aβ becomes largely extracellular but can also be found in the cytoplasm. Aβ forms toxic oligomers and larger aggregates. Its accumulation in the brain is among the causes of Alzheimer's disease. At the same time, significantly lower concentrations of Aβ might have beneficial effects. For example, Aβ may function as an antimicrobial peptide.238 The largely extracellular localization of Aβ is consistent with the FG hypothesis, because essentially the same logic and analogous physiological ramifications would apply to cognition-perturbing intracellular and extracellular protein fragments. It should be noted that the study that described the circadian and sleep-deprivation pattern of Aβ expression237 did not consider the possibility, proposed by the FG hypothesis, that such a pattern may be a manifestation of the far more general phenomenon in which a multitude of different protein fragments are generated in the brain during wakefulness and are, collectively, the molecular cause of sleep.


More than 50% of the previously mapped C-terminal fragments of calpain-cleaved proteins bear N-terminal residues that can be targeted by the Arg/N-end rule pathway (Fig. 3). The N-end rule relates the in vivo half-life of a protein to the identity of its N-terminal residue. Regulated degradation of specific proteins by the N-end rule pathway mediates a multitude of biological functions (Fig. 2 and its legend). The N-end rule pathway polyubiquitylates proteins that contain specific degradation signals (degrons), either internal or N-terminal (N-degrons). The main determinant of an N-degron is a destabilizing N-terminal residue of a protein (Fig. 2). The N-end rule pathway consists of two branches, the Ac/N-end rule and the Arg/N-end rule pathways (for reviews, see Refs. 122, 124, 128, 129, 239). The Ac/N-end rule pathway, described below, recognizes proteins with Nt-acetylated N-terminal residues.122, 125 The Arg/N-end rule pathway targets specific unacetylated N-terminal residues. The primary destabilizing N-terminal residues Arg, Lys, His, Leu, Phe, Tyr, Trp, and Ile are directly recognized by N-recognins (E3 ubiquitin ligases of the N-end rule pathway), whereas N-terminal Asp, Glu, Asn, Gln, and Cys function as destabilizing residues through their preliminary enzymatic modifications [Figs. 1 and 2(A)].
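The two residue lists given above can be restated as a small lookup. The following is a minimal sketch rather than a predictive tool: the function and constant names are hypothetical, the residue sets are copied from the preceding paragraph, and the sketch ignores everything else that determines whether a fragment is actually degraded (for example, the steric accessibility of its N-terminus within a protein complex).

```python
# Illustrative only: residue sets are taken from the paragraph above;
# all names here are hypothetical and not part of any published tool.
PRIMARY = {"Arg", "Lys", "His", "Leu", "Phe", "Tyr", "Trp", "Ile"}
NEEDS_MODIFICATION = {"Asp", "Glu", "Asn", "Gln", "Cys"}


def classify_n_terminus(residue: str) -> str:
    """Classify an unacetylated N-terminal residue of a C-terminal fragment."""
    if residue in PRIMARY:
        return "primary destabilizing (recognized directly by N-recognins)"
    if residue in NEEDS_MODIFICATION:
        return "destabilizing after preliminary enzymatic modification"
    return "not among the listed Arg/N-end rule destabilizing residues"


if __name__ == "__main__":
    for res in ("Arg", "Glu", "Gly"):
        print(res, "->", classify_n_terminus(res))
```

N-terminal Glu, the residue borne by the C-terminal fragment of mammalian Scc1/Rad21, falls in the second class in this sketch, consistent with its description above as a secondary destabilizing residue.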

Altogether, 71 C-terminal fragments of specific cleaved proteins that are either identified or predicted Arg/N-end rule substrates are described in Figures 3–6. This list, based on extensive literature searches, contains a majority of mapped calpain substrates (34 proteins; Fig. 3); a small subset of mapped caspase substrates (30 proteins; Fig. 4); and a few examples of proteins that are cleaved in vivo by other nonprocessive proteases (Fig. 5). The total number of fragments is more than twice the cited number, given the production of N-terminal/C-terminal pairs of fragments and frequent multiple cleavages of a full-length protein. Among the proteins that are cleaved by calpains and yield predicted Arg/N-end rule substrates are ligand- or voltage-regulated transmembrane proteins that mediate or control Ca2+ transients [Fig. 3(#1–4, #6)]; specific enzymes [Fig. 3(#25, #29)]; transcriptional regulators [Fig. 3(#19)]; and many other proteins (Fig. 3).

Although the discussion was confined, thus far, to C-terminal fragments of protein precursors, many N-terminal fragments are also likely to be N-end rule substrates, owing to the Ac/N-end rule pathway, the second branch of the N-end rule pathway [Fig. 2(B)]. More than 80% of human proteins are Nt-acetylated as nascent polypeptide chains by ribosome-associated Nt-acetylases.240 In 2010, it was discovered that the Nt-acetylated residues of cellular proteins act as specific degradation signals, termed Ac/N-degrons, to distinguish them from N-degrons of the Arg/N-end rule pathway, which recognizes specific unacetylated N-terminal residues125 (Fig. 2). The new field of the Ac/N-end rule pathway is just beginning to be explored.

As suggested previously,125 a key feature of Ac/N-degrons is their conditionality, in that usually a protein would not be targeted for degradation via its Ac/N-degron, save for a brief interval between the cotranslational formation of an Ac/N-degron in a nascent protein and the subsequent incorporation (sequestration) of this protein in a cognate oligomeric complex. What might be the consequence of a proteolytic cut in a protein subunit for the metabolic fate of a resulting N-terminal fragment? Some N-terminal fragments may be sufficiently destabilized conformationally to expose, and thereby to reactivate, their Ac/N-degrons (Fig. 1). Other N-terminal fragments of cleaved proteins may remain associated with their protein complexes and remain either inaccessible or at most intermittently (stochastically) accessible to the Ac/N-end rule pathway. In sum, the functional effects and metabolic fates of N-terminal fragments in cleaved protein complexes are at least as relevant to the FG hypothesis as the effects and fates of C-terminal fragments.

Either dissociation of fragments from protein complexes or the subsequent processive degradation of fragments may be slow enough to result in accumulation of specific fragments during wakefulness. (For many fragments, the dissociation and degradation steps would be mechanistically coupled.) According to the FG hypothesis, at least some of these fragments and/or unremodeled protein complexes in which the fragments transiently resided can perturb cognition. As mentioned above, the rate at which a newly generated fragment becomes susceptible to the targeting and processive degradation may be limited by the rate of its dissociation from a protein complex, owing to noncovalent interactions within the complex and also, for example, to an intermittent (stochastic) steric accessibility of a fragment's degradation signal.


The FG hypothesis and somnogenic activity of cytokines

Positive regulation of sleep involves both small compounds (adenosine, ATP, PGD2, nitric oxide, oleamide) and protein-size effectors, including inflammatory cytokines such as TNF-α and IL-1β, and other hormones/growth factors such as epidermal growth factor (EGF), growth hormone releasing hormone (GHRH), prolactin, nerve growth factor (NGF), and brain-derived neurotrophic factor (BDNF) (reviewed in Refs. 7, 19, 70, and 71). Inflammatory cytokines of the immune system are produced both peripherally and in the brain, and have somnogenic activity. (Hence the common observation of increased sleepiness during a bout of infectious disease.) Bacterial lipopolysaccharide (LPS) and other components of microorganisms that activate the innate immune system are also somnogenic.31, 72 LPS and analogous bacterial compounds act as somnogens owing largely to their ability to upregulate somnogenic cytokines. A prolonged sleep deprivation increases susceptibility to infection in both humans and rodents.29, 36, 73–75 In healthy humans, the levels of IL-1β in blood plasma are highest at the onset of sleep.16, 76

This and other evidence, including circadian control of the production of somnogenic cytokines, indicates their involvement in physiological sleep regulation. Activated processing and release of cytokines by microglia (brain phagocytes) is caused in part by rising levels of extracellular ATP during wakefulness. ATP is released from cells by vesicular exocytosis and through transmembrane channels (ATP release channels) such as PANX1, which can be activated by Ca2+ transients and by other effectors. Acting autocrinally and paracrinally, the extracellular ATP binds to P2-type receptors such as P2X7, which are present largely on microglial cells. Through mechanisms that remain to be clearly understood, this binding results in proteolytic processing and release of inflammatory cytokines.4, 16, 56, 67, 68 Given this understanding, how might an increased generation of protein fragments during wakefulness—the proposed molecular cause of sleep—interact with ATP/cytokine pathways?

First, as suggested above (Fig. 7) (see Monitoring Protein Fragments and Reacting to Their Accumulation section), an increased generation of protein fragments during wakefulness may be conveyed to sleep-regulating circuits through cleavages of (postulated) sensor-effector proteins. PANX1, a conditional ATP release channel, might be one such protein. PANX1 can be cleaved by caspases, a truncation that has been shown to activate PANX1 as an ATP release channel.219, 220 The proposed PANX1-mediated mechanism is but one of several possible ways to link augmented generation of protein fragments during wakefulness to the ATP/cytokine system that contributes to sleep initiation. Note that the effects of inflammatory cytokines such as TNF-α include the activation of caspases in target cells.123 As suggested above (see the Proapoptotic Protein Fragments section), the somnogenic activity of TNF-α and analogous cytokines may be mediated in part by caspases. Their activation in neurons, while remaining below apoptotic levels, results in augmented generation of specific protein fragments and therefore, according to the FG hypothesis, in higher probabilities of awake-to-sleep transitions in individual neurons and neuronal networks.

Second, a somnogenic response of neurons to cytokines may be modulated by the levels of specific protein fragments that neurons contain at the time of their exposure to cytokines. As a result, at a given concentration of extracellular cytokines some neurons may undergo awake-to-sleep transitions, whereas other nearby neurons may not. In this model, the levels of specific protein fragments in a set of neurons would be increased by their recent electrochemical activity, owing to Ca2+ transients. The resulting concentrations of fragments would determine the probability of awake-to-sleep transitions, in a neuron or a neuronal assembly, in response to a given level of extracellular cytokines. Being both the molecular cause of sleep and a function of Ca2+ transients, the generation of protein fragments can track electrochemical activity of neurons and convey the level of activity-generated stress to ATP/cytokine pathways and other somnogenic systems.

Third, it has not been considered so far that although inflammatory cytokines and other protein-size somnogens signal through distinct cell surface receptors, these diverse effectors have a feature in common: they augment, directly or indirectly, calpain-activating Ca2+ transients in both neurons and glial cells.241–248 In addition, some somnogenic effectors can activate calpains independently of Ca2+ transients. For example, EGF can activate calpain-2 by upregulating kinases that phosphorylate and thereby activate this calpain.249 I suggest that these properties of protein-size somnogens, together with the resulting activation of calpains and other nonprocessive proteases such as caspases, constitute a specific somnogenic mechanism. In other words, it is proposed that inflammatory cytokines and analogous extracellular effectors act as somnogens at least in part because their binding to cognate receptors accelerates calpain-mediated and caspase-mediated generation of protein fragments. As suggested above, it is the concentrations of specific protein fragments in neurons that may determine the probabilities of awake-to-sleep transitions in response to a given level of extracellular cytokines. If so, the ability of cytokines, in this model, to upregulate the very process (generation of fragments) that contributes to their somnogenic activity would result in a positive feedback loop, a verifiable possibility.

Although the overall model is sufficiently specific to allow experimental testing, it is underdetermined, for example, in regard to the identities of protein fragments that are functionally critical for responses to specific cytokines or other somnogenic effectors. On the (unproven) assumption that the FG hypothesis is correct at least in outline, one can ask, for instance, whether calpain-mediated cleavages of specific Ca2+ channels [Fig. 3(#2, #4, #6)] are more important for determining the probability of awake-to-sleep transition in response to a somnogen than the cleavages of other neuronal proteins, such as those described in Figure 3. Is there, in fact, a specific subset of protease-generated fragments (and their parental protein complexes) that largely determines the sensitivity of a neuron (or a neuronal assembly) to a given somnogen? These are some of the experimentally addressable questions in this domain of the FG hypothesis.

Ethanol-induced sleep and Ca2+ transients

Ethanol readily crosses the blood-brain barrier and at sufficiently high doses acts as a somnogen. The main molecular mechanism through which ethanol exerts its somnogenic effect is unclear. Although biochemical impacts of ethanol on the brain are many and varied, the available evidence indicates that one major effect of ethanol is to increase Ca2+ transients in neurons and astrocytes.250–250 Higher Ca2+ transients promote the activation of calpains and other nonprocessive proteases. Hence the prediction, by the FG hypothesis, that a major somnogenic effect of ethanol may be its ability to increase the generation of protein fragments in neurons and other brain cells. In the 1970s, several studies showed that ethanol-induced sleep in mice can be strikingly prolonged by injections of either Ca2+ alone or Ca2+ and its ionophore.253–255 (Sleep was measured by the duration of unconsciousness and the loss of righting reflex after intraperitoneal injection of ethanol.) For example, at a dose of ethanol that resulted in a sleep for ∼ 68 min, an intracerebroventricular injection of CaCl2 (20 μmoles/kg) 30 min before ethanol injection resulted in a continuous sleep for ∼ 234 min, a nearly 3.5-fold increase compared with the control injection of buffer alone.253 Similar effects were observed upon injections of a lower amount of CaCl2 in the presence of a Ca2+ ionophore.253–255 These results were never interpreted in connection to protein fragments or the function of sleep. Moreover, these studies from three decades ago seem to be no longer discussed in the current literature about mechanisms of sleep. Note, however, that these results, obtained for unrelated reasons,253–255 may be accounted for by the FG hypothesis, in that an increase in Ca2+ transients (which augment the levels of activated calpains and other nonprocessive proteases) is predicted to be somnogenic. 
Further exploration of ethanol/calcium-induced sleep and extensions of this approach to mutants in the mouse Arg/N-end rule pathway,123, 130 with measurements of specific protein fragments in the brain, are a promising way to explore the cause of sleep vis-à-vis the FG hypothesis.
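The fold increase quoted above for CaCl2 pretreatment follows directly from the two reported sleep durations. A minimal arithmetic check (variable names are illustrative; the two durations are the approximate values cited in the text):

```python
# Approximate sleep durations reported in the text (minutes).
control_sleep_min = 68.0    # ethanol after control injection of buffer alone
calcium_sleep_min = 234.0   # ethanol after intracerebroventricular CaCl2

fold_increase = calcium_sleep_min / control_sleep_min
print(f"{fold_increase:.2f}-fold")  # prints "3.44-fold"
```

The ratio of about 3.4 agrees with the "nearly 3.5-fold" increase stated above.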


Sleep deprivation and homeostatic aspects of sleep

The FG hypothesis can account, in general terms so far, for homeostatic aspects of sleep. Specifically, a prolongation of wakefulness, i.e., sleep deprivation, would be expected to generate a larger load of undestroyed protein fragments and unremodeled protein complexes. The result, according to the FG hypothesis, would be increasingly perturbed cognition, in agreement with experimental evidence about sleep deprivation. Given higher than normal levels of protein fragments and unremodeled protein complexes by the end of sleep deprivation, adjustments of those levels during subsequent sleep would be expected to take a longer time, resulting in a longer sleep, in agreement with known effects of sleep deprivation.
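The qualitative argument above can be illustrated with a toy calculation. This is a sketch under assumptions that are not part of the FG hypothesis as stated: fragment load is treated as a single number, it is assumed to rise linearly during wakefulness and to be cleared by first-order kinetics during sleep, and sleep is assumed to end when the load returns to a baseline. All function names, rate constants, and units are hypothetical and chosen only for illustration.

```python
import math

# Toy model; every parameter below is an arbitrary, hypothetical value.
BASELINE_LOAD = 1.0   # fragment load at the end of a full sleep (arbitrary units)
GEN_RATE = 1.0        # assumed net fragment accumulation per hour awake
CLEAR_RATE = 0.35     # assumed first-order clearance constant during sleep (per hour)


def load_after_wake(hours_awake: float) -> float:
    """Fragment load after a period of wakefulness (assumed linear accumulation)."""
    return BASELINE_LOAD + GEN_RATE * hours_awake


def sleep_needed(load: float) -> float:
    """Hours of sleep for the load to decay exponentially back to baseline."""
    if load <= BASELINE_LOAD:
        return 0.0
    return math.log(load / BASELINE_LOAD) / CLEAR_RATE


if __name__ == "__main__":
    for hours in (16, 24, 40):
        needed = sleep_needed(load_after_wake(hours))
        print(f"{hours} h awake -> {needed:.1f} h of sleep in this toy model")
```

In this sketch, longer wakefulness always yields a longer subsequent sleep, which is the only feature of the model that corresponds to the argument in the text. The specific kinetics are placeholders; measured rates of fragment generation and degradation would be needed to constrain any quantitative version.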

Suppression of sleep and unihemispheric sleep

Described below are variations of sleep that offer additional opportunities to address predictions of the FG hypothesis. Studies of Danio rerio fishes (zebrafishes) identified a specific dichotomy in their responses to sleep deprivation. Adult zebrafishes are diurnal, being active during the day and less active at night, with prolonged episodes of sleep, defined behaviorally.12, 256, 257 If adult zebrafishes were sleep-deprived, in darkness, through repetitive electric stimulation and thereafter observed during an extended period of darkness, they exhibited prolonged sleep, as would be expected of responses to sleep deprivation in mammals. However, this compensatory prolongation of sleep after sleep deprivation was not observed if sleep-deprived zebrafishes were released into light.256 Moreover, sleep-deprived zebrafishes that were kept under constant light conditions exhibited little sleep to begin with, despite the preceding sleep deprivation in darkness.256 A similar and nearly complete suppression of sleep was also observed with zebrafishes kept under constant light for at least 3 days. Sleep reappeared and progressively increased by 8 days in the presence of light. No significant rebound sleep was observed afterwards in darkness, despite the (formally defined) sleep deprivation under constant-light conditions.256

In the framework of the FG hypothesis, these currently unexplained sleep patterns in zebrafishes may be accounted for by different rates of reactions (under the different conditions of sleep deprivation described above) that mediate the generation or degradation of specific protein fragments and the remodeling of protein complexes that initially contained these fragments. Addressing these issues experimentally with zebrafishes can involve, at first, quantitative comparisons of the levels of protease-generated protein fragments, either of a kind described in Figures 3 to 6 or of any kind, using proteome-scale, mass spectrometry-based techniques. For example, one verifiable possibility is that long-term illumination of zebrafishes may upregulate their intracellular fragment-degrading pathways, including, possibly, the N-end rule pathway, so that specific (relevant to sleep causation) protein fragments are destroyed faster than in the dark. As a result, the levels of these fragments under constant light would be predicted to be, at least initially, lower than they would be during the diurnal pattern of alternating light and darkness, resulting in a low sleep drive in the light, despite the lack of sleep. Specific causative studies, including genetic approaches (which are feasible with zebrafishes), would be essential as well, to verify and deepen FG-based mechanistic interpretations.

Sleep in cetaceans (dolphins and whales) and in some pinnipeds (walrus, sea lions, fur seals, eared seals, and earless seals) has a property called the unihemispheric slow wave (USW) sleep. This is a form of NREM sleep in which the state of being asleep alternates between two brain hemispheres and is accompanied by only one closed eye at a time, the one that is contralateral to (and is innervated by) the sleeping hemisphere.24 The likely adaptive nature of USW sleep stems from the ability of, for example, a fur seal to monitor its immediate environment for predators and conspecifics using a single open eye, to continue to swim, and to continue staying “halfway” alert most of the time. Among terrestrial animals, only birds exhibit aspects of unihemispheric sleep, but not as pronounced as in cetaceans.24, 258 The USW sleep has evolved in marine mammals presumably because it is particularly adaptive in aquatic settings, as the terrestrial environment can provide an animal with a measure of protection during sleep, for example, in burrows and trees. In addition, vulnerable grazing ungulates, such as antelopes, tend to sleep in herds, in which there are always some animals (sentinels) that are not asleep.258 Strikingly, the NREM sleep in fur seals can be of two kinds. On land, seals exhibit, most of the time, a bilaterally symmetrical slow wave (BSW) sleep, similar to NREM sleep in terrestrial mammals. In contrast, when a fur seal sleeps in water, its EEG patterns indicate a unihemispheric slow wave (USW) sleep, as in whales and dolphins.24, 259

This switch from BSW sleep on land to USW sleep in water may be an opportunity to address a prediction of the FG hypothesis. In the BSW sleep (on land), when both hemispheres sleep at the same time, specific (e.g., relevant to sleep causation) protein fragments in the two hemispheres of the seal's brain would be expected to be present at equal levels, irrespective of the timing of brain analyses during sleep–wakefulness cycles. In contrast, if it is the destruction of protein fragments that does not keep pace with fragments' accumulation during wakefulness, the FG hypothesis predicts that during the USW sleep (in water), the sleeping brain hemisphere of a fur seal would contain lower levels of these fragments than the awake hemisphere of the same brain. This is a verifiable proposition. Its upside is that a negative result (the absence of significant differences between the levels of assayed protein fragments in a sleeping vs. awake hemisphere of a seal or a bird) would make the FG hypothesis less likely to be correct. A downside of such comparisons is their correlational (noncausative) nature.

Protein fragments and the size of a nervous system

Generation of protein fragments and the accompanying production of protein complexes that have to be remodeled after fragments' removal are cell-autonomous processes. Therefore there is no obvious lower boundary, in the current FG hypothesis, to the size of a nervous system that may exhibit a sleep-like behavior as an adaptive response to protease-generated fragments and modified protein complexes in which these fragments resided. C. elegans (∼300 neurons) has a quiescent behavioral state during a period called lethargus.260–262 It occurs before each of the four molts and exhibits some aspects of sleep, including higher arousal thresholds and a homeostatic response to mechanical stimulation during lethargus (specifically, an augmented quiescence after stimulation).260 The periods of lethargus involve alterations of the worm's nervous system, suggesting that lethargus evolved to accommodate specific requirements of such alterations.

Although it remains to be determined whether lethargus in C. elegans is fundamentally similar to sleep-like states in Drosophila, let alone to sleep in mammals, there are analogies between lethargus circuits and sleep regulation in larger animals, in that both settings involve proteins that control the epidermal growth factor (EGF) pathway, an ancient system whose functions encompass quiescence.260–264 The FG hypothesis predicts higher levels of specific protease-generated protein fragments in some or most C. elegans neurons (and possibly in other cell types as well) shortly before lethargus. If the FG hypothesis proves relevant to the causation and function of sleep in mammals, the resulting understanding could be “turned around” by asking whether the lethargus in C. elegans involves the generation and degradation of specific fragments and the remodeling of protein complexes after fragments' removal, and whether these processes are the molecular cause of lethargus.

It is also unknown whether FG concepts might be relevant to organisms such as protists or other single-cell eukaryotes, which exhibit cycles of metabolic transitions under normal growth conditions.265 Fundamentally, the FG hypothesis posits that a multitude of diverse protein fragments generated by nonprocessive proteases under conditions of daily life, as well as the necessity to remodel cleaved protein complexes, can perturb the functioning of a cell. Being a challenge to stress response systems, these perturbations might require, at least in some cells, a quasi-periodic change of regimen that in more complex cellular assemblies would be called sleep. If the FG hypothesis proves germane to animals with elaborate nervous systems, this advance might also clarify whether FG concepts are relevant to organisms where “sleep” is not an applicable term.


Although the FG hypothesis was described here in the context of sleep, the relevance of proposed mechanisms is not confined to neurons. Resetting of a circuit that involves a protease-generated protein fragment embedded in a multisubunit complex may involve the subunit-selective degradation of this fragment in any cell type and for reasons unrelated to sleep. Some of these reasons, such as the cycles of cleavage, subunit-selective degradation, and remodeling of cohesin complexes,122 were described above (see a Fragment of Cohesin section).

Viewed generally, the concepts of the FG hypothesis should be relevant to any in vivo setting that involves accumulation of either a protease-generated fragment or a cleaved, unremodeled protein complex, owing to insufficiently high rates of either degradation or remodeling, respectively. Increased levels of protein fragments and unremodeled complexes in such cases may be accompanied by an (evolved) cellular adaptation that amounts to a period of quiescence for the cell type involved. “Quiescence” is defined here as the time during which the generation of fragments slows down, the activity of fragment-destroying pathways goes up, and the remodeling of cleaved complexes can take place after the fragments' removal. Just two examples of the likely relevance of FG concepts outside sleep—muscle fatigue and mechanisms of memory—are briefly mentioned below, but many such settings may exist in all organisms, including prokaryotes, which contain both nonprocessive proteases and specific versions of the N-end rule pathway.122, 129, 266, 267

The roles of calpains in mammalian cells of different types are broadly similar to calpain functions in neurons. In myotubes of the skeletal muscle and in cardiomyocytes of the heart, calpains can cleave cytoskeletal proteins such as cortactin, vimentin, dystrophin, talin, troponin, actin, and titin [Fig. 3(#9, #10, #11, #12, #14)], and many other proteins as well. A detailed discussion of this subject is beyond the scope of the present article, but it should be mentioned that the role of calpains in the cleavage of muscle proteins (with generation of protein fragments) and processive degradation of fragments under normal conditions or upon a strenuous exercise is a significant theme in muscle research.268–270 The existing evidence is consistent with the relevance of FG concepts to muscle physiology.

Another FG-related subject, mentioned here but not discussed in detail, is the mechanism of long-term memory. The previously proposed notion that calpain-mediated cleavages of specific cytoskeletal proteins such as spectrin may be relevant to memory processes228 remains unproven but likely, especially in view of many other calpain and caspase substrates in neurons that may be components of memory circuits (Figs. 3 and 4). There is growing evidence that cleavages of specific neuronal proteins by calpains and caspases play a role in long-term memory.114, 271 Studies of Ntan1−/− mice, which lacked the Asn-specific Ntan1 NtN-amidase and therefore lacked the Asn-deamidation branch of the Arg/N-end rule pathway [Fig. 2(A)], showed that these mice had significant defects in learning and long-term memory.272 The recent discovery and molecular characterization of the Gln-specific Ntaq1 NtQ-amidase, another component of the Arg/N-end rule pathway [Fig. 2(A)], has revealed that the previously identified fly mutant termed tungus, which had impaired long-term memory,273 was apparently a null mutant in the Drosophila Ntaq1 NtQ-amidase.274 In addition, physiological substrates of calpains and caspases can yield C-terminal fragments with N-terminal Asn or Gln. Such fragments are predicted substrates of NtN-amidase and NtQ-amidase, respectively [Fig. 3(#4, #5, #6, #12, #18, #34), Fig. 4(#9, #12, #18, #22, #23, #24), and data not shown]. This fact and the above genetic results270–274 strongly suggest a role of the Arg/N-end rule pathway in memory processes.


The motivation to describe these linked ideas and their ramifications before experimental testing was fourfold. First, although the proposed model differs from previous thinking in research on sleep, the FG hypothesis is not inconsistent, to my knowledge, with any solid data about sleep. Second, in addition to possibly identifying the molecular cause and function of sleep, the FG hypothesis leads to new explanations of specific properties of sleep, including its evolutionary persistence and homeostatic character. Third, the concepts about protein fragments and their dynamics that underlie the FG hypothesis are likely to be relevant to other biological problems, including mechanisms of memory and the functioning of cells other than neurons. Fourth, the complexity of pathways encompassed by the FG hypothesis suggests that its verification will not be a short affair. Experiments to address specific predictions of this hypothesis were recently initiated in our laboratory.


I thank Christopher Brower, Raymond Deshaies, William Dunphy, Konstantin Piatkov, Jevgenij Raskatov, Connor Rosen, Brenda Schulman, Anna Shemorry, and Brandon Wadas for helpful discussions and comments on the manuscript. I am particularly grateful to Roger Kornberg and William Tansey for their detailed suggestions. Our studies of the ubiquitin system and the N-end rule pathway are supported by grants from the National Institutes of Health.