Functional structure and composition of the extracellular matrix



In this brief introductory paper the general structure and the molecular composition of the extracellular matrix are outlined. Ultrastructural morphology of the extracellular matrix is introduced and subsequently the molecular structure of each of the main protein families, which together make up the extracellular matrix, is reviewed. Collagens, laminins, tenascins, and proteoglycans are addressed. An important common feature is the domain structure of these in general very large proteins. Several families have domains in common, which favours extensive interactions. Integrins play an important role in these interactions and also in the communication between cells and the matrix. The extracellular matrix appears to be a very dynamic structure, which has a prominent role in normal development as well as in a variety of disease processes. Matrix metalloproteinases are essential actors in this complex interplay between cells and the extracellular matrix. Copyright © 2003 John Wiley & Sons, Ltd.


The reviews that follow in this issue discuss in depth the role that extracellular matrix components play in pathological processes, and they presuppose a solid basic knowledge of what this matrix is made of. Some of this information can be extracted from the reviews themselves, but for the sake of clarity we have deemed it useful to discuss briefly, in this introductory paper, the general structure of the matrix, the main characteristics of its most important components, and the molecules that matrix components use to interact with the cells that make the matrix or are supported by it. We have limited the references to reviews, which will allow the interested reader easy access to further details.

General structure of the extracellular matrix

The general structure of the extracellular matrix at an ultrastructural level is illustrated in Figure 1. In this electron micrograph the two main domains of the extracellular matrix are clearly identifiable. The first is the basement membrane, a condensed matrix layer that is formed adjacent to epithelial cells, other covering cell sheets (mesothelium, meningothelium, and synovia), muscle cells, Schwann cells, and adipocytes. The second is the interstitial matrix. The main characteristic these two domains have in common is that their basic structure is defined by a collagen scaffold, although the collagens that make up the scaffold are quite different, as is their three-dimensional architecture. Adhesive glycoproteins, including laminin and tenascin, and proteoglycans adhere to the scaffold and interact with the cells in or adjacent to the matrix. Interaction with these cells is conducted through matrix receptors, of which the integrins constitute the most important class. The extracellular matrix is not static: it is remodelled constantly, which implies constant breakdown by proteases, notably the family of matrix metalloproteinases.

Figure 1.

Ultrastructure of the extracellular matrix. Adjacent to an epithelial cell (E) lies the basement membrane with its lamina lucida (LL) and lamina densa (LD). The interstitial matrix contains collagen fibrils and is in close proximity to the basement membrane anchoring fibrils, composed of type VII collagen (scale bar 0.1 µm).

The collagen scaffold

Collagens are ubiquitous proteins responsible for maintaining the structural integrity of vertebrates and many other organisms 1. More than 20 genetically distinct collagens have been identified. The reader is referred to excellent recent reviews to obtain in-depth information on collagen structure and distribution 2–5.

In tissues that have to resist shear, tensile, or pressure forces, such as tendons, bone, cartilage, and skin, collagen is arranged in fibrils, with a characteristic 67 nm axial periodicity, which provide the tensile strength. Only collagen types I, II, III, V, and XI self-assemble into fibrils. The fibrils are composed of collagen molecules, which consist of a triple helix of approximately 300 nm in length and 1.5 nm in diameter. Collagen fibril formation is an extracellular process which occurs through the cleavage of terminal procollagen peptides by specific procollagen metalloproteinases. Some collagens form networks (types IV, VIII, and X). A typical example of such a network is the basement membrane, which is mostly made of collagen IV. Other collagens associate with fibril surfaces (types IX, XII, and XIV). Yet other collagens are transmembranous proteins (types XIII and XVII) or form periodic beaded structures (type VI).

Type I collagen occurs throughout the body, except in cartilage. It is the principal collagen in the dermis, fasciae, and tendons and is a major component of mature scar tissue. Type II collagen occurs in cartilage, the developing cornea, and in the vitreous body of the eye. Type III collagen dominates in the wall of blood vessels and hollow intestinal organs and co-polymerizes with type I collagen. Types V and XI collagen are minor components and occur predominantly co-polymerized with collagen I (type V) and collagen II (type XI).

Collagens are mostly synthesized by the cells comprising the extracellular matrix: fibroblasts, myofibroblasts, osteoblasts, and chondrocytes. Some collagens are also synthesized by adjacent parenchymal or covering (epithelial, endothelial, mesothelial) cells. A typical example is type IV collagen, which is synthesized in a cooperative effort between the stromal cell and the parenchymal/covering cell.

Structure and distribution of laminins

Laminin, together with type IV collagen, nidogen, and perlecan, is one of the main components of the basement membrane. What is now known as laminin 1 was first discovered over 20 years ago in the matrix formed in a murine sarcoma (the mouse EHS sarcoma). The molecule appeared to be between 200 and 400 kDa, was composed of three disulphide-linked chains, and had a characteristic cross shape. Molecular cloning of the three chains (now known as α1, β1, and γ1) of laminin 1 led to the discovery of a variety of homologues. As yet, five α chains, three β chains, and three γ chains have been identified 6–8. Not all possible combinations of the three chains appear to be used: 12 distinct laminin isoforms have been isolated. All laminin chains have certain structural features in common. They share small globular domains, of which one is involved in chain polymerization. They also have in common epidermal growth factor-like repeats, which host the nidogen binding site. Nidogen links laminin to type IV collagen. Some structures are more chain specific: the α chains have a large C-terminal globular domain, which hosts many of the binding sites for integrins.
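The remark that not all chain combinations are used can be made concrete with a little arithmetic. The following is a minimal sketch, assuming only the chain counts and the number of isolated isoforms quoted above; the variable and function names are illustrative and do not correspond to any established nomenclature tool.

```python
from itertools import product

# Chain counts as stated in the text above.
ALPHA = [f"α{i}" for i in range(1, 6)]  # five α chains
BETA = [f"β{i}" for i in range(1, 4)]   # three β chains
GAMMA = [f"γ{i}" for i in range(1, 4)]  # three γ chains
ISOLATED_ISOFORMS = 12                   # isoforms reported as isolated


def possible_trimers():
    """Enumerate every theoretical α/β/γ heterotrimer."""
    return list(product(ALPHA, BETA, GAMMA))


trimers = possible_trimers()
fraction_used = ISOLATED_ISOFORMS / len(trimers)
print(f"{len(trimers)} theoretical trimers, {fraction_used:.0%} isolated")
```

Under these assumptions only about a quarter of the theoretical heterotrimers correspond to isolated isoforms, which is consistent with the statement that chain assembly is selective rather than random.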

Laminin isoforms are synthesized by a wide variety of cells in a tissue-specific manner. Notably, virtually all epithelial cells synthesize laminin, as do smooth, skeletal, and cardiac muscle, nerves, endothelial cells, bone marrow cells, and the neuroretina. Epithelial cells express α1, α3, β3, and γ2 chains. The pattern of expression of α5, β1, β2, and γ1 chains is less specific. The synthesizing cells deposit laminin mostly but not exclusively in basement membranes.

Laminins appear to have an astonishing variety of effects on adjacent cells, including cell adhesion, cell migration, and cell differentiation. They exert their effects mostly through integrins, many of which recognize laminins, the integrin binding domain residing predominantly in the α chain 9. The primary role of laminins seems to be mediation of the interaction between cells and the extracellular matrix, notably the basement membrane 6. Some of these interactions are involved in modulating specific functions. Laminin 1, for example, induces differentiation in epithelial cells and laminin 2 promotes neurite outgrowth from neural cells. Laminin 5 appears to be involved in cell adhesion as well as in cell migration, and this pleiotropic function is dependent on proteolytic processing of laminin by plasmin or matrix metalloproteinases (notably MT1-MMP and MMP-2). Basal cells of the epidermis anchor to laminin 5 in the basement membrane through the formation of hemidesmosomes. Laminin 8 and 10 occur predominantly in the vascular basement membrane and mediate adhesion of platelets, leukocytes, and endothelial cells.

Given the wide range of roles laminins play in tissue structure and cell function, it is not surprising that laminins are significantly involved in a variety of disease processes. The role of laminins in tumour invasion and metastasis and in angiogenesis has been intensely studied. These studies have shown that the dysregulation of the interaction between cancer cells and the extracellular matrix is accompanied by aberrant synthesis, chain composition, and proteolytic modification of laminins 10.

The tenascin family

The first member of the tenascin family of extracellular matrix proteins, which is now known as tenascin–cytotactin (TN-C), was discovered about 20 years ago 11. Several groups simultaneously identified the protein and different names were proposed, of which tenascin, proposed by Chiquet and Fambrough 12, finally survived. TN-C is unique within the family because it is the only member that forms hexamers. The hexamers are a result of interactions between the TA (tenascin assembly) domains at the N-terminus of the molecule; as a consequence a six-arm structure is formed (hexabrachion). To this end the TN-C molecule has a series of heptad repeats in the TA domain. Oligomerization is a two-step process: first a trimer is formed, and two trimers are then assembled into a hexamer 13. The molecule is furthermore composed of an array of EGF-like repeats, various fibronectin type III domains, and, at the C-terminus, a fibrinogen-like globular domain.

The other members of the family have been isolated because of similarities in the domain structure. Although the number of repeats varies significantly between the different members, the overall order of the domains is the same. They all have a TA domain, which allows them to form trimers, but no hexamers occur. The four other family members are TN-R, TN-W, TN-X, and TN-Y. TN-R occurs in oligodendrocytes during late development. TN-W is expressed in the pathway of migration of neural crest cells. TN-X is a prominent component of connective tissues. Deletions in the TN-X gene are associated with the Ehlers–Danlos syndrome. TN-Y is also expressed in connective tissues but in addition occurs in the brain.

The complex domain structure of the tenascins predicts that the molecule is capable of interacting with a variety of extracellular matrix proteins. This indeed appears to be the case 14. Cell surface receptors for tenascins include integrins, cell adhesion molecules of the Ig superfamily, a transmembrane chondroitin sulphate proteoglycan (phosphacan) and annexin II. TN-C also interacts with extracellular proteins such as fibronectin and the lecticans, a class of extracellular chondroitin sulphate proteoglycans including aggrecan, versican, and brevican. Furthermore, tenascins are recognized and cleaved by extracellular matrix proteases including serine proteases and matrix metalloproteases.

The patterns of expression of tenascins are rather complex. In particular, during embryogenesis the rate of tenascin synthesis changes significantly. In combination with the effects of tenascin on cell behaviour this indicates that tenascin in the extracellular matrix might have an important function in morphogenesis in determining whether or not specific cells will adhere to the matrix, and in doing so receive cues which modulate cell differentiation, migration, proliferation, or apoptosis. In normal adult tissues tenascins are not expressed. They appear, however, in a variety of pathological situations. TN-C expression is found in wound healing and in other situations of neovascularization, including tumour growth.

The integrin receptor family

Although the family of integrin receptors was discovered only 15 years ago 15, the key role of integrins in development, immunobiology, haemostasis, and in pathological processes including inflammation, cardiovascular disease, and cancer has been recognized rapidly and has already led to new modalities of treatment of these diseases. Integrins are found in almost all members of the animal kingdom. They all have the same basic αβ heterodimeric structure. In primitive organisms there may only be two integrins. In mammals the family comprises 8 β and 18 α subunits, which together make 24 distinct integrins 16. Integrins recognize short peptide motifs, the ligand specificity relying on both subunits of the heterodimer. The nature of the ligands varies significantly. The members of the β2 subfamily recognize counter-receptors belonging to the Ig superfamily and play a role in leukocyte trafficking, eg in the inflammatory reaction. The β1 subfamily recognizes extracellular matrix proteins such as fibronectin, collagen, and laminin. The latter is also recognized by the α6β4 heterodimer 17, 18.

Integrins were discovered by virtue of their function as (extracellular matrix or intercellular) adhesion molecules. In this function they provide mechanical continuity between the outside and the inside of the cell. To this end the cytoplasmic domain of integrins links with the actin microfilament system of the cytoskeleton. They do this through a variety of linker proteins, including talin, α-actinin, vinculin, and paxillin. Integrin–ligand interaction also triggers a spectrum of signal transduction pathways with profound effects on cell survival, cell proliferation, the structure and functional activity of the cytoskeleton (motility), and gene transcription. Cell adhesion through integrins, signalling through integrins, and signalling through receptors for soluble ligands are intricately interrelated. The incapacity of cells that do not adhere to a substrate through integrins to respond to growth factors such as EGF and PDGF is an example of the anchorage dependence of cell survival and proliferation.

A relatively new concept is that of integrin activation. It is evident that while some integrins are constitutively expressed on the cell surface, they need not exert their functional activity permanently. A good example of the need for regulated activity is leukocyte function in the inflammatory reaction. The β2 integrins are constitutively expressed on the leukocyte but not constitutively active, in order to avoid a permanent inflammatory state. Upon appropriate stimulation, however, integrins need to be activated. This takes place through regulatory signals from the interior of the cell, a mechanism that has been called ‘inside-out’ signalling. This involves separation of the cytoplasmic tails of the α and β chains and probably leads to conformational changes in the extracellular domain of the heterodimer, allowing it to effectively interact with its ligand 16, 19.


Proteoglycans

The extracellular matrix components discussed so far are significantly involved in the matrix scaffold, either as structural elements, such as collagen, or as molecular ‘glue’ between the matrix and adjacent cells, such as integrins and laminins. We have seen that the matrix is much more than a structuring tissue component, almost all classes of matrix molecules being involved also in the control of proliferation, differentiation, and motion. In this panoply of signals from the extracellular matrix, proteoglycans appear to play a prominent role, the significance of which has become evident only recently. It is therefore appropriate to dedicate a few words to this very complex category of molecules, even though in this review issue the role of proteoglycans in tissue processes will not be explicitly addressed.

Proteoglycans can be grouped into several families 20–23. All have a protein core which is richly decorated with glycosaminoglycans. A first family is constituted by the lecticans, of which the protein core has an N-terminal globular domain that interacts with hyaluronan, and a C-terminal selectin domain. The side chains consist mostly of chondroitin sulphate, although keratan sulphate may be present also. Members of this group are aggrecan, versican, neurocan, and brevican. Their function was until recently assumed to be structural, in providing tissue rigidity to resist compressive forces. It is now clear that versican, one of the most studied members of this group, stimulates the proliferation of fibroblasts and chondrocytes, through the presence in the molecule of EGF-like motifs.

The second proteoglycan family is characterized by a small protein core which is composed of leucine-rich repeats, whence the name small leucine-rich proteoglycans (SLRPs). These repeats provide a horseshoe-like structure, which favours protein–protein interactions. Their glycosaminoglycan side chains are mostly chondroitin/dermatan sulphate or keratan sulphate. Decorin, biglycan, fibromodulin, and keratocan are among the members of this family. These proteoglycans were until recently considered to function primarily as organizers of collagen networks. It has recently been demonstrated, however, that decorin is involved in signal transduction through the EGF receptor and downstream signalling through the MAPK pathway. This seems to be involved in modulation and differentiation of epithelial and endothelial cells. In addition, TGF-β interacts with members of this family, in particular with decorin, but the functional significance of this interaction remains unclear as yet.

An interesting emerging concept is that of the ‘part-time’ proteoglycans, a heterogeneous group of cell surface and matrix molecules comprising CD44, macrophage colony-stimulating factor, amyloid precursor protein and several collagens (IX, XII, XIV, and XVIII). These molecules may or may not be linked to glycosaminoglycan side chains, depending on as yet unknown regulatory mechanisms.

An important family is that of the heparan sulphate proteoglycans, part of which are matrix proteoglycans and part of which are membrane-associated. Perlecan and agrin are matrix heparan sulphate proteoglycans and are found predominantly in basement membranes. The syndecans and the glypicans are membrane-associated heparan sulphate proteoglycans. Syndecans have a transmembrane and a small cytoplasmic domain. Glypicans are tethered to the plasma membrane by a glycosylphosphatidylinositol link. These membrane-associated proteoglycans have been shown to interact with high affinity with a variety of growth factors, form ternary structures with the growth factors and their receptors, and in this way modulate the response of receptor-bearing cells to growth-factor-induced signalling. Through this mechanism, membrane-associated heparan sulphate proteoglycans exert important effects on cell adhesion and migration and on proliferation and differentiation. Syndecans have been very extensively studied in this respect. The heparan sulphate moiety binds with high affinity a range of cytokines and growth factors, including members of the fibroblast growth factor family (FGF), hepatocyte growth factor (HGF), platelet-derived growth factor (PDGF), heparin-binding epidermal growth factor (HB-EGF), and vascular endothelial growth factor (VEGF). The extensive expression of heparan sulphate proteoglycans and their implication in a variety of cell physiological processes suggest that they might also be involved in a variety of pathological processes. Indeed, they have been implicated in the modulation of cell migration, proliferation 24, and differentiation in wound healing and in tumour growth 25.

Matrix metalloproteinases

As mentioned earlier, the extracellular matrix (ECM) is subject to constant remodelling, a process that involves breakdown of existing ECM proteins and synthesis and deposition of new ones. Numerous classes of proteolytic enzymes are believed to participate in ECM degradation, but one class that appears to play a dominant role is that of matrix metalloproteinases (MMPs). Matrix metalloproteinases are a subfamily of the larger metzincin family of metalloendopeptidases characterized in part by their requirement for zinc. The first MMP to be discovered was defined as having collagenolytic activity and the ability to hydrolyse the tadpole tail 26. This observation led to the subsequent discovery of related enzymes, and there are currently more than 20 characterized mammalian MMPs and numerous homologues in other organisms.

MMPs were initially named according to their perceived substrate specificity. Thus, MMPs are also known as collagenases (MMP-1, 8, 13), gelatinases (MMP-2 and 9), stromelysins (MMP-3, 10, 11), and matrilysins (MMP-7 and 26). However, it is now clear that these are misnomers because MMPs display a great deal of overlap in substrate specificity. Gelatinases A and B (MMP-2 and 9 respectively), for example, are capable of degrading collagen IV, V, and X, as is stromelysin-1 (MMP-3). Matrilysin-1 (MMP-7) degrades gelatin and collagen IV, and several MMPs hydrolyse fibronectin, laminin, and matrix proteoglycans. It is generally accepted that in combination MMPs can degrade all ECM proteins.
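The argument that the historical names are misnomers can be illustrated with a small lookup table. The sketch below encodes only the name groups and the substrate examples given in the paragraph above; it is not an exhaustive substrate catalogue, and the dictionary and function names are hypothetical choices made for this illustration.

```python
# Historical name groups, as listed in the text above (MMP numbers).
NAME_GROUPS = {
    "collagenase": {1, 8, 13},
    "gelatinase": {2, 9},
    "stromelysin": {3, 10, 11},
    "matrilysin": {7, 26},
}

# Substrate examples quoted in the text for a few MMPs (deliberately incomplete).
SUBSTRATES = {
    2: {"collagen IV", "collagen V", "collagen X"},
    9: {"collagen IV", "collagen V", "collagen X"},
    3: {"collagen IV", "collagen V", "collagen X"},
    7: {"gelatin", "collagen IV"},
}


def group_of(mmp):
    """Return the historical name group of an MMP number, or None if unlisted."""
    for name, members in NAME_GROUPS.items():
        if mmp in members:
            return name
    return None


def groups_degrading(substrate):
    """Return the name groups with at least one listed member degrading the substrate."""
    return {group_of(m) for m, subs in SUBSTRATES.items() if substrate in subs}


print(sorted(groups_degrading("collagen IV")))
```

Even with this handful of examples, collagen IV is a substrate for members of three differently named groups, which is the overlap the paragraph describes.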

MMPs play an essential role in physiological events, including development, hormone-dependent tissue remodelling, and tissue repair. However, they also play a key role in pathological conditions such as inflammation, tumour invasion, and metastasis. Several excellent recent reviews provide details as to MMP implication in both physiological and pathological settings 27–30.

Most of the known MMPs are secreted. They typically contain a secretory signal sequence, a pro-domain that maintains the zymogen status, and a catalytic domain that contains the zinc-binding active site consensus sequence. These components constitute the minimal MMP structure. Several secreted MMPs contain additional features such as a carboxy-terminal hemopexin-like domain, a furin protease cleavage site in the pro-domain, or a fibronectin-like repeat sequence in the catalytic domain. These structures are believed to confer processing and adhesive functions to MMPs, although the full spectrum of their properties remains to be elucidated. A subset of MMPs contains a single transmembrane domain followed by a cytoplasmic tail. The transmembrane domain ensures cell surface expression of these MMPs, which are also known as membrane-type MMPs (MT-MMPs).

MMPs can be produced by a broad range of cells, including normal epithelial cells, fibroblasts, myofibroblasts, chondrocytes, osteoclasts, endothelial cells, and leukocytes. Many malignant cells constitutively express a host of MMPs. In the adult organism in the resting state, there is very little MMP expression and activity. However, MMP expression is rapidly induced by a variety of cytokines and growth factors that participate in events associated with tissue remodelling. MMP activity is controlled at several levels, including transcription, proteolytic activation, localization, and interaction with inhibitors. Activation of MMPs typically occurs by proteolytic removal of the pro-domain. This is usually the result of the activity of other proteases, although autocatalytic activation appears to occur as well. Recent evidence suggests that secreted MMPs localize at least transiently to the cell surface by interacting with cell surface adhesion receptors and proteoglycans. Localization to the cell surface appears to be important in promoting cell-mediated ECM degradation and latent growth factor activation. It also appears that the cell surface may protect MMPs from their natural inhibitors, particularly tissue inhibitors of metalloproteinases (TIMPs) which are potent blockers of MMP activity and are secreted by a broad range of cell types and deposited in the ECM.

MMPs promote normal and malignant cell invasion of the ECM and it was long believed that the mechanism whereby they do so relies on the physical breakdown of the natural ECM barriers to cell migration. Although MMPs undeniably help create a migration path for invading cells, recent observations provide compelling evidence that MMP function goes well beyond the ability to merely cleave structural ECM proteins. Thus, in addition to augmenting the bioavailability of growth factors sequestered within the ECM, MMP proteolytic activity can activate latent secreted growth factors (MMP-2 and 9 can activate TGF-β, and several MMPs can release IGF from IGFBPs) and cell surface growth factor precursors, including TGF-α and HB-EGF. Moreover, MMPs have been found to proteolytically cleave cell surface growth factor, cytokine, and chemokine receptors as well as adhesion receptors. By so doing, MMPs can participate in controlling normal and tumour cell responses to growth factors, cytokines, and chemokines, as well as cell–cell and cell–ECM interactions.

Thus, from being thought to primarily provide a mechanism to degrade ECM proteins and regulate mechanical resistance to cell migration, MMPs are now being recognized as major regulators of normal cell physiology and tumour–host interactions.