The majority of the matrix protein TapA is dispensable for Bacillus subtilis colony biofilm architecture

Biofilm formation is a co‐operative behaviour, where microbial cells become embedded in an extracellular matrix. This biomolecular matrix helps manifest the beneficial or detrimental outcome mediated by the collective of cells. Bacillus subtilis is an important bacterium for understanding the principles of biofilm formation. The protein components of the B. subtilis matrix include the secreted proteins BslA, which forms a hydrophobic coat over the biofilm, and TasA, which forms protease‐resistant fibres needed for structuring. TapA is a secreted protein also needed for biofilm formation and helps in vivo TasA‐fibre formation but is dispensable for in vitro TasA‐fibre assembly. We show that TapA is subjected to proteolytic cleavage in the colony biofilm and that only the first 57 amino acids of the 253‐amino acid protein are required for colony biofilm architecture. Through the construction of a strain which lacks all eight extracellular proteases, we show that proteolytic cleavage by these enzymes is not a prerequisite for TapA function. It remains unknown why TapA is synthesised at 253 amino acids when the first 57 are sufficient for colony biofilm structuring; the findings do not exclude the core conserved region of TapA having a second role beyond structuring the B. subtilis colony biofilm.

B. subtilis biofilm is composed of polymers that include an exopolysaccharide (EPS) and an amphiphilic protein BslA which assembles to form a hydrophobic film on the biofilm surface (Kobayashi and Iwano, 2012;Hobley et al., 2013). The major protein component of the B. subtilis matrix is protease-resistant fibres formed by the secreted protein TasA (Romero et al., 2010;Erskine et al., 2018b).
These fibres provide structural integrity to the biofilm and are necessary for the characteristic wrinkled phenotype of biofilms and pellicles (Branda et al., 2006). Experimental models to study biofilm formation include colony biofilms grown on a semi-solid agar surface and floating pellicles in which the biofilm forms at an air-liquid interface in standing liquid culture.
The tasA coding region is located within an operon alongside two other genes, namely sipW and tapA (formerly yqxM) (Zhu and Stülke, 2018). SipW is a specialised signal peptidase that is linked with removal of the signal peptide from both TasA (Stover and Driks, 1999b) and TapA (Stover and Driks, 1999a) during the secretion process. SipW also modulates expression of the epsA-O operon, which encodes the proteins needed for the biofilm exopolysaccharide (Terra et al., 2012). Thus, sipW is essential for biofilm formation. TapA is described as an accessory protein that is thought to be needed for the formation of TasA fibres (Romero et al., 2010;Romero et al., 2011) and, more specifically, for the attachment of the TasA fibres to the cell surface. Secondary structure analysis revealed TapA to be a two-domain protein with significant regions of disorder (Abbasi et al., 2019). Additionally, TapA has recently been shown to form fibres (El Mammeri et al., 2019). The absence of TapA is correlated with a reduction in the level of TasA in the biofilm matrix (Romero et al., 2014). Evidence indicates, however, that in the absence of TapA, recombinant TasA self-assembles into protease-resistant fibres that are structurally and functionally comparable to native fibres extracted from B. subtilis (Erskine et al., 2018b;El Mammeri et al., 2019). Moreover, when provided exogenously, these self-assembled recombinant TasA fibres are also biologically active in vivo in the absence of TapA, suggesting that cell-surface attachment is not critical for biofilm architecture (Erskine et al., 2018b). Thus, further evaluation of the function and activity of TapA is warranted.
Here we identify that amino acids 1-57 (inclusive) of TapA form a minimal functional unit of the protein that is required to give rise to the complex architecture of the B. subtilis colony and pellicle biofilm.
Heterologous provision of the DNA encoding this truncated form of TapA is sufficient to restore rugose biofilm formation to the tapA deletion strain (the full-length protein is 253 amino acids in length).
We identify, through site-directed mutagenesis, key amino acids in the minimal, functional TapA form that are required for bioactivity and, in doing so, uncover essential hydrophobic amino acids. We show that in vivo TapA is proteolytically cleaved to lower molecular weight forms by the native extracellular proteases secreted into the external environment. We demonstrate that Vpr, a serine protease, plays a specific role in cleavage of the TapA protein. Finally, we establish that proteolysis of TapA by the extracellular proteases is not a prerequisite to activity and that TapA can fulfil its role whether it is cleaved or not.

| Identification of a TapA minimal functional unit
To investigate the functional region(s) of tapA an in-frame deletion was constructed in B. subtilis NCIB3610. As expected (Romero et al., 2011), the rugose architecture exhibited by the colony and pellicle biofilms formed by the parental isolate was absent when tapA was deleted (Figures 1a and S1a). The ΔtasA strain is shown for reference (NRS5267) revealing the flat featureless biofilm that manifests when the TasA fibres are no longer synthesised. The architecture of the ΔtapA strain was fully reinstated when the tapA coding region was expressed from the heterologous amyE locus using an IPTG inducible promoter (Figures 1a and S1a). The tapA gene is present in the genome of a range of Bacillus species and analysis of the protein sequences reveals domains with a high degree of conservation and other regions of variability ( Figure 1b). This includes a marked difference in the length of the tapA coding region (Figure 1b) (Romero et al., 2014); experimental data showed that the coding region for amino acids 194-230 of TapA was dispensable (Romero et al., 2014).
Here, to determine the minimal tapA coding region needed for function, we systematically deleted tapA from the 3' end and tested the ability of the variant-length tapA constructs to genetically complement the tapA deletion strain. In total, 22 variants of the tapA coding region were assessed (Figures 1b,c and S1b). We concluded that the TapA 1-60 variant was fully capable of recovering architecture of the colony and pellicle biofilm to the tapA mutant, while the TapA 1-50 form lacked this ability ( Figure 1c). We, therefore, made constructs with single codon deletions in the region encoding TapA amino acids between 60 and 50 (see Figure 1b). We established that amino acids 1-57 (inclusive) represented the minimal form of TapA that is capable of reinstating colony and pellicle biofilm architecture to the tapA deletion strain. This conclusion was reached through a visual analysis of colony biofilm and pellicle rugosity (Figures 1c and S1b,c). Additionally, as the level of TasA (calculated molecular mass 25.7 kDa) in the biofilm is substantially reduced in the absence of functional TapA (Romero et al., 2014) (Figure S1d), the bioactivity of the tapA truncations was supported by immunoblot analysis which showed a recovery of TasA levels back to those seen for NCIB3610 ( Figure S1d). The identification of this region of tapA as sufficient for TapA activity is consistent with, but significantly extends, the previous identification of amino acids 50-57 as being needed for TapA function in the context of the full-length protein (Romero et al., 2014).
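
The truncation mapping described above amounts to finding the shortest 3' truncation that still complements the tapA deletion. The following is a minimal sketch of that logic, using only the outcomes reported in the text and Figure 1c; the function name and the assumption that function is monotonic with construct length are illustrative choices, not part of the original analysis.

```python
# Complementation outcomes for 3' truncations of tapA, as reported in the text
# and Figure 1c. Key: last amino acid encoded; value: rugose architecture restored?
complementation = {
    60: True,
    59: True,
    58: True,
    57: True,
    56: False,
    50: False,
}


def minimal_functional_end(results):
    """Return the shortest construct end that still complements.

    Assumption (illustrative): function is monotonic, i.e. every construct
    longer than the minimal functional one also complements.
    """
    functional = sorted(end for end, ok in results.items() if ok)
    non_functional = sorted(end for end, ok in results.items() if not ok)
    if not functional:
        raise ValueError("no functional construct in the series")
    shortest = functional[0]
    # Check the monotonicity assumption before trusting the answer.
    if non_functional and non_functional[-1] > shortest:
        raise ValueError("results are not monotonic in construct length")
    return shortest


minimal_end = minimal_functional_end(complementation)
print(f"Minimal functional unit: TapA 1-{minimal_end}")
```

With the outcomes listed, the sketch reports TapA 1-57, matching the conclusion drawn from visual scoring of rugosity.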

| Amino acids critical for function in the minimal functional region of TapA
We were interested in the features within the TapA minimal form that conferred activity. The in silico predicted TapA signal sequence comprises the first 43 amino acids (Petersen et al., 2011) and is purported to be cleaved by a specialised signal peptidase, SipW (Stover and Driks, 1999a). Here we demonstrate that the first 43 amino acids are not needed for activity, as the 43 amino acid sequence can be replaced with the 28 amino acid TasA signal sequence (which is also cleaved by SipW; Stover and Driks, 1999a;Stover and Driks, 1999b). More specifically, when the chimeric construct, P IPTG -tasAss-tapA 44-253 , was expressed in the tapA deletion strain, rugose architecture was fully reinstated to the colony and pellicle biofilms (Figure S2a,b). Therefore, the influence of amino acids 1-43 of TapA with respect to protein function was excluded from further analysis. with, but extend those previously published that revealed tapA from B. amyloliquefaciens was able to functionally replace TapA of B. subtilis (Romero et al., 2014).
FIGURE 1 The tapA coding region is functional when truncated. (a) Colony biofilms formed by NCIB3610, ΔtapA (NRS3936), +P IPTG -tapA (NRS5045) and ΔtasA (NRS5267); (b) Alignment of TapA protein sequences from B. subtilis, B. pumilus, B. amyloliquefaciens and B. paralicheniformis. The percentage amino acid sequence identity with regard to the B. subtilis TapA sequence is as follows: B. pumilus, 42%; B. amyloliquefaciens, 49%; and B. paralicheniformis, 38%. The bold underlined sequence represents the signal sequence; the black boxes indicate identical amino acids. The constructs generated and their ability to restore rugose biofilm architecture to the ΔtapA deletion strain when expressed from an ectopic position on the chromosome are indicated by the inverted triangles above the amino acid sequence; (c) Representative colony biofilms formed by +P IPTG -tapA 1-60 (NRS6044), +P IPTG -tapA 1-50 (NRS6002), +P IPTG -tapA 1-59 (NRS6043), +P IPTG -tapA 1-58 (NRS6042), +P IPTG -tapA 1-57 (NRS6041) and +P IPTG -tapA 1-56 (NRS6025). In (a) and (c), biofilms were grown at 30°C for 48 hr in the presence of 25 µM IPTG. Biofilm images are representative of at least three independent biological replicates. The scale bars represent 1 cm.
To identify amino acids critical to TapA function, site-directed mutagenesis was used to generate a series of constructs containing systematic substitutions in the tapA 44-57 coding region in the context of the minimal TapA 1-57 construct (Figure 2c). The variant constructs were introduced into the tapA deletion strain and the ability of the variant TapA forms to restore architecture to colony biofilms was assessed (Figure 2d-r). One key finding was that the length of the TapA 1-57 variant form is the important feature driving TapA 1-57 bioactivity, as provision of the tapA coding region for amino acids 1-56 was unable to support biofilm recovery (Figure 1c), whereas that encoding TapA 1-57, or a variant of TapA 1-57 in which threonine 57 was replaced with alanine, was biologically active (Figures 1c and 2c). We also identified amino acids F45, D47, F51, D52, V53 and L55 as critical for TapA function, where the hydrophobicity of the amino acids V53 and L55 was a key feature mediating biological activity. Further analysis will be required to elucidate the exact role the amino acids play in facilitating TapA function (Table 1).
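
The critical residues named above can be grouped by side-chain character to make the hydrophobic/polar split explicit. This is a minimal sketch; the two residue classes are a coarse, illustrative simplification, and the helper name is hypothetical.

```python
# Residues reported in the text as critical for TapA 1-57 function.
critical_residues = ["F45", "D47", "F51", "D52", "V53", "L55"]

# Coarse side-chain classes (illustrative simplification, not a formal scale).
HYDROPHOBIC = set("AVLIMFWC")
POLAR = set("DEKRHNQSTY")


def classify(residue):
    """Split a label such as 'F45' into (amino acid, position, class)."""
    aa = residue[0]
    position = int(residue[1:])
    if aa in HYDROPHOBIC:
        kind = "hydrophobic"
    elif aa in POLAR:
        kind = "polar"
    else:
        kind = "other"
    return aa, position, kind


# Group the residue labels by class, preserving their order in the sequence.
by_class = {}
for res in critical_residues:
    _, _, kind = classify(res)
    by_class.setdefault(kind, []).append(res)

for kind, members in by_class.items():
    print(f"{kind}: {', '.join(members)}")
```

Under this grouping, four of the six critical positions (F45, F51, V53 and L55) are hydrophobic and two (D47 and D52) are polar.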

| TapA is detected in vivo at a low molecular mass
As the truncated variant of TapA covering amino acids 1-57 (inclusive) is sufficient for the mature architecture of the colony biofilm to develop, we hypothesised that TapA could be processed in vivo to a smaller, potentially active, form. To probe the molecular weight of TapA in the colony biofilm we raised a custom TapA antibody that was able to detect recombinant TapA 44-253 protein as a single band at ~30 kDa by immunoblot (calculated molecular mass

| The C-terminus of TapA is processed in vivo
To understand in more detail the banding pattern detected when using the αTapA antibody to challenge protein extracts of the WT strain, we assessed the immunoblot banding profile for the tapA mutant that expressed either the tapA
[Figure caption fragment] Green arrows represent predicted β-strands; residues critical for function in B. subtilis TapA 1-57 are shown in colour: hydrophobic residues are in blue; polar residues are in orange; (b) Representative colony biofilms formed by +P IPTG -tapA B_pum (NRS5046), +P IPTG -tapA B_amy (NRS5047) and +P IPTG -tapA B_para (NRS5741). n > 3. Biofilms formed by (c) +P IPTG
As a difference in the apparent mass of at least one of the bands detected in the tapA 1-253 or tapA 1-193 samples was not observed, the simplest explanation is that the 30 kDa band is a non-specific band, and therefore the only forms of TapA detected are the 18 kDa and 16 kDa forms. These findings allow us to deduce that TapA is likely to be processed in the WT strain such that the C-terminal domain is removed. These data also demonstrate that the ~30 kDa band is not specific to TapA, but rather is a non-specific protein detected by the TapA antibody that is dependent on a functional form of TapA being made by the cell.
FIGURE 3 TapA is processed in vivo. (a) Immunoblot analysis of 50 and 100 ng of recombinant TapA 44-253 using αTapA antibodies; (b) Immunoblot analysis of proteins extracted from biofilms formed by NCIB3610, ΔtapA (NRS3936), +P IPTG -tapA (NRS5045) (in the absence or presence of 25 µM IPTG as indicated) using αTapA antibodies. n = 2; (c) Biofilms formed by B. subtilis isolates NCIB3610, RO-FF-1, ATCC 9799 and B-14393T after growth at 30°C for 48 hr. Biofilm images are representative of at least three independent replicates. The scale bars represent 1 cm; (d) Immunoblot analysis of proteins extracted from biofilms formed by NCIB3610, RO-FF-1, ATCC 9799 and B-14393T using αTapA antibodies. n = 3; (e) Immunoblot analysis of proteins extracted from biofilms formed by NCIB3610, ΔtasA (NRS5267), ΔtapA (NRS3936), +P IPTG -tapA 1-253 (NRS5045), +P IPTG -tapA 1-193 (NRS5744), +P IPTG -tapA 1-183 (NRS5790) and +P IPTG -tapA 1-57 (NRS6044) using αTapA antibodies. About 25 µM IPTG was used as indicated, n = 2; Arrow 1 highlights the ~30 kDa band, arrow 2 the ~18 kDa band, and arrow 3 the ~16 kDa band. (f) Linear schematic of TapA outlining the predicted domains: signal peptide (SP) in grey; minimal functional unit (mfu) in yellow; and the crystal structure (residues 75-190, PDB 6HQC) in teal and light blue. The one disulphide bond, between Cys92 and Cys188, is shown in stick representation. The region of the folded domain that would be missing in the tapA 1-183 construct is in light blue, in both the linear and tertiary structure representations. The crystal structure was visualised using PyMOL 2.0.
The reason why TapA 1-183 (and other forms with more extreme truncations) cannot be detected by immunoblot can be informed by analysis of the recently released crystal structure of TapA  (PDB 6HQC; Roske et al., 2018). TapA 75-190 forms a β-sandwich most structurally similar to the macroglobulin folds found in the bacterial lipoprotein α-2-macroglobulins from Escherichia coli (PDB 4ZIQ; Garcia-Ferrer et al., 2015) and Salmonella typhimurium (PDB 4U4J; Wong and Dessen, 2014) (Figure 3f). The structure of TapA contains a disulphide bond between Cys-92 and Cys-188. Therefore, the TapA 1-183 variant protein would lack the disulphide bond and would lose the C-terminal β-strand, which would disrupt one of the β-sheets. Thus, the TapA 1-183 construct is highly likely to render the usually folded domain unstable, and therefore more prone to degradation. This would result in TapA being undetectable by immunoblot. We reason that as TapA is active as a minimal unit of 57 amino acids, this N-terminal portion of the protein must remain present, albeit not detectable, in the colony biofilm formed by strains that express either tapA 1-183 or tapA 1-57. The development of a rugose architecture of a colony biofilm when TapA bands cannot be detected using the αTapA antibody indicates that the bands are not representative of the region of the protein responsible for conferring rugose architecture.
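
The structural argument above can be summarised as a simple check of which truncated constructs still encode both cysteines of the disulphide bond and the full folded domain. This is a minimal sketch using only the residue numbers given in the text; the variable and function names are illustrative.

```python
# Residue numbers taken from the text (structure PDB 6HQC).
CYS_PAIR = (92, 188)        # disulphide bond between Cys-92 and Cys-188
FOLDED_DOMAIN = (75, 190)   # residues resolved in the crystal structure

# Constructs discussed in the text; value: last amino acid encoded.
constructs = {
    "tapA 1-253": 253,
    "tapA 1-193": 193,
    "tapA 1-183": 183,
    "tapA 1-57": 57,
}


def describe(end):
    """Report whether a 1-to-end construct keeps the disulphide and the domain."""
    has_disulphide = all(cys <= end for cys in CYS_PAIR)
    domain_complete = end >= FOLDED_DOMAIN[1]
    return {"disulphide": has_disulphide, "domain_complete": domain_complete}


summary = {name: describe(end) for name, end in constructs.items()}

for name, features in summary.items():
    print(name, features)
```

Only tapA 1-253 and tapA 1-193 retain both features, which is consistent with these being the constructs for which TapA bands were detected by immunoblot.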

| Recombinant TapA is cleaved by extracellular proteases
The consistent detection of TapA as low molecular mass forms in the immunoblots suggests that TapA is subjected to processing in vivo. These findings led us to the hypothesis that TapA
To test the potential role of the extracellular proteases in TapA cleavage in vivo, we constructed a strain that lacked eight secreted exoproteases (hereafter KO8: NRS5645 bpr, vpr, nprB, mpr, epr, aprE, nprE and wprA). The KO8 strain showed no evidence of proteolytic activity when grown for 20 hr on LB agar supplemented with 1.5% (w/v) milk, similar to the degU deletion strain, which was used as a negative control (Figure S3a) (Msadek et al., 1990). Moreover, recombinant TapA
Taken together, these data reveal that TapA is cleaved in vivo at two distinct positions: one that generates the ~18 kDa form and another that yields the ~16 kDa form.

| Vpr cleaves TapA in a specific manner
Cleavage of TapA into the ~16 kDa form was completely blocked in the KO8 strain. To elucidate which exoprotease was responsible we examined the TapA banding profile in the suite of strains built during the construction of the NCIB3610 KO8 strain. These are called strains KO1 to KO7 (see Table S1). We noted that the ~16 kDa TapA band was not detected in the protein samples extracted from colony biofilms once the coding region for vpr was deleted (Figure 4d).
Consistent with this, in a single vpr::erm NCIB3610 strain (NRS7010) only one αTapA antibody reactive band was detected after 48 hr growth under biofilm formation conditions, the band at ~18 kDa ( Figure 4e). We, therefore, conclude that Vpr is needed for the generation of the ~16 kDa form of TapA.

| Cleavage of TapA by the extracellular proteases is not needed for biofilm architecture
Having demonstrated that TapA is processed in the extracellular environment, we asked if processing was a prerequisite for the architecture of the B. subtilis colony and pellicle biofilms to develop.
We found that the KO8 strain displayed a minor defect in colony biofilm architecture, with the exoprotease-free strain showing fewer of the large corrugations seen in the colony biofilm formed by the parental strain (Figure 5a). The macroscale images of the pellicle biofilms formed by the two strains were largely comparable (Figure 5a).
Imaging of the colony biofilms at the microscale using confocal microscopy revealed very little difference in wrinkle formation in comparison to the WT strain; however, there was a small, but consistent, difference in the overall height of structures in the biofilm (Figures 5b and S3b). Images of the biofilms were
We reasoned that if the differences in the rugosity displayed by the KO8 strain compared with WT were due to a lack of TapA cleavage, introduction of the minimal functional unit coding region of tapA, TapA 1-57 , at the heterologous amyE location would recover biofilm architecture. However, we found that TapA 1-57 was unable to reinstate the larger architectural features to the KO8 biofilms (Figure 5d). We confirmed that the TapA 1-57 variant form was functional in the KO8 strain, as both the full-length and minimal tapA coding regions could restore the colony biofilm morphology displayed by the KO8 ΔtapA biofilms, such that the phenotype mimicked the parental KO8 strain (Figure S3c). In addition, we assessed the direct impact of lacking Vpr using a vpr single deletion strain. This strain is unable to cleave TapA to yield the ~16 kDa form. The vpr deletion isolate retained a colony biofilm morphology that could not be distinguished from that of the WT NCIB3610 (Figure S3c). Therefore, we conclude that cleavage of TapA, at least to a 16 kDa form, is not a prerequisite to generate functionally active TapA, as assessed by the structure of the colony and pellicle biofilm. Thus, the subtle impact on biofilm architecture which manifests upon removal of the eight extracellular protease genes in the KO8 strain must be TapA independent.

| DISCUSSION
An extracellular matrix composed of polymers is critical to structured sessile communities of bacterial cells called biofilms. In B. subtilis, this adhesive matrix is needed for assembly of both pellicle and colony biofilms in addition to the attachment to plant roots (Beauregard et al., 2013). Biofilm matrix production by B. subtilis depends on the protein TapA, which has a role in promoting TasA stability and fibre formation in vivo (Branda et al., 2006;Romero et al., 2010;Romero et al., 2014;Abbasi et al.,  proteins, but the significance of this shared similarity is unknown.
Finally, we cannot rule out the possibility that amino acids 58-253 of TapA serve a distinct function beyond promoting rugose biofilm formation, but what role it may play is undefined.
The cleavage of TapA to lower molecular weight forms is con- as efficiently as seen in the WT strain ( Figure 6), or that this cleavage can occur via a non-enzymatic reaction over an extended timeframe.
Vpr plays a specific role in the cleavage of TapA and is a serine protease that is part of the family of subtilisin-like proteases first noted using an unbiased screen seeking to identify extracellular proteases produced by B. subtilis (Sloma et al., 1991). Production of Vpr is controlled at the level of transcription by the activator CodY (Barbieri et al., 2015) and the repressor DnaA (Smith and Grossman, 2015). Transcription of vpr is also promoted under conditions of phosphate starvation (Allenby et al., 2005). Vpr has not been specifically linked with growth or sporulation under nutrient-rich conditions and is part of a family of proteins that are broadly classified as being used for nutrient acquisition (Rawlings et al., 2010).
However, Vpr is needed to produce quorum sensing signalling molecules, with a direct role in the generation of the five amino acid signal CSF from proCSF in the extracellular environment (Lanigan-Gerdes et al., 2007). Therefore, our findings broaden the social biology behaviours exhibited by B. subtilis that Vpr is associated with.
The activity of extracellular proteases is linked with biofilm matrix formation in other species of bacteria. For example, the matrix protein RbmA of Vibrio cholerae specifically binds the biofilm exopolysaccharide in a manner dependent on its structural configuration, which promotes biofilm formation (Fong et al., 2017). However, RbmA is cleaved during biofilm formation by the HapA, PrtV and IvaP proteases (Berk et al., 2012;Hatzios et al., 2016), which releases a form of RbmA that allows for recruitment of both exopolysaccharide-producing and exopolysaccharide-nonproducing cells to the growing community. The exact consequences of RbmA processing for biofilm formation in natural communities remain to be addressed. Based on this precedent, we speculated that TapA processing may generate an active variant of TapA and tested the impact of deleting the genes encoding the extracellular proteases on biofilm formation. We found that TapA processing to lower molecular weight forms was altered in the KO8 strain lacking Bpr, Vpr, NprB, Mpr, Epr, AprE, NprE and WprA. However, in contradiction to our hypothesis, the colony biofilm architecture differences exhibited by the KO8 strain could not be mitigated by expression of the minimal functional coding region of tapA. We, therefore, conclude that the alteration in KO8 colony biofilm architecture is likely to be due to pleiotropic effects of deletion of the exoprotease genes, rather than being a specific impact of altered TapA processing. It is possible that the lack of extracellular proteases could impact the abundance of quorum sensing peptides in the extracellular environment. This is in line with evidence demonstrating that exoproteases control both the production and degradation of the quorum sensing signalling molecule ComX. An accumulation of quorum sensing peptides in the biofilm microenvironment may have a global impact on the physiology of the KO8 strain due to an alteration in cell signalling pathways (Miller and Bassler, 2001;Kalamara et al., 2018). The susceptibility of TapA to cleavage by the exoproteases raises the possibility that TapA is degraded (recycled) after it has fulfilled its role in biofilm formation. The alternative notion is that TapA plays a second role in B. subtilis physiology that is yet to be elucidated. This hypothesis is strengthened when one considers that the region in question is the conserved and stable core of the protein.

| CONCLUDING REMARKS
Through this analysis we have expanded our knowledge of the protein TapA, which is needed for biofilm formation by B. subtilis. We have revealed that TapA 1-57 is a key component of the functional form of TapA in vivo, allowing rugose biofilm architecture to manifest. In contrast, the main body of the protein is entirely dispensable in this experimental setup, despite it containing the most conserved amino acid sequences. The exact mechanism by which TapA stimulates TasA stability in vivo requires further elucidation, but the amino acids between positions 45 and 57 that were found to be critical for function provide a starting point for analysis.

FIGURE 6 Schematic of TapA processing in B. subtilis. The full-length secreted form of TapA is predicted to have a molecular weight of 24.18 kDa. The 18 kDa band observed by αTapA immunoblot is generated by the cleavage of the C-terminus either by an as yet unidentified protease or by a non-enzymatic process. Processing at the N-terminus by the exoprotease Vpr releases a ~16 kDa band, leaving the structured β-sandwich core of the protein intact. The minimal functional unit (mfu) is shown in pale orange. The region for which there is structural information is shown in green. The approximate locations of protease cutting sites are shown as triangles. The minimal functional form of the protein still conferred structure to biofilms, confirming that the bands detected by immunoblot do not represent the important functional unit of the protein in this context.
We have shown that TapA is processed at two distinct positions in vivo but that cleavage by the self-produced exoproteases is not an essential step to generate a functional protein form. We have also identified Vpr as a protease with a specific role in cleavage. Despite this, exactly how TapA functions to promote biofilm formation and the physiological role of the core of TapA remain to be elucidated.

| Growth media and additives
Lysogeny broth (LB) was prepared using 10 g of tryptone, 10 g of

| Strain construction
Strains used in this study are detailed in Table S1. Complementation alleles and antibiotic resistance cassette marked gene deletions were moved between strains using either SPP1-mediated phage transduction (Verhamme et al., 2007)
The ΔtapA deletion strain was generated by allelic exchange by a method similar to that previously published using the pMAD plasmid (Arnaud et al., 2004). Briefly, a 395 bp upstream region was amplified by PCR with primers NSW1308 and NSW1332 and a 641 bp downstream region was amplified using primers NSW1333 and NSW1334. Both fragments were cloned into the pMini-MAD vector (Patrick and Kearns, 2008) to generate plasmid pNW685.
To construct the in-frame deletions in all eight genes encoding the secreted proteases in B. subtilis NCIB3610 comI (Patrick and Kearns, 2008), the BKE collection was utilised (Koo et al., 2017), in which single genes have been replaced with a cassette providing resistance to erythromycin. Genomic DNA was extracted from strains in the BKE collection and used to transform competent B. subtilis 3610 comI (Konkol et al., 2013) before selection on LB erythromycin plates. The erythromycin cassette contains lox sites and was subsequently removed, leaving a 150 base pair scar, by the action of a Cre recombinase, which was expressed from the heat-sensitive plasmid pDR244. In cases in which transformation with genomic DNA proved unsuccessful, the mutation was introduced by phage transduction with SPP1 phage. All the strains were examined using PCR and DNA sequencing to confirm the specificity of the region deleted from the chromosome. The intermediate strains are fully detailed in Table S1. The in-frame tapA deletion was introduced using plasmid pNW685 described above.
The variant tapA coding regions were introduced into the B. subtilis chromosome at the amyE locus. Double recombination events were identified by assessing the production of α-amylase on LB growth medium supplemented with 1% (w/v) soluble starch.

| Plasmid construction
Plasmids (Table S2) were constructed by standard methods using the primers detailed in Table S3.

| In vivo analysis of exoprotease production
Detection of exoprotease production was conducted as previously described (Verhamme et al., 2007) using LB agar plates supplemented with 1.5% (w/v) dried milk powder. Strains were grown in 3 ml of LB broth at 37°C to an OD 600 of ~1. The cultures were normalised and 10 µl of the prepared cell culture spotted onto the plate. The samples were then grown for 20 hr at 37°C prior to photography.

| Biofilm growth and analysis
Biofilm colonies were prepared by growing B. subtilis in 3 ml of LB broth at 37°C with aeration for ~3.5 hr, after which 10 µl of the culture was spotted onto an MSgg agar plate that was incubated at 30°C for 48 hr. Biofilm pellicles were prepared by growing B. subtilis in 3 ml of LB broth at 37°C with aeration for ~3.5 hr, after which 4 µl of the culture was inoculated into 2 ml of MSgg liquid in a 24-well plate that was incubated at 25°C for 72 hr. Imaging used a Leica MZ16 stereoscope (Leica Microsystems). About 25 µM IPTG was included in the growth medium to induce expression from the P spank promoter as indicated.

| Protein extraction from biofilms
Biofilms were isolated from MSgg plates with a sterile loop and suspended in 250 µl of BugBuster solution (Millipore) using a syringe with a 23 x 1-gauge needle until dispersed. The samples were sonicated at an amplitude of 20% power for 5 s. Sonicated biofilms were incubated at 26°C for 20 min with shaking at 1,400 rpm before centrifugation for 10 min in a benchtop centrifuge at 13,000 rpm.
The liquid phase was retained for further analysis by SDS-PAGE and/ or immunoblot. Protein concentration of biofilm lysates was determined by measuring absorbance at 280 nm (NanoDrop spectrophotometer) or using the DC protein assay (BIO-RAD) which is based on the Lowry assay (Lowry et al., 1951).

| Protein purification
Recombinant B. subtilis TapA 44-253 , mTasA (Erskine et al., 2018b) and fTasA (Erskine et al., 2018b) proteins were produced and separated from a Glutathione S-transferase-tag with a tobacco etch virus (TEV) protease-cleavage site using the pGEX-6P-1 system. The pGEX-6P-1 plasmid carrying the gene encoding the protein was introduced into E. coli strain BL21 (DE3) pLysS. The cells were grown overnight in 5 ml of LB broth and used to inoculate auto-induction media (Studier, 2005) supplemented with ampicillin (100 µg/ml) at a ratio in 25 ml of purification buffer supplemented with 1 mM DTT and 0.5 mg of TEV protease to release the protein from the GST-tag.
The solution containing target protein, TEV protease, free GST and the beads was loaded onto the gravity flow column. This removed the used beads which stay on the column. The flow-through was added to 250 µl of Ni-nitrilotriacetic acid agarose (Qiagen) slurry to remove the TEV protease and 750 µl of glutathione sepharose 4B to remove the free GST-tag. The mixture was incubated at 4°C with agitation overnight, and then passed through a gravity flow column.
The purified protein was concentrated using a Vivaspin 20 concentrator (MW cut-off of 5,000 or 10,000; Sartorius). The protein concentration was determined by measuring absorbance at 280 nm (NanoDrop spectrophotometer) and the protein was then analysed by separating ~30 µg by SDS-PAGE.

| Protein stability in culture supernatant
To collect spent culture supernatant the following process was used.
Initially, lawn plates were set up by suspending a single colony in

| Antibody production
A custom antibody that could be used to detect TapA from B. subtilis was raised in a rabbit using purified recombinant TapA 34-253 as the antigen (Eurogentec). The antibodies specific to recombinant TapA 34-253 were purified from the serum using standard methods by the MRC Protein reagents and services team (https://mrcppureagents.dundee.ac.uk/our-services/custom-antibody-production).

| Immunoblot analysis
Samples to be analysed by immunoblotting were separated by SDS-PAGE under denaturing conditions. The proteins were transferred to hydrophobic polyvinylidene difluoride (PVDF) membranes (Immobilon-P [Millipore]) by electroblotting. The membranes were first blocked with a 5% (w/v) semi-skimmed dry milk solution in TBS-tween 0.2% (v/v) for at least 1 hr at room temperature or overnight at 4°C. The membrane was then incubated with the primary antibody overnight at 4°C (dilutions as follows: 1:5,000 αTapA, 1:25,000 αTasA). After incubation with the primary antibody, the membrane was washed three times with TBS-tween 0.2% (v/v) to remove the unbound primary antibody and incubated with the species-specific secondary HRP-conjugated antibody (Goat α Rabbit, dilution 1:5,000) for 1 hr at room temperature in TBS-tween 0.2% (v/v). The wash steps were repeated before development was induced with Enhanced Chemi-Luminescence reagents (ECL; BioRad Clarity). An electronic image was captured using the GeneGnome (SynGene) system.

| Bioinformatics analysis
Protein sequences of TapA homologues were aligned using Clustal Omega with default settings (Sievers et al., 2011). Percentage identity between TapA orthologues was calculated with reference to B. subtilis TapA using the pairwise alignment function on the Jalview 2 workbench (Waterhouse et al., 2009). Signal peptides for all TapA variants were predicted using the SignalP 4.1 server set to the organism group 'Gram-positive' (Petersen et al., 2011).
The crystal structure of TapA  was retrieved from the Protein Data Bank (PDB 6HQC) (Roske et al., 2018). The DALI server (Holm, 2019) was used for structural comparison to all structures in the Protein Data Bank. Structural visualisation was done using PyMOL 2.0 (The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC).
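To make the idea of percentage identity from a pairwise alignment concrete, here is a minimal Python sketch. It is not the Jalview implementation: alignment tools differ in which columns they count in the denominator, and this version assumes that every column in which at least one sequence has a residue is counted. The two toy sequences are invented for illustration and are not TapA sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues between two aligned sequences.

    Assumption: columns gapped in both sequences are ignored; columns
    gapped in only one sequence count as compared but not identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # column carries no residue in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1  # both are residues here, since a shared gap was skipped
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned pair (hypothetical sequences, 7 columns, 5 identical).
toy_reference = "MKT-AYL"
toy_orthologue = "MRTQAYL"
toy_identity = percent_identity(toy_reference, toy_orthologue)
print(f"{toy_identity:.1f}% identity")
```

Choosing a different denominator, such as the length of the shorter sequence, would change the reported value, so the convention should be stated alongside any identity figure.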

| Confocal microscopy
For confocal microscopy, two biological replicates of WT and KO8 strains (NRS5634 and NRS6991 respectively) constitutively expressing the coding region for the Green Fluorescent Protein (GFP) were grown in liquid LB at 37°C and 200 rpm for approximately 4 hr.
The cell density was then normalised to an OD 600 of 0.9. From each of these biological repeats three technical repeats were prepared by depositing 1 µl of cells onto MSgg agar 1.5% (w/v) plates and incubated at 30°C. Imaging was performed on a Leica SP8 upright confocal with a 488 nm excitation laser at 2% power and a 10x 0.3NA dry objective, detecting emission wavelengths from 490 to 600 nm with the pinhole set to 1AU for 525 nm. A system-optimised z-step of 3.88 µm was used to gather a z-stack encompassing the structures of a field of view in the central region of each biofilm at 12, 18, 24 and 48 hr after the cells were initially deposited.
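One common way to bring a culture to a target optical density is by dilution, using C1·V1 = C2·V2. The text does not say how the normalisation was carried out, so the Python sketch below is only an assumption about the approach. It also assumes OD 600 scales linearly with cell density over the range used, and the measured OD and final volume are hypothetical.

```python
def dilution_volumes(measured_od, target_od, final_volume_ul):
    """Return (culture_ul, diluent_ul) needed to reach target_od.

    Uses C1 * V1 = C2 * V2. Raises if the culture is already below the
    target, because dilution cannot increase the optical density.
    """
    if measured_od < target_od:
        raise ValueError("culture is below the target OD; dilution cannot help")
    culture_ul = final_volume_ul * target_od / measured_od
    return culture_ul, final_volume_ul - culture_ul


# Hypothetical reading of OD600 = 1.8, normalised to 0.9 in 1,000 µl.
example_culture_ul, example_diluent_ul = dilution_volumes(1.8, 0.9, 1000.0)
print(f"mix {example_culture_ul:.0f} µl culture with {example_diluent_ul:.0f} µl medium")
```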

| Image analysis
Image data were stored in an OMERO server (Allan et al., 2012) and analysed in Matlab 2016b. For each image, the z-stack was downloaded into Matlab using the OMERO.matlab toolkit (Allan et al., 2012) and segmented using Otsu thresholding (Otsu, 1979). The xz orthogonal plane at each y position of the segmented image was then analysed for the 'top-most' segmented pixel in z, and these values were used to build up a single xy matrix of z positions representing the structure height at every pixel location. To calculate the volume of structures in the centre of biofilms at each time point, the product of the height matrix and the voxel size was taken. The code is available at https://github.com/mporter-gre/Calculation-of-Feature-Volumes-of-Biofilms. Data were categorised by strain and time point and represented as scatterplots drawn in GraphPad Prism 8.
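The steps described above (Otsu segmentation, a height map built from the top-most segmented pixel, and a volume obtained from the height map and voxel size) can be sketched in plain Python as follows. This is an illustrative re-expression, not the authors' Matlab code, which is available at the repository cited above. It assumes that slice 0 lies at the base of the biofilm, that heights are counted in whole slices, and that the lateral pixel size is 1.0 µm (a hypothetical value); the 3.88 µm z-step is taken from the confocal settings.

```python
def otsu_threshold(values, levels=256):
    """Otsu's method: pick the threshold t that maximises the between-class
    variance. Pixels with intensity > t are treated as foreground."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # no foreground pixels left
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def height_map(stack, threshold):
    """For each (y, x), return the top-most segmented slice index + 1
    (0 if no slice is segmented). stack is indexed as stack[z][y][x]."""
    nz, ny, nx = len(stack), len(stack[0]), len(stack[0][0])
    heights = [[0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            for z in range(nz - 1, -1, -1):  # scan from the top downwards
                if stack[z][y][x] > threshold:
                    heights[y][x] = z + 1
                    break
    return heights


def structure_volume(heights, voxel_volume_um3):
    """Sum of heights (in slices) multiplied by the volume of one voxel."""
    return sum(sum(row) for row in heights) * voxel_volume_um3


# Toy 3-slice, 2 x 2 pixel stack with invented 8-bit intensities.
toy_stack = [
    [[200, 200], [200, 10]],   # z = 0 (assumed base)
    [[200, 10], [10, 10]],     # z = 1
    [[10, 10], [210, 10]],     # z = 2 (top)
]
flat = [v for plane in toy_stack for row in plane for v in row]
toy_threshold = otsu_threshold(flat)
toy_heights = height_map(toy_stack, toy_threshold)
toy_voxel_um3 = 1.0 * 1.0 * 3.88   # hypothetical 1.0 µm pixels, 3.88 µm z-step
toy_volume = structure_volume(toy_heights, toy_voxel_um3)
print(toy_threshold, toy_heights, round(toy_volume, 2))
```

Because only the top-most segmented pixel is used, any unsegmented gap beneath a structure is included in its height, as in the pixel at (y = 1, x = 0) of the toy stack.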

ACKNOWLEDGEMENTS
We are grateful to Dr. Laura Hobley for the tapA deletion strain, Dr. Laura D'Ignazio for plasmid pNW1600 and Rachel Gillespie for construction of several strains and plasmids.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.