The dynamic versus the stable team: The unspoken question in large‐scale agile development

The importance of the team, its internal dynamics, and its performance are widely recognized within the software engineering community. While popular frameworks identify wholeness, stability over time, and smallness as important factors, they offer little guidance on how to form teams that achieve these three characteristics. The objective of this study is to investigate how these team characteristics interact in large‐scale software development contexts, particularly focusing on the impact of stable and dynamic teaming approaches. This was done through a multivocal study of literature, followed by individual semi‐structured interviews with 19 engineers from two companies and validation workshops with an additional two companies from unrelated industry segments. The study results show that the question of stable versus dynamic approaches to forming software engineering teams is largely unaddressed in industry, with stable teams representing a habitual default option. Meanwhile, both stable and dynamic teams clearly have respective strengths and weaknesses, calling for careful consideration of the most suitable approach in any given situation. To support such consideration, this paper presents a model of how team stability, wholeness, and smallness interact. This model is found relevant, accurate, generalizable, and useful by practitioners.


1 | INTRODUCTION
Agile and related methodologies, such as eXtreme Programming and DevOps, while new and sometimes controversial when they were first proposed, are nearly ubiquitous in today's software engineering industry, arguably to the point where they are not debated, but merely taken for granted. Even so, interpretation and application of these methodologies can vary: As Philippe Kruchten points out [1], Agile concepts experience a rapid drift as they spread, and this is also true of related practices under the larger Agile umbrella [2]. Regardless of differences in their interpretation, the team as the basic organizational building block is a salient feature of these practices. Much has been written about the Agile team, such as how to achieve productivity [3], quality [4], or job satisfaction [5], or how to be effective in a distributed team [6]. Work has also been done on how to collaborate within the team, with methods such as pair programming [7,8] or mob programming [9]. Emphasis is often placed on the ability of this team to be cross-functional and manage the entire value stream [10], particularly in the context of DevOps [11] (understood as a combination of values, principles, methods, practices, and tools [2]).
In the available literature on Agile methodologies, there is frequently an implicit assumption (and sometimes an explicit one, as in the case of Scrum [12], the Spotify model [13], or SAFe [14]) of this team as a fairly stable and long-lived construct, resulting in cohesion and psychological safety while tracking and gradually improving its productivity (or velocity) over time [15]. However, there is an established research gap when it comes to the impact of team longevity on team outcomes [16,17].

1.1 | Teaming in large-scale development
There is ample evidence of the efficacy of this approach of teams as stable and long-lived in similarly stable and predictable development contexts, within a problem domain that is small enough for a team of reasonable size to "own" it. Indeed, many of the examples given in literature of an Agile team are taken from domains such as mobile device apps [18,19] or websites [20], where even if requirements may change, the technology stack is well understood, and new requirements can in a straightforward manner be mapped to an extant team possessing the necessary skills and expertise to handle them.
From the very early days, however, it has been suggested that Agile methodologies work best in a small context [21], and various frameworks for scaling Agile methods, such as Disciplined Agile Delivery (DAD) [22], the Spotify model [13], Scaled Agile Framework (SAFe) [14], and Large Scale Scrum (LeSS) [23,24], have been proposed to better apply Agile principles and methods in large-scale development [25], recognizing that a larger scale implies particular challenges [26]. These frameworks do not, however, address the question of how to compose the team (or squad, as it is called in the Spotify model) such that all necessary skills and expertise are contained within it. Rather, that such teams exist is presented as a given starting condition, for example, stating that "they have all the skills and tools needed […] and no blocking dependencies to other squads" [13] or that "Agile teams have all the skills necessary to define, build, test, and deploy value in short iterations" [14]. These teams can then be assigned tasks "based on their interests, strengths and desire to group related items" [23].

1.2 | Forming good teams
The frameworks mentioned above stress different aspects of the team formation process, such as self-organization based in shared culture and aims [27] and setting the conditions for them to self-manage [24]. Disciplined Agile Delivery (DAD) [22] provides the most explicit guidance on team formation, claiming that
• Teams should be as small as possible;
• Teams need to be stable so that they may evolve over time; and
• Teams should be "whole" in the sense of having all necessary skills internally, so that they are not dependent on people outside of the team.
While DAD is the most explicit about these points, they are also highlighted by the other frameworks and they are well supported by industry experience. It is worth considering, however, whether these desirable team characteristics (smallness, stability, and wholeness) may come into conflict and what happens if they do. While unaddressed by the aforementioned frameworks, situations where this question comes to the fore are easily posited. This is particularly the case in large-scale system development where the competences needed for wholeness are not limited to different software engineering skills (e.g., development, testing, deployment, or data analysis) but include in-depth domain expertise from different parts of the system. Then a team with the right competences for any one particular task may be formed, but what happens when the next task requires a different competence mix? Should the team be enlarged, sacrificing smallness? Should it be reformed, sacrificing stability? Or should it instead be kept intact, sacrificing wholeness and accepting strong dependencies on and coordination with people outside of the team?
This paper investigates the relationship between stable and dynamic approaches to teaming, in the larger context of how stability, wholeness, and smallness of software engineering teams relate to one another. Based on the author's personal experience as both engineer and researcher, having worked on and/or studied large-scale projects in multiple industry segments (e.g., multi-standard telecommunication networks, military aircraft and logistics systems), this is as salient and constant a question as it is unresolved in popular Agile frameworks and literature, suggesting the need for an in-depth investigation into team forming and performance in large-scale Agile development.

1.3 | A note on terminology
As in many other research fields, approaches to teaming are discussed with some divergence in terminology. To exemplify, what is described in one source as a "temporary team" [28] is very similar to what others name "task forces" [29] or "dynamic teams" [30]. Seeking to provide a general solution to this challenge and establish a shared terminology is outside of the scope of this paper, but to avoid misinterpretation, the terms large-scale, dynamic, stable, and teaming are used as follows in this study:
• A large-scale development context is one of hundreds or thousands of individuals representing a multitude of roles and competences, far more than a typically sized team of 5-10 members allows for. In other words, large scale does not imply large teams, but rather a multitude of teams working in parallel, raising the question of how individuals, responsibilities, and teams map to one another within that larger development context and how that mapping persists or changes over time.
• A dynamic team adapts its membership roster to suit the task at hand. This does not necessarily mean replacing all members all the time, but deliberately rotating out certain members if their expertise is not needed at the moment, and replacing them with new members with skills that are a better fit at the time. This does not imply that team members move on to a different product or business area, but rather stay within their domain expertise to contribute to another team, for example, implementing another feature that requires their competence. Note that this usage of "dynamic" is different from that of members being flexible in the roles they may assume within the team [31,32].
• A stable team is a team that strives to preserve a stable membership roster. Though no team can be truly stable (e.g., due to employee turnover), stable and dynamic teams are different in that the stable team is not specifically adapted to the task at hand. Rather, tasks are assigned to the team, which is expected to be versatile enough to cope with a variety of tasks without adjusting its roster. This means that what sets the stable team apart from a dynamic one is not the absence of change for any fixed extent of time, but is rather found in how it is formed, its intent, and its relationship to the larger organization. Emphasizing this point, as observed in this study (see Section 4.1.3), there may not necessarily be any conflict between dynamic teaming in this sense and organizational stability in the sense of maintaining long-term ownership and not being forced into reactive reforming of teams.
• Teaming refers to the act of coming together as a team. Consequently, a dynamic approach to teaming, as discussed in this paper, refers to forming and re-forming teams to fit current competence and skill requirements, rather than adapting and splitting tasks to fit the competencies and skills of extant teams.
In addition to dynamic and stable, numerous terms are used by both practitioners and academics to describe various modes of teaming (such as long-term team, project team, and temporary team), with similar divergence in meaning. To minimize ambiguity, this paper restricts itself to the first two terms in its discussion of the subject. In the interest of coverage, however, additional terms are included in the search for relevant literature (see Sections 2.1.1 and 2.1.2).
It should be noted that the terms dynamic and stable are applied regardless of whether the team is physically co-located or not. In other words, this study is agnostic as to a team's remote working practices. Similarly, the study is agnostic with regard to workflow management practices, such as Kanban [33], within the team. That being said, the potential interplay between dynamic and stable teaming approaches and other team practices is a promising area of further work (see Section 8.1).

2 | RESEARCH METHOD
In response to the reasoning in Section 1, the following research question was formulated: What are the consequences of stable versus dynamic approaches to teaming, in the larger context of stability, wholeness and smallness of software engineering teams?
To answer this question, a research method of multiple complementary steps [34] was devised. First, related work both inside and outside of the software engineering community was surveyed in order to present an accurate and comprehensive overview both of the current state of research and of attitudes and experiences. Second, the findings from this review informed an interview guide for individual interviews with practitioners from independent companies, having personal experience from working in stable and/or dynamic teams, in order to capture their views on the most salient claims and findings in literature. Third, the results were synthesized, analyzed, and discussed. Finally, the findings were validated in workshops with senior engineers in additional companies, previously uninvolved in the study. This step-wise process is illustrated in Figure 1; the solid edges of the figure represent the sequence flow, whereas the dashed edges represent the flow of information between the steps.

2.1 | Study of related work
The subject of effective collaboration in groups and teams is a broad and far-reaching topic with a large body of already published literature. From that perspective, the application within software engineering is merely a small subset, of which the Agile context is in turn an even smaller subset.
It is plausible both that teaming in software engineering has many commonalities with teaming in other domains and simultaneously that there are specific aspects of software engineering that create conditions, restrictions, and/or constraints unique to software teams. For this reason, it is important to study both software engineering literature in particular and literature on teaming more broadly. Furthermore, given how software engineering practices are highly influenced by "gray" literature [35] such as blogs and online discussion forums, a multivocal approach [36] is called for, where a very broad notion of what constitutes relevant literature is adopted. To reflect this, the study of related work was split into three parts: software engineering literature, gray software engineering literature, and literature outside of software engineering.

2.1.1 | Software engineering literature
Influenced by the guidelines of Kitchenham [37] (working from a defined review protocol with a defined and documented search strategy containing explicit inclusion and exclusion criteria), a literature review protocol was created with the intent of establishing the current state of research into dynamic versus stable teams in software engineering. The question driving this review was "What are the established consequences of dynamic versus stable teams in software engineering literature in general, and Agile literature in particular?" Google Scholar was selected as the search engine.
The choice of any search engine comes with pros and cons, though. For this study, the choice of Google Scholar was based on three factors: first, its wider coverage compared to other search engines, although this higher coverage comes with lower accuracy, requiring careful inspection and filtering of the results [38]; second, ease of access to practitioners seeking to make similar searches for themselves; third, the compatibility of search string syntax allowing the exact same searches to be performed for both academic and gray literature (see Section 2.1.2). That being said, it is important to be aware of potential biases in the chosen search engine, discussed further in Section 7.2.
As recommended by Wohlin [39], multiple sets of synonyms were searched to capture sources using different terminology and possibly written in the context of separate subcommunities. To this end, two sets of synonymous or related phrases were constructed: {"agile", "software development", "software engineering"} and {"team longevity", "stable team", "long-term team", "task force driven", "project team", "temporary team", "dynamic team"}. The two sets of phrases were then combined to generate a total of 21 search strings (e.g., "team longevity" AND "agile").
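The combination of the two phrase sets can be illustrated with a minimal Python sketch. The function and variable names are illustrative assumptions and are not taken from the study; only the phrases and the AND syntax come from the description above.

```python
from itertools import product

# Phrase sets as listed in the text above.
CONTEXT_PHRASES = ["agile", "software development", "software engineering"]
TEAMING_PHRASES = [
    "team longevity", "stable team", "long-term team", "task force driven",
    "project team", "temporary team", "dynamic team",
]


def build_search_strings(teaming, context):
    """Pair every teaming phrase with every context phrase, joined by AND."""
    # Hypothetical helper: the study does not describe how the strings were produced.
    return [f'"{t}" AND "{c}"' for t, c in product(teaming, context)]


search_strings = build_search_strings(TEAMING_PHRASES, CONTEXT_PHRASES)
print(len(search_strings))  # 7 teaming phrases x 3 context phrases = 21
print(search_strings[0])    # "team longevity" AND "agile"
```

Taking the Cartesian product of the two sets ensures that every phrase pair is covered exactly once, which is consistent with the 21 search strings reported.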
For each of the 21 searches, the following steps were followed:
1. Collect the top 10 results.
2. Exclude duplicates, re-publications of the same work, and non-academic results (e.g., white papers). This step is particularly important in mitigating the relatively low level of accuracy of Google Scholar [38].
3. Based on analysis of abstracts, exclude papers deemed irrelevant to the question informing the review protocol.
4. Conduct in-depth analysis of remaining papers and books.
The rationale for limiting the results to 10 per search was twofold. First, though the number of results per search varied from the low tens to the tens of thousands, pre-review experimentation revealed rapidly diminishing relevance after the first ten results. Second, it produced a total of 210 (non-unique) results, which was deemed a data set that, while not exhaustive, was sufficiently large to be representative and yet of manageable size.
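The screening steps of the protocol can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Result` record, its two flags, and the example titles are hypothetical, and matching duplicates by normalized title is a simplification. In the study itself, exclusion and relevance judgments were made manually, and re-publications of the same work would still require human inspection.

```python
from dataclasses import dataclass

TOP_N = 10          # results kept per search, as in the protocol
NUM_SEARCHES = 21   # number of search strings


@dataclass(frozen=True)
class Result:
    title: str
    is_academic: bool  # hypothetical flag, set during manual inspection
    relevant: bool     # hypothetical flag, set after reading the abstract


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())


def screen(results_per_search):
    """Apply steps 1-3; step 4 (in-depth analysis) remains a manual activity."""
    seen = set()
    kept = []
    for results in results_per_search:
        for r in results[:TOP_N]:                 # step 1: top 10 per search
            key = normalize(r.title)
            if key in seen or not r.is_academic:  # step 2: duplicates, non-academic
                continue
            seen.add(key)
            if r.relevant:                        # step 3: abstract-based relevance
                kept.append(r)
    return kept


# Hypothetical toy data for two searches.
searches = [
    [Result("Team Stability in Agile", True, True),
     Result("A vendor white paper", False, True)],
    [Result("team stability  in agile", True, True),
     Result("Unrelated study", True, False)],
]
kept = screen(searches)
print([r.title for r in kept])   # ['Team Stability in Agile']
print(NUM_SEARCHES * TOP_N)      # 210 non-unique results at most
```

The final line reproduces the upper bound of 210 non-unique results that follows from 21 searches with 10 results each.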
All sources preserved in the final step of this process are presented and discussed in Section 3.1, with the number of papers per search string presented in Table 3.

2.1.2 | Gray software engineering literature
To provide an overview of the discussion within the software engineering community (but outside of academic circles), a similar protocol was formulated. Due to the more anecdotal and/or opinion-based nature of gray literature (see Section 3.2), the guiding question was phrased differently from that of the previous search protocol: "How is the distinction between stable and dynamic teams in software engineering in general, and in Agile methodologies in particular, presented and discussed in gray literature?" First, Google was searched using the same search strings as for Google Scholar (see Section 2.1.1), whereupon the top 10 results for each string were manually reviewed, preserving non-academic sources deemed relevant to the question (that is, discussing the consequences of dynamic versus stable teams in software engineering). These were then aggregated, duplicates were removed, and the remaining sources were analyzed in depth (see Section 3.2).

2.1.3 | Literature outside of software engineering
Rather than attempting a comprehensive literature review of a very extensive field, the study of literature outside of the software engineering community took as its starting point the in-depth analysis of the current state of the art by the National Research Council [40], according to the following steps:
1. Study the National Research Council's Overview of the Research on Team Effectiveness.
2. Study selected primary sources cited in the overview, as deemed relevant to the research question.

2.2 | Practitioner interviews
The study of related work was followed by interviews with practitioners, designed to complement the views and findings expressed in literature and to give engineers with experience from stable and/or dynamic teams in large-scale contexts the opportunity to share their perspectives on these views.
To this end, a semi-structured interview protocol was created, with questions designed to be behavior/experience oriented, according to the guidelines proposed by Hove and Anda [41] (see Table 1). Questions IQ 1-6 were phrased to capture experiences of both types of teams and allow the interviewees to reflect on what worked well, and what worked less well with each setup, while emphasizing the large-scale context of the study. IQ 7 was based on the reasoning in Section 1, and sought to capture the practitioners' views on whether these three team properties were indeed feasible, while IQ 8-9 were designed in recognition of team performance as a "multi-faceted construct" [42] and of teaming as a highly subjective experience, seeking to complement the varying claims with regard to both performance and emotional well-being found in literature with the personal experiences of the interviewed practitioners.
TABLE 1 Practitioner interview questions (columns: Interview question; Focus).

A total of 20 interviews were planned (a number chosen to be in excess of the 10-15 after which the number of uncovered concepts tends to drop sharply [43], with some margin for mortality), of which 19 were successfully concluded. The interviewees were purposively sampled for qualitative data appropriateness and selected as good informants [44], representing a wide range of roles (9 developers/testers, 2 customer support engineers, 3 senior engineers/architects, 2 line managers, 2 project managers, and 1 team leader*), with the population being equally split across two independent companies (in excess of 10,000 employees each) pursuing large-scale software development in separate industry segments (defense and connectivity services, respectively). The transcribed responses were read back to the interviewees during the interviews to ensure correct interpretation by the researcher.

2.3 | Analysis and discussion
To generate insights from the interviews, the transcripts were thematically coded and mapped according to the thematic coding analysis guidelines presented by Robson [44]:
• Familiarizing with the data: The transcripts were read and re-read, with initial ideas noted down.
• Generating initial codes: Particular statements of interest across the entire data set were identified and systematically coded.
• Identifying themes: The codes were grouped into candidate themes, which were then checked against codes and other themes to ensure accuracy and avoid overlap. During this process the themes and codes were iteratively revised and refined.
• Integration and interpretation: The data was structured in a thematic map to support subsequent analysis, and the quality of the map was re-evaluated.
This thematic coding analysis was conducted iteratively to increase the quality of the analysis, resulting in a thematic map of statements and reflections made by the interviewees from the two companies. Even so, there is always the threat of researcher bias affecting the analysis, particularly in single researcher scenarios, a threat discussed further in Section 7.2. The outcome of the analysis is presented along with an overview of the interview results in Section 4, and then discussed in relation to findings from literature in Section 5.

2.4 | Validation workshops
In the interest of generalizability, the conclusions from the analysis of literature and interviews were presented to senior engineers from two additional companies, pursuing large-scale development in unrelated industry segments (logistics solutions and visual surveillance, respectively). This took place in a workshop setting, with one workshop of between 6 and 10 participants per company (following the guidelines on focus group size by Morgan [45]). The workshop participants were purposively sampled by the respective company contact persons, without involvement from the researcher, for experience from and interest in teaming and organizational design as well as diversity in roles. In these workshops, the study findings as outlined in Section 5 were explained by the researcher, whereupon the participants were asked to reflect on a set of questions (see Table 2). Meanwhile, the researcher took notes on a shared screen, allowing all participants to verify their correctness in real time while avoiding interruption of the interaction between the participants (as per Kitzinger's guidelines [46]).
Each of the workshops lasted 3 h, with approximately 2 h dedicated to presenting and informally discussing the study and its conclusions, and 1 h spent on the questions in Table 2. The result from the validation workshops is presented in Section 6.

3 | STUDY OF RELATED WORK
As noted in Section 2.1, it is important to consider not only software engineering literature on teaming, but teaming more broadly. Additionally, software engineering practice is largely influenced by non-academic sources. Consequently, this section discusses results from the study of academic sources within software engineering, gray software engineering sources, and teaming literature in general (following the same structure as laid out in Section 2.1).

TABLE 2 Workshop validation questions.
WQ 1 | Based on your experience, how relevant do you find the question of dynamic versus stable teams in software engineering? | Relevancy
WQ 2 | Based on your experience, how accurate do you find the study conclusions to be? | Accuracy
WQ 3 | Based on your experience, how generalizable do you believe the study conclusions to be? | Generalizability
WQ 4 | How useful do you find the study findings to be in guiding teaming practices in your organization? | Applicability

3.1 | Software engineering literature
The review protocol described in Section 2.1.1 yielded a total of 26 unique papers, which were analyzed in depth. Table 3 displays the yield per search string (each cell representing a unique combination of phrases, e.g., "team longevity" AND "agile"). Note that some papers occurred in multiple result sets, making the sum of all cells larger than the final set of analyzed papers.
An overview of the overarching themes identified in the review, along with the sources addressing them, is provided in Table 4 and described in more detail below.
Generally speaking, there is a scarcity of studies on the correlation between longevity and other team dimensions [16], and even though stability over time may allow teams to improve social integration (e.g., familiarity [17], coordination [47], and understanding of team members' expertise [48,49]), "there remains a paucity of studies investigating [the impact on] team outcomes" [17]. That being said, the problem of team members potentially becoming "idle" if their skills do not match the team's needs at the moment [52] and the usefulness of structures supporting dynamic teaming within the enterprise have long been recognized [53].
The studies that do investigate this impact are inconclusive [61]. Some find positive effects associated with stability (e.g., perceived customer value [62]) while noting that further studies are needed. Rejab et al. [49] note that team stability means that competence is not lost due to attrition, but the other side of that argument is that a stable team misses out on the potential infusion of competence from new members. Meanwhile, others point to potentially negative effects of team longevity (e.g., reduced technical performance and suppression of potentially disruptive ideas [63]) and recommend "some rotating to shake up team stability" in order to mitigate its negative effects [64]. On a similar note, Prikladnicki et al. [50] find in a large case study that "the best teams are often those whose members have never worked together." As Ancona and Caldwell reflect, this is not necessarily in conflict with other findings showing that team stability can have a positive impact on social integration in the team, since performance is a multi-faceted construct influenced directly and indirectly by numerous factors [42]. Following this line of reasoning, it is not surprising to find sources suggesting both positive and negative consequences of team stability.
TABLE 4 Overarching themes identified in software engineering literature.
Theme | Sources
Social and/or psychological effects of team stability | Chiocchio et al. [17], Ji and Yan [47], Rico et al. [48], Rejab et al. [49], Prikladnicki et al. [50], Britto et al. [51]
Supporting structures and task matching | Lee et al. [52], Taylor [53], Grass et al. [54], Dusenberry et al. [55]
Team coordination, formation and onboarding | Berntzen et al. [56], Britto et al. [51], Ancona and Caldwell [42], Prikladnicki et al. [50], Gregory et al. [57], Garnier et al. [58], Gregory et al. [59], Buchan et al. [60]
Impact of stability on team outcomes | Zhou et al. [16], Chiocchio et al. [17], Çağlayan et al. [61], Cavalcante et al. [62], Rejab et al. [49], Fang [63], Dayan and Benedetto [64], Prikladnicki et al. [50], Ancona and Caldwell [42], Crowder and Friess [65], Sosa and Danilovic [66]
Impact of composition and roles on team outcomes | Gorla and Lam [67], Jiang and Klein [68]

A notable exception to the rather cautious statements in most sources is Crowder and Friess [65], who state that "teams with stable memberships are vital" and that "if the team has to integrate new members, efficiency will always suffer," albeit without empirical support for these claims. Taking a less uncompromising view, Berntzen et al. [56] note that large-scale Agile projects will always imply coordination challenges and that "temporary team arrangements" are one of several coping mechanisms to handle this fact. This is in line with the finding that one of the most valuable contributions of team members is their "being part of clusters of networks, potentially creating more cohesion among individuals who might subsequently collaborate on projects" [50].
Further supporting the view that stable teams may be desirable but impractical, Britto et al. [51] speculate that adding and removing team members may hinder "the cultivation of psychological safety" within teams, but find in a large case study that management is unable to keep teams intact in perpetuity: particularly while ramping up and forming new teams, without experienced members these teams would be severely challenged, making the "forking" of experienced teams the preferred method of competence seeding. Even though this will inevitably break up already established teams, the viability of such an approach is supported by the findings that tenure heterogeneity within the team correlates with clearer goals and priorities [42], that "it's best to have a mix of members who have and haven't worked together" [50], and that task-oriented interaction between people is a highly creative process which can be augmented in dynamic team setups [66].
Other sources regard members joining and leaving teams as a natural part of self-organization, allowing the team to adapt as conditions change [69], and identify team context, support, and leadership [54] as factors with considerably greater impact on outcomes than team longevity [55]. In studying challenges to team onboarding, Gregory et al. [57] find that the majority of challenges are related to onboarding developers, particularly junior ones, who are unfamiliar with Agile approaches and need to adjust to the mindset of Agile principles, rather than any disturbance of the team dynamics as such (in a similar vein, adding new members who lack the appropriate skills will not improve team performance [58]).
They also identify multiple activities and principles that support onboarding of newcomers to a team [59], which suggests that teaming should be considered an organizational capability in its own right and that it is a process that may be either fast or slow depending on how actively it is supported. This view is further strengthened by the findings that a variety of onboarding techniques are used in practice but vary in effectiveness [60], that the right composition (not just in terms of skills, but of matching personalities) is important to team performance [67], and that the most prominent risks to outcomes stem from lack of expertise and unclear roles in the team [68].
In summary, it would appear that while team longevity clearly has an effect on multiple team characteristics and ultimately on team outcomes, studies of these effects are seemingly contradictory and suggest that their direction can be both positive and negative. There is also reason to believe that these effects are overshadowed by those caused by other factors, such as context, organizational support, and leadership. In other words, team longevity is not the deciding factor it is sometimes portrayed as within, as Philippe Kruchten dubs it, "the Agile memeplex" [1]. Furthermore, this implies that turning groups of people into high-performing teams is a function of organizational capability (a capability that can be cultivated and improved) more than it is a function of time.

3.2 | Gray software engineering literature
Although the academic literature on stable and dynamic teams is limited within the software engineering community, the question of long-lived teams is also debated in blogs and articles. While both sides are argued, the sources claiming the benefits, if not the necessity, of stable teams are in the majority. Following the protocol outlined in Section 2.1.2 yielded a total of 22 unique sources. Table 5 displays the number of relevant results per combination of terms (e.g., "team longevity" AND "agile").
Stable or long-lived teams are recommended because acquiring new skills as necessary is claimed to be a faster process than becoming a high-performing team with a sense of psychological safety [70] (an established important factor in team productivity) and because effective processes and know-how emerge over time in stable teams [71]. Some take the stance that when following Agile methodology, teams are by definition stable [24], because their members will not move "just because there's a demand in a different area of the business" [72], and it is argued that they are more predictable [73] and produce higher software quality [74], that their members feel happy [75] and secure [76] and are motivated to streamline their work [77], and that they are better at staying focused on the product vision [78]. Other proposed benefits of stable teams include less task switching due to working on one task at a time [70], increased understanding of one's industry and context over time [79,80], and increased understanding of the customer over time [79]. Similarly, handing off a developed solution to a maintenance team is identified as an anti-pattern [81].

TABLE 5 Number of relevant gray literature results per search query.

It is worth noting, however, that these claims stem from a hidden assumption that switching teams equates to switching product, customer, and/or business contexts. Even though this assumption is prevalent in the studied gray literature,† it is one that does not hold for the type of large-scale development effort that is the focus of this study, where a multitude of teams can work in parallel on the same product and interact with the same customers in the same business context, albeit in different capacities. In other words, one might assume that these claimed benefits are not necessarily related to a stable team per se, but rather to working on a particular product, within a given domain or with a type of customer for an extended period of time, or to focusing on one task at a time, regardless of teaming paradigm.
Focusing on the other side of the coin, others argue against reallocating team members after completing a task because "other currently performing teams" would become unnecessarily large with the new additions, causing them to regress in Tuckman's model of team development (discussed in Section 3.3), 82 with Tuckman's stages viewed as "inevitable" in any group.
Meanwhile, others question whether "the benefit of team longevity [might] be overrated," 83 point out that there is a "potential for groupthink" in stable teams, 77 and note that people might actually want to switch teams "for interpersonal reasons." 77 Proponents of dynamic teams make the case that there should never be more people assigned to a task than strictly necessary: "if a team is comprised of more people than is necessary to complete the unit of work, then the team is creating waste simply by existing" 29 and that temporary team members are particularly useful for utilizing specific expertise. 84 Furthermore, it is argued that team productivity is less about longevity than it is about creating the circumstances in which the team can excel, such as autonomy, clear goals, infrastructure, and direction. 28
A striking feature of the sources arguing in favor of dynamic teaming practices is that they are all relatively recent: Of the 22 studied gray literature sources, 11 were published in 2020 or later. Of these, 7 took a positive view of dynamic teams. Meanwhile, of the older sources published in 2019 or earlier, not a single one argues in favor of dynamic teams, strongly suggesting a shift in attitudes within the software engineering community. It is worth noting that this apparent shift in attitudes coincides with the COVID-19 pandemic and the paradigm of remote work that followed in its wake, although it is unclear whether there is any causal relationship.
Among these sources arguing for the benefits of dynamic teams, one reports very positive experiences in terms of employee satisfaction and cross-pollination of ideas from forming small, temporary "mission teams" out of a larger talent pool, 30 while another notes a trend towards teams that "develop, change, and disband far faster and more fluidly than formal group reporting structures" and discusses the enablement of such teams. 85 A "cultural shift [towards] teams that do not exist on an organizational chart, but rather are naturally formed […] to get work done" is claimed, 86 while pros and cons of long-lived teams and dynamic teams are actively compared and debated in discussion forums. 87,88
It is worth pointing out that there is a high degree of disagreement on terminology between sources. To exemplify, the term "project team" was included in the search, as several sources point to such teams as the antithesis of stable Agile teams, but characterizations of project teams diverge, and some sources describe "agile project teams" similarly to stable teams. 89 That being said, searches for "project team" returned few results relevant to the topic, with most of the top results discussing typical software engineering roles in very general terms. Illustrating the differing interpretations of the very concept of Agile, many sources stress the importance of stable teams to excel in an Agile context, 90 while others reason that agility instead implies responding to a "volatile business context [with an] increasing prevalence of temporary teams" 91 and that with the right leadership and structure, a dynamic team can work well. 92 In a similar vein, the argument is made that being agile is to adapt to changes, and that may require reorganizing teams, so long as the people involved have a say in it. 87
To summarize, the majority of sources claiming the importance of stable, long-lived teams do so based on two main perspectives. The first is that of psychological and collaborative effects, where stable teams allow members to become familiar with one another and to develop trust and psychological safety. The other perspective is that of domain expertise: Switching teams is understood as also switching product or industry segment, inhibiting the build-up of competence over time. This latter line of reasoning is based on the unspoken assumption of a relatively small-scale context where only one or a small number of teams work on any given product and where joining a new team also requires familiarization with a new code base, new technology, or possibly a new customer segment. That is not the context considered in this paper, however. On the contrary, this paper concerns itself with large-scale development with numerous teams working in parallel within the same system domain, allowing individuals to work on the same technology and code base, serving the same customers, regardless of which team they are part of.
It should also be noted that the sources studied in this review are no exception to the fact that gray literature is often opinion based 36 : Most of the claims made are supported by anecdotal evidence, if supported at all.

| Literature outside of software engineering
This section provides a brief overview of the related work on team performance and longevity outside of the software engineering field. An important caveat to keep in mind is that these findings may be subject to contextual factors that differ between software engineering and other fields.
The work of Tuckman, 93 showing how teams tend to progress through distinct stages, has been highly influential and has formed the basis of the widespread notion of teams "forming, storming, norming and performing" often referred to in an Agile context 94 in support of long-lived teams. Proactive leadership and context-setting have been shown to play a critical role in facilitating this process, however, 95 suggesting that the journey from forming to performing is less about spending some fixed duration of time together as a team than it is about conducive team composition, leadership, and context. That being said, as Rickards and Moger 96 point out, the forming to performing model of team development fails to account for important aspects of team performance (particularly in a creative research and development context), such as the team mechanisms that lead to failure or to outstanding performance, respectively.
A substantial body of research has been produced since the work of Tuckman (too much to do justice to in this paper), with many studies presenting partially conflicting results. A summary of the research literature on team effectiveness by the National Research Council (NRC) 40 (studying the problem from a team science point of view) highlights three salient factors in the literature: team composition, team professional development ("knowledge-building activities [which] enhance collaborative problem solving and decision making"), and team leadership. This overview also highlights the factor of psychological safety, sometimes referred to in support of stable teams, but finds that "the research suggests that appropriate team leadership is a promising way to promote psychological safety," rather than longevity. The importance of leadership (providing support and respect) is further emphasized by subsequent work, 97 identifying hierarchy, leadership, and type of work as potentially important factors for psychological safety. 98 Similarly, in the NRC overview, team cohesion is not found to be primarily related to the stability of a team over time: In fact, "although team cohesion has been studied for more than 60 years, very little of the research has focused on antecedents to its development or interventions to foster it," although team composition and leadership, 40 as well as the skills of the team members, 99 appear to be important factors. Indeed, team performance has even been found to correspond negatively with longevity, as communication and monitoring of the team's environment begin to decline after a few years. 100
Amy Edmondson argues that teaming, 101 rather than teams, is increasingly becoming the critical organizational characteristic (particularly in situations that are "complex and uncertain") and that "to excel in a complex and uncertain business environment, people need to work together in new and unpredictable ways." 102 At the same time, team longevity can lead to "familiarity and understanding," to be weighed against the opportunity to apply "the highest level of expertise to each and every project." 103

| Summary of related work
In summary, there is a scarcity of studies on the effect of stability on team outcomes, and the effects reported are both positive and negative.
That being said, stability has been shown to improve psychological safety and team cohesion, with the caveat that it is unclear how significant this effect is in comparison with other identified factors such as leadership, skills, and composition. Meanwhile, gray literature sources have historically been strongly pro-stability, but in recent years, there is a tendency to highlight the benefits of dynamic approaches to teaming. Such benefits are centered on team outcomes (e.g., cross-pollination, innovation, and creativity), while benefits reported in support of stable teams are largely social (e.g., psychological safety, familiarity, and cohesion).

| INTERVIEW RESULTS AND ANALYSIS
Of the 20 planned interviews, 19 were carried out, with one interviewee withdrawing for personal reasons unrelated to the study. As the number of interviewees had been chosen to allow for some mortality (see Section 2.2), it was decided not to risk delays by seeking a replacement.
All of the interviewees were highly experienced, with the majority having in excess of 10 years of experience in industrial software development. They covered a broad spectrum of roles, including customer support engineer, team leader, project manager, line manager, tester, and systems analyst, with the most common role being that of software developer. Of the 19 interviewees, 10 claimed to have worked mostly in stable teams, while 6 had mostly worked in dynamic teams and 3 in a roughly equal mix of both. These stable teams were all based on the ownership of a specific system component, where they were provided requirements on that component to implement, verify, and integrate into the larger system. Given the large-scale and long-lived nature of the studied cases, the majority of these teams had existed for at least a year and often much longer than that.
The focus of the dynamic teams was more varied. In one company, dynamic teams were almost exclusively used in a reactive sense, to address some unplanned incident. Such teams were generally formed for one of two purposes: either to inject extra resources where the project was falling behind or to solve integration difficulties resulting from insufficient coordination and communication between the stable component teams (difficulties described as being the rule, rather than the exception). In the other company, dynamic teams were mainly used to take long-term ownership of specific non-functional system characteristics (e.g., system availability), with members drawn from and released back into a competence pool of engineers with experience from all relevant system components. ‡ Several of the interviewees described examples of such teams where some members would stay with the team for a longer duration, or even permanently, thereby providing a stable core. Such members might serve in a product owner role, providing the team with long-term direction, or provide expertise within a critical discipline. Meanwhile, other members would rotate in and out of these teams at a faster pace, providing expertise within the product domains that needed to be modified at any given time.

| Overview of themes
The interviewees were invited to spontaneously reflect on experienced advantages and disadvantages of dynamic and stable team setups, without suggestions from the interviewer (see Section 2.2). This yielded a total of 24 subthemes under the larger themes of advantages and weaknesses. To focus on the most salient results, any subtheme spontaneously identified by fewer than 30% of the practitioners was culled, resulting in a smaller set of 9 subthemes. These are labeled A-J and color-coded as per larger theme in Figure 2 and described in greater detail below.
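The culling rule can be illustrated with a minimal sketch. Only the 30% threshold and the 19 interviewees come from the study; the function names and the per-subtheme counts below are hypothetical and purely illustrative. With 19 interviewees, 30% corresponds to 5.7 people, so a subtheme must be raised by at least 6 interviewees to be retained.

```python
from typing import Dict


def min_mentions(n_interviewees: int, threshold_percent: int) -> int:
    """Smallest whole number of interviewees that reaches the threshold share.

    Integer arithmetic (ceiling division) avoids floating-point rounding issues.
    """
    return (n_interviewees * threshold_percent + 99) // 100


def salient_subthemes(
    mentions: Dict[str, int], n_interviewees: int, threshold_percent: int = 30
) -> Dict[str, int]:
    """Keep only subthemes raised by at least the threshold share of interviewees."""
    return {
        name: count
        for name, count in mentions.items()
        if count * 100 >= threshold_percent * n_interviewees
    }


# Hypothetical counts, NOT the study's data: number of interviewees per subtheme.
example_mentions = {"familiarity": 9, "stagnation": 12, "rare": 3, "borderline": 5}

kept = salient_subthemes(example_mentions, n_interviewees=19)
print(min_mentions(19, 30))  # minimum number of interviewees needed
print(sorted(kept))          # subthemes that survive the cull
```

In this illustration, a subtheme raised by 5 of 19 interviewees (about 26%) is dropped, while one raised by 6 (about 32%) would be kept.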
Even though all interviewees reflected on both advantages and weaknesses of either teaming approach, the number of statements per theme makes it clear that they generally viewed dynamic teams more positively than stable teams: As visible in the figure, 7 of the 9 most frequently identified subthemes were related to either weaknesses of stable teams or advantages of dynamic teams. Below, their views on each of these most salient subthemes are presented in the context of the larger theme it is included in. This is followed by an overview of the factors found important to team performance and rewarding teaming experiences, respectively, by the interviewees.

| Advantages of stable teams
The one salient advantage of stable teams identified by the interviewees was familiarity with one's team members (subtheme A in Figure 2). It was explained how in a stable team everybody's individual skills as well as communication style are known, and team members can find ways of interacting that work well for them. As one practitioner put it, "I like how you can gel and get to know each other, how you function at work, what others know and don't know. You learn to trust each other; I find that comfortable." This notion of being comfortable and, as others put it, safe in a stable team was a recurring theme in the interviews, with practitioners explaining how "you fall into a pattern," "you have a safe foundation to stand on," and "you know [which team mate] to ask" when you need help. Some noted that this is highly dependent on one's personality, and this was indeed visible among the interviewees themselves: To some, it is relaxing to "not have to get to know new people," whereas others find the same situation stifling.
FIGURE 2 The subthemes addressed by at least 30% (6 or more) of interviewees.

| Weaknesses of stable teams
The interviewees identified four main weaknesses of stable teams: stagnation; cross-team collaboration challenges; productivity and know-how; inflexibility and inability to adapt to changing needs (subthemes C, F, G, and H, respectively, in Figure 2).
Related to the double-edged nature of not having to work with new people (see Section 4.1.1), the most commonly identified weakness of stable teams is that they tend to stagnate over time. Interviewees related how stable teams "don't manage to develop competence as fast [as a dynamic team]," how members "get stuck in a box where you do the same thing over and over again" and that "you become closed to new ideas and ways of thinking." Such stagnation is problematic, it was explained, because "you need to be both innovative and creative to achieve complex problem solving."
For nearly all of the interviewees, the stable teams they had worked with had a clearly delegated ownership of a specific part of the larger system (e.g., a component). One of the major subthemes in the interviews, however, was how such teams had trouble collaborating: Actual customer outcomes and use cases rarely impact only a single component, causing the need for multiple teams (often a dozen teams or more) to not only collaborate but also synchronize their work. This was described as a cause of local optimization (faults "being ping-ponged between teams responsible for different domains" and "everybody working on their own part" while "constructing their own reality") and inefficiency ("the synchronization between groups is sluggish" and "costly in friction and communication," and "a lot of energy is spent just trying to piece the parts together"). Such behaviors reportedly led to everything from delays to direct customer feedback on the incoherence of the overall system, problems that the engineers from both companies explained were usually resolved by forming temporary dynamic teams of individual members from the many stable teams, who would then be given the holistic responsibility to integrate and deliver the customer outcome: "it has become obvious that we lack coordination between the teams, but [then we] create a new team with individuals from each [stable component] team." It is worth noting that such coordination and integration troubles are partly what continuous integration and delivery practices are promoted as a solution for, but as noted in previous work, 104-106 scaling such practices is challenging. The experiences related by the interviewees in this study all pertain to large-scale development efforts with sophisticated continuous integration pipelines and high degrees of test automation, which nonetheless fail to fully resolve the coordination and integration challenges at a system level.
It was also stated that stable teams can have a negative impact on overall productivity and know-how within the teams. Several interviewees claimed that as a rule, stable teams encourage members to over-specialize in a narrow niche rather than developing generic competencies, effectively becoming over-fitted to a very small domain. This hurts productivity, because in reality, new tasks assigned to the team can be difficult to match to these limited competence profiles: "you get stuck competence-wise," as one interviewee put it, and "it becomes very hard to allocate resources efficiently." This can be thought of as a challenge to fostering "T-shaped engineers," often claimed to be of paramount importance to successful interdisciplinary collaboration and innovation. 107 As one interviewee explained, this generates a disconnect between actual competence needed to get the task done (which is typically possessed by one or a few clearly identified individuals) and the planning and assignment of tasks (which only deals with stable teams as monolithic entities), causing inefficiency and friction in the organization. These observations can be interpreted as the lack of a functional ecocycle in these over-specialized stable teams. Such an ecocycle features a creative destruction phase to allow regeneration and successful adaptation to changing circumstances. These concepts, while long applied to economics, have recently been discussed by Helfand in the context of teaming. 108
Related to this, the fourth salient subtheme of stable team weaknesses was the inflexibility of the teams themselves (as opposed to their members). Interviewees described how shifting demands in the project might require more work in certain components today, but less tomorrow, causing some teams to run out of meaningful work to keep them occupied. This would sometimes cause engineers to "invent" work within their own domain, especially as members of stable teams "could have some issues adapting" to such changes. Exacerbating this problem, line managers might be reluctant to "let their resources go" so that they may contribute where the need was currently greatest.

| Advantages of dynamic teams
Three subthemes of dynamic team advantages were identified by the interviewees: competence and knowledge sharing; flexibility and adaptation to changing needs; personal development, learning, and networking (subthemes B, E, and J, respectively, in Figure 2).
The practitioners reported how dynamic teams work very well for upskilling, being exposed to good practices of others, and establishing a shared company-wide culture and methodology. As one interviewee related, "There was significant added value in bringing in competent people who understood their product but were also able to collaborate [with us in the dynamic team setup]. Suddenly you had formed a competence pool, and when you aggregate that competence to a greater whole you reach a higher level of intelligence."
The ability of an organization based on dynamic teams to adapt to changing needs was also highlighted as a strength, with interviewees noting, for example, that "you can hand-pick the people needed here and now," that "you can request the competence you need at the moment," and, perhaps counterintuitively, that dynamic teams provide stability to the organization: As stable teams were often found unable to cope with changing circumstances (see Section 4.1.2), they would result in repeated temporary arrangements layered on top of the supposedly stable teams. In comparison, teams designed to be dynamic from the outset avoided "starting from scratch" every time this inflexibility ended in the creation of an ad hoc solution.
Personal development, learning, and networking were also highlighted as an advantage of dynamic teams. As one interviewee explained, "in the beginning it was a bit challenging, because all the members were new to me [but] later that was great, because I acquired contacts from all across [the company] and learned new skills." Another interviewee described how this networking benefit "sticks" in that the contacts they acquire are useful for years, long after their formal responsibilities have changed: These contacts will still be knowledgeable of the subject and can also help identify up-to-date contacts.
Finally, it should be pointed out that though dynamic teaming was claimed to be advantageous to personal development, learning, and networking, it is by no means the only method to achieve them (nor are these claims reason to believe that it would by itself be sufficient in every circumstance).

| Weaknesses of dynamic teams
The one salient weakness of dynamic teams identified by the interviewees was the delay in becoming effective (subtheme D in Figure 2). It was noted that "one must expect some ramp-up time" when modifying a team because "you don't know very well what the others can and cannot do" and that it is important to get up to speed quickly, which places stringent demands on the team members. Several interviewees remarked that joining a new team is not necessarily the same as entering a new technology domain: Learning a new domain can take months or even years, whereas becoming productive in a new team (while still working within one's area of expertise) should be a matter of days. Several practitioners remarked that there are significant individual differences, however: "it depends on how you are as a person; how easily you adapt." It was further noted that while long ramp-up times are a problem, it is also a question of practice: "The [ramp-up] time mustn't be too long, because then dynamic teams become very difficult, but if you rotate often people get used to changing teams."

| Team performance and rewarding experiences
The factors contributing to team productivity and rewarding teaming experiences were treated as separate themes. This is partly a consequence of the fact that they were addressed by separate interview questions (see Section 2.2), but primarily due to the fact that the interviewees themselves thought differently about the two: what leads to high performance may or may not be what makes for a rewarding experience.
With regard to team performance, the two dominant subthemes were purpose (highlighted by 11 interviewees) and composition and skills (highlighted by 13 interviewees). The interview statements indicate the importance of being "able to pick the right team," that "the team is the sum of its members," and that "by far the most important factor is the members themselves and their competence." Beyond having the right competence and skills in the team, it is widely agreed that it must be "clear what is expected to be delivered when" with "clear responsibilities," creating "a clear vision and mission" for the team. Several interviewees reflected that this is particularly important in a dynamic team, as a stable team can often find their own work to keep themselves occupied if left without a clear goal, for better or worse (see Section 4.1.2).
An interesting observation is that while most interviewees agreed on these factors of high performance, there was an even split in the population with regard to rewarding experiences. The two largest subthemes were communication and relationships on the one hand and achievement and feedback on the other. Each of these was highlighted by 8 out of 19 interviewees as important to a positive teaming experience, with only one interviewee highlighting both. In other words, nearly half of the interviewee population claimed to be motivated by the achievements of the team, while an equal number felt motivated by the people in the team, with very little overlap between the two groups. This is fully in line with the claims that personal preferences vary, implying that a multi-modal approach to teaming may be valuable.

| DISCUSSION
This section discusses the implications of the findings from both literature and interviews.

| The team as the organizational building block
The importance of teams in achieving any nontrivial task is widely recognized and is not in question, but teams can vary in their structure, purpose, and reason for formation. It is valuable to consider the relationship between the team, its output, and the larger organization when thinking about stable and dynamic teaming. Teams as stable constructs are arguably a logical conclusion of the type of hierarchical line organization that is common in the industry. In this sense, the team serves the purpose of a conveniently sized container to sort individuals into, which is not necessarily in agreement with the team as a group of people coming together to jointly achieve a shared goal. Indeed, the interview results imply that there may be an inherent tension between these two roles of the team: as a persistent organizational building block on the one hand and as a group formed for joint action on the other.
Several interviewees speak of over-specialization and micro-silos as common features of stable teams and note that the members of such teams almost by definition end up working on multiple parallel activities, either individually or in smaller subteams, as the demands of their tasks shift over time. This is in line with previous work on mob programming, 109 where one member of a stable component team reflected that "We call it a team, but it is really a financial construction, a number of employees who share a room." When this team adopted mob programming to better collaborate and jointly solve one task at a time, they came into conflict with the rest of the organization: While the team ostensibly existed to allow its members to work together, the de facto expectation was for them to multi-task and work on multiple conflicting priorities in parallel. The adoption of mob programming forced them to come together and stop multi-tasking as individuals, revealing this underlying disconnect between the organization's ostensible and de facto expectations on its teams.
In this context, it is worth recalling Conway's law, 110 which points out that the structure of the organization influences the resulting system architecture, a finding supported by the interviewees in this study, who stated, for example, that in stable team setups there is a strong tendency to "become protective" and "you're not allowed into each other's [code]," which makes the "code different from team to team." But regardless of how teams are organized (large or small, stable or dynamic), they must always communicate and coordinate their efforts, and although architecture and organization often correlate, it has been found that architecture is not necessarily a good predictor of communication and coordination needs in a software project. 111 The "traditional" technique of modularization has been identified as insufficient 112 ; instead, task-level alignment of coordination and communication is identified as a path to more effective software development, 113,114 further supporting the benefits of dynamic teams spanning architectural boundaries, as opposed to stable teams aligning with them. This is particularly the case in a nondeterministic development effort, where the ultimate system anatomy is not known up-front, but emerges and evolves as development progresses (as, indeed, advocated in Agile methodology). 115 At the same time, recent years have seen an industry trend towards "the reverse Conway maneuver," 116 seeking to form teams that will produce the desired architecture, an approach that in a paradigm of evolutionary system anatomies implies similarly evolutionary team structures.

| The iron triangle of teaming
As noted in Section 1, it is frequently claimed that teams should preferably be small, stable, and whole, but the interviews clearly show that in the experience of the practitioners, such teams are not feasible in large-scale development. Fifteen of the 19 interviewees stated that such teams are simply not realistic for anything but small, isolated tasks: "creating such teams is impossible," "it doesn't work; we have tried," and "you end up requiring superman employees [who] can do everything." The interviewees' experiences show that even if they do not believe in achieving all three of these desirable properties, the teams they have been part of have reliably achieved two of them:
• Teams that have been small and stable, but not whole. These are the archetypal stable component teams. They are highly dependent on team-external support and/or need tasks to be broken down into very small subtasks, violating the principles of team ownership and accountability.
• Teams that have been stable and whole but are very large.Such teams may number in the dozens or hundreds, at which point the label "team" is questionable.
• Teams that have been small and whole, but whose membership roster changes over time.These consist of the smallest possible dedicated group with the right competencies for any given task at any given time.
The defining feature of the tension between the three desirable properties, as described by the interviewees, is this mutual dependency between them, where none of them can be prioritized without impacting one or both of the others. A very similar dynamic, albeit in a different domain, is described by the Iron Triangle (also known as the Triple Constraint) of project management. 117 This model is colloquially understood to stipulate that a project can be on budget, be on time, and achieve its full scope, but not all three, as each of the three constraints interacts with the other two. Similarly, the three constraints of stability, smallness, and wholeness each interact with each other in teaming (see Figure 3), suggesting that an analogous model may be useful to help practitioners understand and manage the inherent conflicts of interest in teaming.
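The two-out-of-three structure described by the interviewees can be made concrete with a small sketch. It simply enumerates which property is given up for each pair that is retained; the function and variable names are illustrative assumptions and not part of the model as presented in this paper.

```python
from itertools import combinations

# The three desirable team properties discussed in the text.
PROPERTIES = ("small", "stable", "whole")


def achievable_profiles():
    """Return each pair of retained properties with the one that is sacrificed."""
    profiles = []
    for kept in combinations(PROPERTIES, 2):
        sacrificed = next(p for p in PROPERTIES if p not in kept)
        profiles.append((kept, sacrificed))
    return profiles


for kept, sacrificed in achievable_profiles():
    print(f"{kept[0]} and {kept[1]}, but not {sacrificed}")
```

The three resulting profiles correspond to the team types listed above: small and stable but not whole (component teams), stable and whole but not small (very large teams), and small and whole but not stable (dynamic teams).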
Such a triple constraint implies the importance of a careful and deliberate choice: Are small, whole, and/or stable teams preferable in any given situation? In contrast, the interviews indicate that such choices are often anything but deliberate. Rather, the small and stable team is the default option, and any alternatives to this default are rarely if ever considered. As one of the interviewees explained, they were happy to participate in the study as even though they found it very important, in 30 years of industry experience, they had never seen this question be taken seriously by their organization.
If achieving all three of the desired properties is indeed unfeasible in a large-scale development effort, as most of the interviewees believe, what to prioritize becomes a pressing concern. Each of the three properties has clear benefits.
• Team smallness has long been found to be important, with small teams being less prone to cause faults while being more efficient at the same time. 118 Similar findings have been reported since the '70s. 119
• The importance of team wholeness is illustrated by the interviewee responses (see Section 4.1), as its absence causes coordination challenges, delays, and lack of accountability at a system level. These claims are also supported by sources in literature (see Section 3).
• Team stability, on the other hand, appears more situational. Its main benefits reported in literature (see Section 3) are psychological and subjective and may vary depending on personal preference. At the same time, the importance of organizational support and leadership in fostering, for example, psychological safety is highlighted in literature, with teaming consequently being identified as a key organizational capability. In other words, team stability may be thought of as one method (but not the only method) to achieve a psychologically safe and comfortable working environment for some (but not all) employees.
In this context, it is worth noting that in contrast to the subjective nature of the arguments in favor of team stability, the arguments in favor of dynamic teams are predominantly related to their performance: their ability to take system-level ownership, to focus on what really matters, to build competence quickly, and to deliver rapidly and efficiently. The interviewees' focus on team composition, organizational support, leadership, and attitude of the team members also supports the claims found in literature that effective teaming is a capability that the organization can cultivate and enhance. Following that line of reasoning, it is plausible that psychological safety-widely agreed to be highly important-need not be a consequence of team-internal circumstances but may be created through proper support and leadership in the larger organization. If that were the case, it is possible that the downsides of sacrificing team stability (i.e., potential insecurity and turbulence) can be mitigated.
On a final note, it is worth considering that in practice, team stability and dynamism are not binary states. Rather, they exist on a continuum, where teams may be more or less stable or dynamic for a variety of reasons. Even a supposedly stable team may undergo changes due to what we might think of as external factors, for example, employee turnover or projects being discontinued. The key difference is the extent to which they change due to internal factors, that is, shifting needs of the tasks at hand. It is also not necessarily the case that the entire team, or all competence profiles, are equally dynamic or stable in nature. From the interview responses, dynamic teams are in multiple cases reported as comprising a core of long-term members and an outer layer of domain experts who come and go. This inner core is made up of individuals with a long-term commitment and expertise (such as product owners, test automation experts, or customer relations experts) independent of which product domain (e.g., component) happens to be impacted at the moment. The individuals who rotate in and out of the teams at a faster pace, on the other hand, do so because of their expertise in those product domains. While the interview data in this study should not be overinterpreted, such a pattern would have significant implications for talent management, as it requires complementary competence profiles to come together in just the right way at just the right time in these dynamic teams-a promising area of further study (see Section 8.1).

| Hygiene, excellence, and efficiency
Combining the findings from literature with the interview statements, there is an argument to be made for stable teams as straightforward and intuitive. This stems from the stable team's relative simplicity in organizing, as opposed to the dynamic team's greater demands on leadership, organizational support, and clarity of purpose. On the other hand, the analysis also suggests that dynamic teams may have greater potential when successful. Additionally, each of the two approaches may be more or less appealing to the individual member, based on their preferences.
FIGURE 3 An iron triangle of teaming.
The numerous claims in literature and interviewee statements (see Sections 3 and 4.1) regarding the comfort and safety of stable teams indicate that one reason why such teams may be less demanding of the organization is that they provide a mechanism to achieve the psychological safety and team cohesion needed for the team to function well. It is equally clear, however, that it is not the only mechanism available to achieve these properties. In other words, to an organization that is unable to provide a sense of safety and cohesion through other means, stable teams represent an option to compensate for this lack. In an organization where safety and cohesion can successfully be achieved through other means, however, there may be less need for team stability.
There are numerous psychological theories on motivation that can help shed light on this need for psychological safety and team cohesion and why it may be important but not sufficient. Perhaps the most well known is Maslow's theory of the hierarchy of needs, 120 through which these needs may be viewed as existing at a relatively low level in the hierarchy, below, for example, achieving mastery and expanding one's network of contacts in the organization. These needs may also be viewed as examples of needs based on affiliation versus achievement and influenced by extrinsic factors versus internal drivers, in line with McClelland's theory of achievement motivation. 121 Another perspective on this phenomenon is that of Herzberg's dual-factor theory of hygiene factors and motivators, 122 where psychological safety and team cohesion are examples of hygiene factors: They are necessary for the team to not be dysfunctional, but they are not sufficient to excel. Meanwhile, the properties associated with successful dynamic teams (e.g., greater opportunity to take ownership and to learn, network, and evolve as a person-aspects of mastery, purpose, and autonomy 123 ) represent motivators: the things that make us not just function, but excel. Combined with the long-standing findings of team smallness as a factor in efficiency (see Section 5.2), this suggests a further evolution of Figure 3, where team stability primarily furthers team hygiene factors, team smallness furthers its efficiency, and team wholeness furthers its ability to excel, as illustrated in Figure 4.
In such a model of teaming, an organization able to satisfy the hygiene factors of safety and cohesion through other means would be free to reduce team stability in the interest of furthering efficiency and excellence through team smallness and wholeness, respectively. Meanwhile, an organization that cannot otherwise satisfy those hygiene factors-either due to inability or the personal preferences of its members-may sacrifice smallness and/or wholeness of the team in order to provide them via team stability.

| The consequences of stable and dynamic teaming
Following the above discussion and returning to the research question informing this study (see Section 2), the consequences of stable versus dynamic approaches to teaming are found to be different sets of trade-offs between the three team properties of stability, wholeness, and smallness.
• A dynamic approach to teaming can afford wholeness and smallness even in a large-scale context but comes at the expense of the improved psychological safety and team cohesion afforded by team stability.
• A stable team approach is conversely able to build psychological safety and team cohesion over time, consequently requiring less support from the larger organization, but will instead struggle with achieving wholeness and/or smallness.
FIGURE 4 An evolved iron triangle of teaming.
The desirability of these three team properties may vary from case to case and is also perceived differently by individual team members. This in turn calls for deliberate consideration of teaming approach-as opposed to the unmindful choices sometimes made by organizations, as reported by the interviewees. It is in this deliberate consideration that a conceptual model illustrating the trade-offs can be helpful, by visualizing and concretizing relationships that may otherwise go unexamined (as confirmed by the practitioners participating in the validation workshops, see Section 6). This is important, as an understanding of team characteristics and how they interact is critical in assembling teams that are fit for purpose in any given situation.

| VALIDATION WORKSHOPS
Two validation workshops were carried out at two companies representing industry segments (logistics and visual surveillance, respectively) other than those of the two companies previously involved in the study (see Section 2.4). These workshops were attended by 6 and 8 engineers, respectively, with between 7 and 30 years of software industry experience (median 17 years) representing a mix of roles, including developer, tester, architect, project manager, product owner, line manager, and subject matter expert. The workshops were concluded with the researcher asking the participants the questions listed in Table 2.
• WQ 1 (Relevancy): In both workshops, the participants unanimously agreed that the question of stable and dynamic teaming is relevant. The participants from one company noted that "it was interesting to discuss it in terms of two alternatives where one isn't necessarily superior to the other […] these ideas are new to us," while those from the second company agreed that "this isn't talked about [but] interesting," concluding that "it's good and important that this study is conducted."
• WQ 2 (Accuracy): All participants in the workshops agreed that they did not find anything inaccurate, with the caveat that as personal backgrounds differ, not every participant recognized every study observation from personal experience. Some participants pointed out conclusions they found particularly accurate, for example, stating that the conclusions from the study made them realize that the high-performing teams they had experienced had been comprised of a stable "inner core" and a more dynamic "outer layer."
• WQ 3 (Generalizability): All participants in both workshops agreed that the conclusions are generally applicable, stating, for example, that "this is a fairly common pattern," "in all companies I worked in it's been the same structure, the same kinds of problems [with] a lot of commonalities from company to company," and "we see the same problem in other industries, too, [because] the question of teaming is universal."
• WQ 4 (Usefulness): In one workshop, all participants agreed that the conclusions were useful. To exemplify, one line manager explained how the study conclusions opened up "a completely new dimension to keep in mind" and that they were "applicable straight away." In the other workshop, all participants agreed that the conclusions were potentially very useful if they were turned into a training program offering more concrete guidance-a training program "the entire management layer" of the company would benefit from. That being said, the participants also agreed that the workshop itself was valuable, in that it served to create awareness of a previously unrecognized question.
In both workshops, it was pointed out that implementing changes based on the study conclusions is challenging, particularly because it cannot be done in too small a context: "a single manager cannot do it on their own […] it needs to be a large part of the company."

| THREATS TO VALIDITY AND RELIABILITY
This section presents and discusses potential threats to the validity and reliability of the findings.

| Construct validity
Important threats to construct validity in this type of study, which relies in part on interview data and its interpretation, are ones related to researcher and participant expectations and attitudes. For instance, interviewees may try to guess the researcher's hypothesis and adjust their answers (consciously or not) to fit that hypothesis. Similarly, the researcher may provide cues (again, consciously or not) that influence interviewees to provide answers in line with the researcher's expectations. To exemplify, the mostly positive view of dynamic teams (and correspondingly negative view of stable teams) that emerged during the interviews (see Section 4.1) could conceivably be an artifact of interviewer bias or the interview protocol. To protect against this, the interview protocol was designed to explicitly encourage the interviewees to think of both positive and negative aspects of either teaming approach. It is reasonable to believe this had an equalizing effect on the number of statements per theme; in other words, more open questions may well have produced an even stronger disparity in the interview results.
Similarly, researcher bias based on personal experiences as a software engineer may influence the analysis of both literature and interview results. The validation workshops (see Section 6) serve an important role in mitigating this threat, as none of the 14 participants found the conclusions contradictory to their own experiences and observations.

| Internal validity
The author considers two salient threats to the internal validity of the study to be selection and mortality. In a study partly reliant on interview data, appropriate selection of interviewees is crucial to ensure unbiased and accurate results. To this end, the interviewees were not randomly selected from all employees in the studied companies, but purposively sampled to represent a multitude of roles and experiences of both dynamic and stable teams from complementary perspectives (i.e., not just developers, but line managers, project managers, team leaders, etc.). Meanwhile, the threat of mortality concerns the loss of subjects (in this case, interviewees). Indeed, of the 20 planned interviews, only 19 were carried out, as one of the interviewees withdrew from the study. This was due to reasons unrelated to the study, and the interview series was deliberately designed with some margin for mortality, thereby mitigating this threat.
In addition, any search engine chosen in a study of literature comes with potential biases. In the case of Google Scholar, as used in this study, there is a documented language bias in favor of results in English, 124 suggesting that relevant results in other languages may go undetected. For the purposes of this study, this is arguably not a severe threat, considering the dominant position of English within the software engineering community. Indeed, the search terms used (see Section 2.1) were all in English. However, it should be noted that the possibility of variations along linguistic (and correlating cultural) lines is not covered in this study. It is also possible that the limit of 10 results per search query (with a total of 210 results each for academic literature and gray literature) excluded relevant sources from the review. That being said, the threat of individual excluded sources to the most salient conclusion of the study of related work (that findings as to the impact of team longevity on team outcomes are inconclusive and contradictory, see Section 3.1) is limited, particularly as this finding is explicitly shared by related work included in the result set.
Finally, all research is subject to threat from researcher bias-not least qualitative studies and single researcher methods-as all researchers will unavoidably view a research question through the lens of their particular experiences and priors. There are multiple procedures to mitigate such threats, including the review of conclusions by a third party 125 and triangulation, seeking "a wide range of different, independent sources." 126 Consequently, to guard against misinterpretation or misrepresentation of oral testimony, all interviewees were shown the interview transcripts and acknowledged their correctness (see Section 2.2). In the interest of triangulation, data were collected from both written sources and oral testimonies, and from multiple independent communities and companies, covering both researchers and practitioners. Following this, the study and the conclusions were reviewed by a total of 14 engineers from two independent and previously uninvolved companies (see Section 6), further mitigating the risk of researcher bias leading to incorrect or irrelevant results.

| External validity
In the context of this study, external validity is particularly relevant in the sense of whether the findings hold across different software engineering organizations. The interviewed practitioners were drawn from only two companies, which may pose a threat to external validity. These companies were deliberately chosen to represent separate industry segments, however, and were complemented by validation workshops involving companies from an additional two industry segments (see Section 6). The confirmatory results from these workshops in combination with the study of literature, revealing experiences and findings very similar to those reported by the interviewees, suggest that the results are not confined to the studied companies but apply more generally. Whether this applicability extends beyond the domain of software engineering is unclear: it is possible that software engineering has particular features that impact the outcomes of different teaming practices. It is worth noting, however, that workshop participants with professional experience outside of software engineering believed that the findings were universally applicable (see Section 6). In conclusion, the author believes the precise extent of generalizability to be an important topic for further work.

| Reliability
In the interest of reliability, great care has been taken to document the steps of the study of literature (see Section 2.1) so that it may be independently repeated and confirmed. Interview results and their analysis are challenging to perfectly reproduce, particularly as the raw interview data cannot be shared for reasons of confidentiality. That being said, the interview protocol was designed to ensure accurate interpretation by allowing all interviewees to verify the correctness of the researcher's notes (as in the validation workshops). Furthermore, the interview protocol is described in detail to support any effort at reproduction in an independent setting-efforts that would also be valuable from a generalizability point of view (see Section 7.3).

3. Snowball forward to books and papers cited by those primary sources.
4. Study later work by the primary sources' authors to capture any developments in the field subsequent to the National Research Council's overview (published in 2015).

TABLE 3 Number of relevant software engineering papers per search query.