Using goal–question–metric to compare research and practice perspectives on regression testing

Regression testing is challenging because of its complexity and the amount of effort and time it requires, especially in large‐scale environments with continuous integration and delivery. Regression test selection and prioritization techniques have been proposed in the literature to address the regression testing challenges, but adoption rates of these techniques in industry are not encouraging. One of the possible reasons could be the disparity in the regression testing goals in industry and literature. This work compares the research perspective to industry practice on regression testing goals, corresponding information needs, and metrics required to evaluate these goals. We have conducted a literature review of 44 research papers and a survey with 56 testing practitioners. The survey comprises 11 interviews and 45 responses to an online questionnaire. We identified that industry and research accentuate different regression testing goals. For instance, the literature emphasizes increasing the fault detection rates of test suites and early identification of critical faults. In contrast, the practitioners' focus is on test suite maintenance, controlled fault slippage, and awareness of changes. Similarly, the literature suggests maintaining information needs from test case execution histories to evaluate regression testing techniques based on various metrics, whereas, at large, the practitioners do not use the metrics suggested in the literature. To bridge the research and practice gap, based on the literature and survey findings, we have created a goal–question–metric (GQM) model that maps the regression testing goals, associated information needs, and metrics from both perspectives. The GQM model can guide researchers in proposing new techniques closer to industry contexts. Practitioners can benefit from information needs and metrics presented in the literature and can use GQM as a tool to follow their regression testing goals.


| INTRODUCTION
Regression testing is carried out after any change in the system to verify that the change did not impact the unchanged parts of the system. [1][2][3] It is a complex and costly activity, especially for large-scale systems with continuous integration and delivery, and can consume up to 80% of testing and 50% of maintenance cost. 2,[4][5][6] Research proposes test case selection and prioritization to deal with the cost and complexity of regression testing. 3,[7][8][9] Test case selection refers to selecting a subset of test cases from the regression test suite to test the effects of changes. In contrast, test case prioritization (TCP) guides an optimal ordering of test cases that can help achieve the desired goals. 7,8 If the selected suite is large, TCP can be applied as a subsequent process. However, test case selection and prioritization can also be applied independently. The primary goal of regression test case selection and prioritization techniques is to detect faults as early as possible. 3
One of the challenging aspects of software testing is deciding when to stop testing. How much to test is an essential question as it affects the overall budget of the project. Especially in large-scale software development with continuous integration, it is imperative for practitioners to decide how long they should test the software before releasing it. [10][11][12][13] Various authors have defined prediction models for stopping criteria that take into consideration the testing time, effort, cost, reliability, and coverage. [13][14][15] Industry practitioners set regression testing goals and evaluate the achievement of these goals using their experience and product knowledge. From the practitioners' perspective, regression testing goals provide an opportunity to decide to stop running more tests, as the achievement of the defined goals gives them confidence about the attained quality, and they can decide to release the product. 16,17
A goal is an intended outcome of a process that a practitioner plans to achieve, and it should be realistic and measurable. Therefore, a goal should be associated with the metrics that can be used to evaluate it. Regression testing goals could correspond to the predefined objectives that a practitioner wants to achieve by applying a regression testing process or technique. The achievement of these goals should be assessed using metrics (see, e.g., Eusgeld et al. 18 ). Furthermore, test case selection and prioritization should be based on regression testing goals. These goals may vary from organization to organization based on their priorities. 4
The goal of most existing regression testing techniques is effectiveness (increasing the test suite's rate of fault detection). Some techniques encompass efficiency (i.e., execution time and cost) as a goal. Test coverage is also among the goals of various techniques, as the assumption is that test cases with a higher coverage will detect more defects. 2 Coverage-based techniques aim to cover maximum code with fewer test cases. 19 Regression testing techniques utilize multiple sources of information, including coverage information, requirement information, and test execution history. Furthermore, to evaluate the outcomes, these techniques use metrics including the average percentage of faults detected (APFD) and its variants, coverage-based metrics, and metrics related to execution time. 3,20,21
Various regression test selection and prioritization techniques have been proposed in the literature. 7 However, the adoption rate of these techniques in industry is not encouraging, and only a few techniques have been evaluated in the industry context. This is a clear indication of the gap between research and practice. 4,6,[22][23][24] Among other factors, one important aspect is the disparity in the regression testing goals of practitioners and researchers. 16
There are few studies that have an explicit focus on regression testing goals, especially in an industry context. 16,17,25 This research aims to get a better understanding of the regression testing goals from the literature and practitioners' perspective. We, therefore, have reviewed the literature and conducted a survey with industry practitioners. For the survey, we opted for interviews and an online questionnaire as data collection methods. We incorporated the goal-question-metric (GQM) approach 26 to map regression testing goals with related information needs and metrics.
In an earlier study, 16 we investigated regression testing goals in a more limited scope. We conducted a focus group-based study with practitioners and researchers. The participating practitioners represented large-scale embedded software development companies, and the researchers were actively working on testing research. That study aimed to understand the industry-academia perspective on regression testing goals.
The present study is a continuation of the earlier study and extends it by adding further data and insights concerning regression testing goals, information needs, and metrics. The earlier study used a focus group-based workshop with seven industry and academic participants. In contrast, the present study comprises findings from 44 research papers and the perspectives of 56 industry practitioners (representing nine development domains). Table 1 presents a summary of how our current study extends the earlier study.
The contributions of this study are as follows:
• Identification of some new regression testing goals from the practitioners' perspective.
• Mapping of regression testing goals to information needs and metrics.
• Identification of differences in research and practice concerning regression testing goal preferences and use of information needs and metrics.
• Formulation of a GQM model to present an integrated view of the perspectives from the literature and from practice that can be used as a guide to reduce the industry-academia gap.
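To make the three levels of a GQM model concrete, the following is a minimal sketch of how goals, information needs (phrased as questions), and metrics could be linked in a small data structure. The class names, the `metrics` helper, and the wording of the two questions are assumptions made for illustration; only the goal, the information need, and the APFD/APFDc metrics are taken from the text, and the sketch is not the GQM model developed in this study.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Question:
    """An information need phrased as a question (hypothetical structure)."""
    text: str
    metrics: list[str] = field(default_factory=list)


@dataclass
class Goal:
    """A regression testing goal with the questions that refine it."""
    name: str
    questions: list[Question] = field(default_factory=list)

    def metrics(self) -> list[str]:
        """Return every metric reachable from this goal, once, in first-seen order."""
        seen: dict[str, None] = {}
        for question in self.questions:
            for metric in question.metrics:
                seen.setdefault(metric, None)
        return list(seen)


# Illustrative entries only: the question wording is assumed, not from the study.
goal = Goal(
    name="Increase the test suite's rate of fault detection",
    questions=[
        Question("How early in a run are faults detected?", ["APFD", "APFDc"]),
        Question("What does the test execution and fault detection history show?", ["APFD"]),
    ],
)

print(goal.metrics())
```

Keeping the metrics attached to questions rather than directly to goals preserves the traceability that GQM is meant to provide: a metric is only collected because some information need asks for it.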
The remainder of this paper is organized as follows: Section 2 presents a review of related work. Section 3 describes the methodology. Section 4 presents the results of the literature review and the survey. Section 5 discusses the implications of this study for researchers and practitioners, and Section 6 concludes the paper.

| RELATED WORK
This section discusses related work on regression testing goals, information needs, and metrics. The included studies are recent systematic literature reviews (SLRs), a literature survey, and some empirical studies on regression testing.
We looked at 11 systematic reviews on regression testing published during the last six years (i.e., 2017 to 2022). 3,6,20,21,[27][28][29][30][31][32][33]
Rahmani et al 21 conducted a systematic review of regression TCP techniques proposed from 2017 to 2020. The authors classified the techniques based on TCP approaches (e.g., risk based and history based). The authors also investigated the metrics and sources of information utilized with these techniques. The metrics reported in this study are the variants of APFD, execution time, code coverage, requirement coverage, and severity measure. The information used for these techniques is requirement information (e.g., requirement coverage and requirement dependency). The most-reported goal for the techniques proposed during 2017-2020 is increasing the test suite's rate of fault detection. In another review, 28 the authors classified the regression test prioritization techniques based on the approaches and metrics used. The criteria used in the prioritization techniques are cost, code coverage, and fault detection ability. The goal highlighted by the authors is effectiveness, and to measure the effectiveness, they mentioned the use of precision and recall.
Rehan et al 27  The goals listed in this work are increasing test suite's rate of fault detection, early identification of critical/severe faults, detection of faults related to changes, and coverage. Confidence is stated as the overall goal of regression testing. The authors stated, "the purpose of regression testing is to provide confidence that the newly introduced changes do not obstruct the behaviors of the existing, unchanged part of the software." Information needs mentioned in this study are test execution and fault detection history, code coverage information, and requirement information, whereas the metrics highlighted in this survey are APFD, APFDc, fault severity, code coverage metrics, and metrics related to changes.
Besides considering the systematic reviews of regression testing, we also reviewed some primary studies to see the trends of regression testing concerning the goals, information needs, and metrics. These studies either propose a regression testing technique or present practitioners' perspectives.
Jafrin et al 25 proposed an algorithm to prioritize test cases based on the rate of severity detection associated with dependent faults. In this study, the authors listed the goals for test cases and TCP goals. Prioritization goals listed in this study are increasing test suite's rate of fault detection, increasing coverage, confidence, increasing rate of high-risk fault detection, and revealing the faults related to changes. However, the authors did not explain the sources from where they have identified these goals.
Kwon et al 34 proposed an information retrieval (IR) and coverage-based regression test prioritization technique. Increasing test suite's rate of fault detection was considered the goal of the technique, whereas information sources utilized are code coverage and fault detection rate. The authors suggested using mutation faults in the absence of actual faults. To measure the test suite's rate of fault detection, the authors used the APFD metric.
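Since APFD recurs throughout the reviewed studies, a small sketch of how it is commonly computed may help. It assumes the widely cited definition APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test case that reveals fault i. The function name and the toy fault matrix are hypothetical, and individual studies may use variants of the metric.

```python
def apfd(order, detects):
    """Average percentage of faults detected for one test ordering.

    order   -- list of test case ids in execution order
    detects -- dict mapping a test case id to the set of fault ids it reveals
    Assumes every fault of interest is revealed by at least one test in `order`.
    """
    n = len(order)
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            # Record only the earliest position at which each fault is revealed.
            first_position.setdefault(fault, position)
    m = len(first_position)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test case and one detected fault")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Toy data (assumed): t3 reveals both faults, t1 and t2 one each.
detects = {"t1": {"f1"}, "t2": {"f2"}, "t3": {"f1", "f2"}}

prioritized = apfd(["t3", "t1", "t2"], detects)
unprioritized = apfd(["t1", "t2", "t3"], detects)
print(round(prioritized, 3), round(unprioritized, 3))
```

Running the test case that reveals both faults first yields the higher value, which is the sense in which a prioritization technique "increases the rate of fault detection."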
In a survey, Engström and Runeson 4 stressed the need to define organization-specific regression testing goals. However, the authors did not mention any such goals in the study. White and Robinson 24 performed an industrial study and listed a few goals concerning regression testing. The goals observed in this study are early defect detection based on changes and critical defect detection. The study also presents metrics such as module dependencies, execution cost, time, number of test cases executed, and code changes.
It is reported in various studies [35][36][37] that most of the regression testing techniques presented in the literature use effectiveness (increasing the test suite's rate of fault detection) as a goal. To measure the effectiveness, the authors use the APFD and APFDc metrics. In most cases, authors utilize test execution and fault detection history to evaluate their techniques concerning effectiveness. However, mutation faults can also be utilized to compensate for the absence of actual faults. 34
From the recent systematic reviews representing the regression testing research up to 2022, the survey by Yoo and Harman, 7 and the primary studies presented above, we learned that the common goal of most techniques is increasing the test suite's rate of fault detection. Early identification of critical faults and coverage are also mentioned as goals of some techniques. The most utilized metrics in all studies are APFD, APFDc, and code coverage metrics. Most of the reviewed literature presents the goals of regression testing techniques, and only a few studies considered the regression testing goals a primary concern. Only a couple of studies considered this aspect from an industry perspective. Furthermore, we could not find a precise mapping between the goals and metrics. This fact motivated us to conduct a study to investigate research and industry perspectives on regression testing goals and related aspects. The current study is the continuation of our earlier work, 16,17 and it extends the earlier study 16 by adding literature findings and the perspectives of more practitioners representing more domains (see Table 1).

| METHODOLOGY
The study aims to characterize the regression testing goals from research and practice perspectives and to compare the two perspectives. Table 2 presents the research questions that further elaborate the study's aim. To answer the research questions, we chose to conduct a literature review and a survey. For the literature review, we selected studies in which the authors discuss regression testing goals, information needs, and metrics. We did not opt to conduct an SLR because, along with an in-depth analysis, an SLR covers the breadth of the existing literature relevant to the research questions. 38 An in-depth analysis of regression testing techniques is not the goal of this study; the only aim was to identify the regression testing goals, information needs, and metrics. For selecting relevant literature, we performed systematic searches that helped to include a reasonable number of relevant studies.
We had two alternatives for understanding the practitioners' perspective on regression testing goals: (i) a case study and (ii) a survey. A case study investigates a phenomenon in deeper detail. It is a suitable method for situations where context is important and analysis of the cause-effect relationship is the aim. 39 In contrast, a survey helps to identify the characteristics of a larger population, and it is a suitable method where the aim is to collect the opinions of a large sample. 39 In our case, we were interested in the perception of as many practitioners as possible. At the same time, we were also keen to gain some insight into the regression testing goals and other associated practices. Therefore, we decided to conduct a survey using two data collection methods: interviews and an online questionnaire. Interviews gave us an opportunity to interact directly with the practitioners and understand their perceptions. The online questionnaire helped us to reach a broader population and collect information about regression testing goals from a larger sample.

| Study selection
To answer the first research question (RQ1), we have conducted a literature review of 33 selected papers. Though we did not conduct an SLR, we used systematic searches and followed established methods to extract and present the selected papers' data. However, we do not claim that the searches were exhaustive, and we did not incorporate quality assessments. The reason is that the aim was to get a view of what goals, questions, and metrics exist concerning regression testing.
Search strategy: We followed a snowball search strategy for the selection of studies. 40 A snowball search helps find the relevant studies without retrieving too many irrelevant papers that must be excluded manually in subsequent steps. 41 The first step in a snowball search is to find the start set; backward and forward snowball iterations are then performed on it. 40
Finding the start set: To find a start set for the snowball searches, we used a keyword-based search to identify a basic set of papers, with the following search string: ("regression testing" OR "retesting") AND ("goal" OR "desired effect") AND ("metric" OR "measure" OR "information need"). We applied the search string in IEEE, Scopus, and Inspec and found a total of 175 research papers. We scanned the titles of these 175 papers and selected 62 relevant papers for further processing. We then read the abstracts of the selected 62 papers, and after applying the inclusion/exclusion criteria, we selected 13 research papers to include in the start set for the snowball searches.
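The search string has a regular shape: groups of synonyms joined by OR, with the groups joined by AND. As a minimal sketch, it can be assembled from the three keyword groups as shown below; the function name `build_query` and the `GROUPS` constant are hypothetical, and each database may need its own field syntax wrapped around the generic string.

```python
def build_query(groups):
    """AND together groups of OR-ed, double-quoted search terms."""
    return " AND ".join(
        "(" + " OR ".join(f'"{term}"' for term in group) + ")"
        for group in groups
    )


# Keyword groups as given in the search string used for the start set.
GROUPS = [
    ["regression testing", "retesting"],
    ["goal", "desired effect"],
    ["metric", "measure", "information need"],
]

query = build_query(GROUPS)
print(query)
```

Keeping the groups as data makes it easy to report exactly which synonyms were used and to rerun the search with an added term.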

Table 2. Research questions (RQ) and their motivation
RQ1. What are the goals of regression testing discussed in the literature, and what are the corresponding information needs and metrics to evaluate these goals?
Motivation: The objective is to better understand which goals researchers consider while proposing or evaluating regression test selection and prioritization techniques, which information needs they utilize to achieve the goals, and what metrics they have proposed to evaluate the goals. We also investigate whether there is any mapping between the success goals, information needs, and metrics (e.g., what metrics could be used to evaluate a specific success goal?).
RQ2. What are the goals of regression testing defined by the practitioners, and what are the corresponding information needs and metrics to evaluate these goals?
Motivation: The objective is to know whether the practitioners define any goals to determine success in regression testing, and whether they use or define any information needs to achieve the goals and evaluate them with the prescribed metrics.
RQ3. How are the findings from the literature and the survey related?
Motivation: The aim is to create an integrated view of findings on regression testing goals, information needs, and metrics and provide actionable guidelines for practitioners and researchers.
Snowball iterations: By taking the selected 13 articles as start set, we performed snowball iterations (see Table 3). In the backward snowballing, we examined the references of every paper in the start set. For the forward snowballing, we reviewed the studies that were citing any of the papers in the start set. For the identification of citations and searching for papers, we used Google Scholar. In each iteration, the papers were selected based on the inclusion and exclusion criteria mentioned below. In the event of selection, new papers further went through the snowball iterations. In the first iteration, we found 16 related papers, and in the second iteration, we found only four papers. We stopped the process after the second iteration because we could not find new papers related to our topic.
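The iteration procedure described above can be sketched as a loop over a citation graph. This is a simplified illustration under stated assumptions: the function `snowball`, the toy `references` and `cited_by` mappings, and the `is_relevant` predicate (standing in for the inclusion and exclusion criteria) are all hypothetical, and in the study the lookups were performed manually with Google Scholar.

```python
def snowball(start_set, references, cited_by, is_relevant):
    """Run backward and forward snowballing until an iteration adds no papers.

    references -- dict: paper -> papers it cites (backward snowballing)
    cited_by   -- dict: paper -> papers citing it (forward snowballing)
    Returns the selected papers and the number of new papers per iteration.
    """
    selected = set(start_set)
    frontier = set(start_set)
    new_per_iteration = []
    while frontier:
        candidates = set()
        for paper in frontier:
            candidates |= set(references.get(paper, ()))  # backward step
            candidates |= set(cited_by.get(paper, ()))    # forward step
        # Apply the inclusion/exclusion criteria to papers not yet selected.
        new_papers = {p for p in candidates - selected if is_relevant(p)}
        new_per_iteration.append(len(new_papers))
        selected |= new_papers
        frontier = new_papers  # newly selected papers go through the next iteration
    return selected, new_per_iteration


# Toy citation graph (assumed data); paper "X" fails the inclusion criteria.
references = {"A": ["B", "X"], "B": ["C"]}
cited_by = {"A": ["D"], "D": ["E"]}

selected, counts = snowball({"A"}, references, cited_by, lambda p: p != "X")
print(sorted(selected), counts)
```

The loop ends on the first iteration that yields no new relevant papers, which mirrors the stopping rule of halting once no further related papers are found.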
Additional searches: Because our initial searches were focused on the regression testing goals, information needs, and metrics, we did not focus on any development context during the snowball iterations. However, to overcome this limitation, besides the snowball searches, we looked at systematic reviews of regression testing published during the last 6 years (i.e., 2017 to 2022) to see if there are studies published lately considering the context-specific regression testing (goals, information needs, and measures). We searched the Scopus database to find the systematic reviews on regression testing, and we found 11 systematic reviews published during the last 6 years.
Inclusion and exclusion criteria: Table 4 presents the inclusion and exclusion criteria we used for the selection of primary studies. We selected regression testing studies that include goals, information needs, and metrics regardless of the development domain. Other constraints that were applied are the language of the article, publication stage, and availability of the article in full text.

| Data extraction
Before extracting data, we read the selected studies, and after the first round of reading, we started identifying the goals, information needs, and metrics. We marked them with different colors (green for goals, gray for information needs, and yellow for metrics/measures). After finishing the color coding, we assigned appropriate labels (where required), and finally, we extracted data using the data extraction form (see Table 5). Data extraction was performed jointly by the first and second authors. To ensure the correctness and consistency of the data extracted from the literature, the first author reviewed the data extracted by the second author, and the second author reviewed the data extracted by the first author. Issues were discussed and resolved jointly. Finally, the third author did a random check of the extracted data.

| Survey
To answer the second research question (RQ2), we have conducted a survey comprising interviews and an online questionnaire.
Interviews allow an in-depth investigation of a phenomenon, while a questionnaire provides an opportunity to broaden the scope of findings. 41 We chose to conduct interviews with testing practitioners, as we were interested in understanding the practitioners' perspectives in detail. Later, to learn the perspective of testing practitioners at large, we distributed an online questionnaire among the practitioners of various companies. Along with the details of the interview guide and the online questionnaire, the following subsections present the steps carried out to conduct the survey.

| Sample selection
For the surveys, sample selection from the target population is a crucial step. 65 Considering the challenge of selecting a representative sample of all testing practitioners worldwide using probability sampling, we chose nonprobability sampling methods (convenience and snowball sampling).
Nonprobability sampling provides an easy way to select samples using nonrandom sampling techniques, including convenience sampling, quota sampling, or snowball sampling. 66 To ensure the selection of suitable participants for the survey, we set a precondition that the participant must have worked or be currently working in regression testing. We made this characteristic mandatory for the interviews and also embedded this requirement in the online questionnaire. A survey respondent with no experience in regression testing cannot continue with the questionnaire's subsequent sections. However, we did not set any boundaries concerning the years of experience. The reason for not limiting years of experience was that the more answers we gain from people with regression testing experience, the more comprehensive the GQM model would be. Although less experienced people may miss some of the goals in the organization due to lack of experience, they still contribute valuable input to the model by providing a few goals/measures.
Interview participants: For semi-structured interviews, we used the convenience and snowball sampling methods. 65 We started with convenience sampling and contacted practitioners with experience in regression testing using our contact networks. Five participants responded to our first attempt. We started scheduling interviews with these respondents. Later, we opted for snowball sampling and asked the participating respondents to refer us to practitioners experienced in regression testing, who can willingly participate in the study. With these five participants' help, we reached six testers who gave their consent to participate in the study (see Table 7).
Online questionnaire participants: For the online questionnaire, we used the snowball sampling approach. 65 We asked the interview participants to provide us with further contacts of practitioners with expertise in regression testing. We also sent LinkedIn messages to testing practitioners and posted the link to the questionnaire in two testers' groups. From all these sources (i.e., contact snowballing, LinkedIn messages, and testing groups), we received 45 responses to our online questionnaire. The details of the online questionnaire participants are presented in Section 4.2; see also Figures 1 and 2.

| Interview steps
Eleven practitioners of nine companies participated in the interviews. Seven of 11 respondents had testing experience ranging from 10 to 15 years. While two of 11 respondents had 2 years of testing experience, two had testing experience of 1 year. Complete detail of interview participants is presented in Table 7.
Interview guide: We designed the interview guide (see Appendix A) based on the guidelines of Runeson et al. 67 We opted for the open-ended questions in the interview guide that allowed the interviewees to present their views freely. The second author developed the interview guide, and the first author reviewed and revised it. Later, the third and fourth authors reviewed the interview guide and provided their feedback. The comments were discussed among the authors, and necessary changes were made in the interview guide. Finally, to test our interview guide, we conducted pilot interviews with two experts, and based on the feedback from these experts, we finalized the interview guide.
The interview guide is divided into three sections: introduction, background, and regression testing. The introduction section explains the context and purpose of the study. The background section consists of questions to capture the interview respondents' background information, including their current role, testing experience, and current projects/product under test. In the regression testing section, questions were organized to understand practitioners' perspectives on regression testing in general and the success of regression testing in particular. These were followed by questions capturing the practitioners' viewpoint on regression testing goals, information needs, and metrics. At the end, we also added some questions to learn the practitioners' response to the metrics we identified from the literature.
Interview conduct: We conducted semi-structured interviews, mainly containing open-ended questions, except the questions related to the metrics identified from the literature. To avoid researchers' bias, we did not include any question that could lead to a desired answer. For interview questions, please see Appendix A.
Interviews with open-ended questions make it hard to capture the complete responses by taking notes alone, and there is a high chance of missing essential aspects of the discussion. Therefore, besides taking notes, and with the participants' prior consent, we audio-recorded all the interviews to ensure that no information from the participants' responses was lost. Each interview took approximately 30 min.
Analysis: Data collected from the interviews were subject to qualitative analysis, for which we used thematic analysis. In thematic analysis, the data are organized into themes and codes based on the frequency and relevance of the collected data. To carry out the analysis, we followed the five-step process presented by Lacey and Luff. 68
• Transcription: Because all the interviews were audio-recorded, the first step was to transcribe the interviews. The first and second authors transcribed the audio records. In the next step, both transcribers verified each other's transcripts. We also used notes taken during the interviews to complement the transcripts generated from audio recordings.
• Organizing data: The transcribed data were organized in a specific order to make them uncomplicated and easily accessible. At the first stage, we assigned ID numbers to each interview and to the sections of the interview transcripts. We removed from the transcripts any information that could reveal the identity of the respondents or their organizations.
• Familiarization with the data: Because the interviews were conducted by the first and second authors alternately, we repeatedly listened to the recordings and read the transcripts to completely understand the context of the interviews.
• Coding: For the preliminary coding, we used different colors to categorize different themes in the transcripts. The investigation's primary focus was on three themes, which mainly correspond to the research questions (goals, information needs, and metrics). We used green color to highlight the goals, gray color to distinguish information needs, and yellow color to represent the metrics.
• Themes: To define more specific labels, we clustered the definitions based on the similarity of views. For instance, one of the interviewees stated a goal as "The goal is that no fault with priority 1 (high-risk faults) should slip through. We want to make sure that the customer should not find any such fault." Another interviewee stated that "We try to maintain a 100% success rate, we do not want any fault slippage to our customer." Similarly, one interviewee stated that "All test cases in regression test pack should be executed with 100% pass, the goal is that customers should not find any fault." In all these statements, we can see that practitioners do not want any fault to be slipped to the customer.
Therefore, we grouped all these statements into one cluster. After arranging the goals and measures into relevant clusters, we assigned appropriate labels to goals, information needs, and metrics by using the literature findings. For instance, the goals discussed here were assigned the label "No or controlled fault slippage."
Validation: In addition to the above steps of interpreting and analyzing, after assigning labels, we validated our interpretations with the selected interview participants. In the feedback, we did not receive any complaint of misinterpretation or misquotation from any of the respondents. All of them agreed that our interpretations are appropriate and close to their perspective.

| Online questionnaire steps
Characterization of subjects: In response to our invitation to participate in the online questionnaire, we received 45 testing practitioners' responses.
Questionnaire design: To conduct an online survey on regression testing goals, we prepared a questionnaire using Google Forms.* The purpose was to expand the scope of our findings and, to some extent, validate the information collected from the literature and interviews. The questionnaire was designed by following the guidelines provided in Kitchenham  and all are close-ended. In this section, we embedded the questions on information needs and metrics with their respective goals. If a respondent selects a goal, then the possible list of information needs and metrics is displayed. This helped to keep track of the right information/metrics for the right goals. We used a 5-point Likert scale for the regression testing goals, which allowed respondents to disagree with any goal provided in the questionnaire. For information needs and metrics, we used the nominal scale. In addition to a specific list of information needs and metrics, the questionnaire provided free-text space for respondents to include information needs and metrics of their choice. These steps helped avoid researchers' bias.
Questionnaire validation: One method of evaluating a survey instrument is a pilot execution of the survey, whose purpose is to identify possible problems with the questionnaire. 70 To test the validity of our questionnaire, we conducted a pilot survey with two practitioners. They did not raise any significant issue with the questionnaire other than suggesting that we elaborate on the study's purpose. Based on their feedback, we made a few changes to the questionnaire.
Questionnaire conduct: We used Google Forms to distribute the questionnaire to the potential respondents. With the help of interview participants, we contacted 14 practitioners and requested them to forward the questionnaire to the people working in regression testing. We also sent 80 requests using LinkedIn messages, and we posted our questionnaire in two testing groups. As an outcome of these invites, we received 45 responses.
Analysis: The data collected using the questionnaire were subject to quantitative analysis. Our aim was to present summaries of the Likert-scale responses for the goals, together with the information needs and metrics selected or identified by the respondents. We used descriptive statistics for the analysis of the data. 71 The results are presented in the form of summary tables and graphs (see Figures 1-3 and Table 6).

| Threats to validity
This study employed a literature review and a survey as the research methods. There could be potential threats to the validity of the results obtained through both. The following subsections discuss the threats to validity and possible mitigation strategies, following the guidelines provided in Wohlin et al 41 and Runeson and Höst. 67
Construct validity: This aspect of validity could be associated with the choice of treatment for the study and its expected outcomes. In our case, it could be linked to selecting studies for the literature review, selecting survey participants, and creating the GQM model.
For the literature review, we opted for the snowballing technique while selecting the primary studies. 40 We cannot guarantee that the searches were exhaustive, but the consistency of the findings indicates that we retrieved a sufficient number of relevant studies.
While designing the survey instruments, we carefully followed the respective guidelines 67,69 for the design of the interviews and online questionnaire. Further, we conducted pretests in the form of pilot interviews and surveys. Based on the outcomes of the pretests, we augmented our survey instruments. Concerning the survey participants, we used convenience sampling to select the initial participants. Later, we used the snowball sampling method to select further participants. Given the specific focus of the study (experience in regression testing), it was not easy to recruit practitioners. To reduce this threat to validity to some degree, we created our GQM model from three sources (Figure 4): the literature, the interviews with company representatives, and the survey. Even so, we may have missed some goals and measures in this model. Therefore, we do not claim to have developed an all-encompassing model for regression testing goals, information needs, and measures. Rather, we created a baseline that companies can work with and extend based on their context.
Internal validity: Internal validity threats mainly deal with the credibility (data collection and sample selection) of the study, that is, whether the obtained results are valid or not. Internal validity refers to the factors that affect the outcome of the research. We followed well-defined search strategies to find the relevant studies and employed systematic procedures for data extraction and analysis. We used audio recordings for the interviews and selected the interviewees based on their experience and interest in regression testing. Furthermore, after transcribing the interviews and assigning the appropriate labels, we validated our interpretations with the interview participants. For the survey, the questionnaire was updated and revised before distribution. In the questionnaire, we used multiple-choice questions, and to avoid researchers' bias, we provided an option for a free-text response in every question. We provided the interview and questionnaire respondents' background information, which may help generalize the context.

Conclusion validity: The conclusion validity threat deals with the quality of the conclusions drawn from the collected data. We ensured triangulation for all aspects of the data, that is, data collection and interpretation. This study's conclusions are the outcome of data collected from multiple sources (literature review, interviews, and online questionnaire). We employed well-defined methods for data interpretation and analysis. We also verified our interpretations with the selected respondents.

| Literature review
To answer RQ1, we selected 33 research papers. The selected studies are those in which the authors consider regression testing goals or propose or evaluate regression testing techniques based on goals. Several of the selected studies also specify the information needs and metrics that could be used to evaluate the achievement of the goals. Besides the initial searches, we also looked at 11 systematic reviews of regression testing published during the last 6 years (i.e., 2017-2022) to determine whether these reviews lead to additional goals, information needs, and metrics. The detailed review of these studies is presented in Section 2, and their findings are merged into the results. Table 6 presents a mapping of goals and corresponding metrics, along with the information needed to aid the assessment of each goal. We created the mapping using the authors' descriptions in the studies and the following proposition: "To achieve/evaluate goal G, based on information needs IN, use the metric M."
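The proposition used for the mapping can be captured in a small data structure. The following is a minimal Python sketch, not tooling used in the study: the class name `GQMEntry` and the method `as_proposition` are hypothetical, and the example row pairs G1 with the APFD metric and the information needs that the summary of this study associates with fault detection rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GQMEntry:
    """One goal / information-need / metric row (hypothetical type)."""
    goal_id: str
    goal: str
    information_needs: tuple[str, ...]
    metrics: tuple[str, ...]

    def as_proposition(self) -> str:
        # Render the row in the form of the proposition used for the mapping.
        needs = ", ".join(self.information_needs)
        metrics = ", ".join(self.metrics)
        return (f"To achieve/evaluate goal {self.goal_id} ({self.goal}), "
                f"based on information needs [{needs}], "
                f"use the metric(s) [{metrics}].")


# Example row; the pairing follows the study's summary of the literature.
g1 = GQMEntry(
    goal_id="G1",
    goal="Increasing test suite's rate of fault detection",
    information_needs=(
        "requirement information",
        "test case execution/fault detection history",
        "coverage-based information",
    ),
    metrics=("APFD",),
)

print(g1.as_proposition())
```

Keeping each row immutable makes it straightforward to collect literature and survey rows in one list and compare them per goal.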

| Regression testing goals identified from literature
This section presents a brief description of the regression testing goals found in the literature.

G1. Increasing test suite's rate of fault detection: Finding the maximum number of faults early and quickly is the objective of any testing process, and it corresponds to the effectiveness of any testing method/technique. 25,36 The goal is listed in 72% of the included studies.
G2. Early identification of critical faults: Finding highly critical faults early in the testing process is another performance goal for regression testing. It refers to detecting the faults that could have a severe impact on the system under test and that can exist in critical modules. This goal appeared in 30% of the included studies.
G3. Detection of faults related to changes: Early detection of faults introduced by the developers due to changes and bug fixes is another performance goal because the presence of such faults could break the regression testing. 43 Such faults should be detected as early as possible. The goal is listed in 10% of the included studies.
G4. Coverage: Covering maximum code with a small number of test cases is the goal of regression testing techniques. These techniques are referred to as coverage-based techniques, and 25% of the included studies refer to this goal.
G5. No or controlled fault slippage: Fault slippage is a phenomenon where the testing process fails to find a fault in software under test, and the product is delivered to the subsequent phases (e.g., release). This goal is highlighted in only two included studies, and both these studies represent the practitioners' perspective.
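Goal G1 is most often evaluated in the literature with the APFD (average percentage of faults detected) metric. As a minimal sketch, APFD for an ordering of n test cases and m faults is commonly computed as 1 - (TF_1 + ... + TF_m)/(n*m) + 1/(2n), where TF_i is the position of the first test case that reveals fault i. The function name `apfd`, the test names, and the fault data below are hypothetical, and the sketch assumes that every fault is revealed by at least one test case in the ordering.

```python
def apfd(order, faults_by_test, all_faults):
    """APFD for a test ordering.

    order          -- list of test names in execution order
    faults_by_test -- dict: test name -> set of faults that test reveals
    all_faults     -- collection of all faults under consideration
    """
    n, m = len(order), len(all_faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test case and one fault")

    # Record the 1-based position of the first test revealing each fault.
    first_pos = {}
    for pos, test in enumerate(order, start=1):
        for fault in faults_by_test.get(test, ()):
            first_pos.setdefault(fault, pos)

    missing = set(all_faults) - set(first_pos)
    if missing:
        raise ValueError(f"faults never revealed: {sorted(missing)}")

    return 1 - sum(first_pos[f] for f in all_faults) / (n * m) + 1 / (2 * n)


# Hypothetical data: four test cases and three faults.
faults_by_test = {
    "t1": set(),
    "t2": {"f1"},
    "t3": {"f2", "f3"},
    "t4": {"f1"},
}
all_faults = ["f1", "f2", "f3"]

prioritized = ["t3", "t2", "t1", "t4"]
unprioritized = ["t1", "t4", "t2", "t3"]
print(apfd(prioritized, faults_by_test, all_faults))
print(apfd(unprioritized, faults_by_test, all_faults))
```

An ordering that reveals faults earlier yields a higher APFD, which is why the metric is used to compare prioritization techniques.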
TABLE 6 GQM mapping of regression testing goals, information needs, and metrics: Literature
M7. The severity measure helps to identify the test cases that can reveal a higher number of severe faults. A severity value could be assigned to the faults based on their impact on the product. 73 Average severity of faults detected (ASFD) is a metric used to measure the severity of faults. 74

| Survey
The survey findings consist of the interview results and the online questionnaire results. The following subsections present the findings from both means of the survey.

| Interviews
We conducted semi-structured interviews with 11 practitioners from nine different companies. The participating practitioners' testing experience ranges from 1 to 15 years. Concerning development approaches, all the participants reported using Agile/Scrum. Table 7 presents the interview participants.
The primary focus of the investigation was to learn the practitioners' perception of regression testing in general and of regression testing goals and metrics in particular. Besides collecting the background information of the participants, we asked them to tell us how they perform regression testing in their company, the scope of regression testing, and the goals and metrics they use to assess their success in regression testing. To get an overview, we asked the participants to elaborate on "What is regression testing for them? Why and when do they need to perform regression testing?" The practitioners' perceptions of regression testing are presented here:
I-1. "After adding new features or bug fixing we go back and try to see if this change has broken something else."
I-2. "In the event of any change, we have to perform regression testing to ensure that we have not damaged the quality of the existing software product's functionality."
I-3. "To ensures that nothing is broken and everything is working in the system."
I-4. "To make sure that whatever the bug fixed in the previous release, those do not break the existing working functionality."
I-5. "When we introduce new changes to the application, we perform regression testing on the other areas of the application that are not part of new changes. To make sure that the new changes do not affect the other parts of the application. The functionality of the other parts is working correctly."
I-6. "Regression testing is to verify the existing functionality did not get affected with the update to the existing code, and that is the main idea of this."
I-7. "Regression testing focuses on finding out the side effects that might cause because of bug fixes, or it might be side effects because of the implementation of the new features. Therefore our primary focus is to identify the side effects of the changes."
I-8. "Because of changing technology, we need to add new features to our product, and after adding new features, we perform regression testing to ensure that the current product is working fine."
I-9. "In general, regression testing is system testing that ensures the software's quality after the changes to the system. In our case, the changes are enhancements, patches, or any configuration changes. Along with the changes, regression testing helps identify new faults that could be the results of the bug-fixes."
I-10. "After every release, we need to make sure that the previous functionalities are workings. We have to ensure that all bugs are fixed, and there are no new bugs introduced so that the old functionalities are working together with the new functionalities."
I-11. "Regression testing checks that after any changes, if the system is working? It is to test whether the system is working according to the mentioned functionality. Hence, regression testing is a kind of functional re-testing."

TABLE 7 Interview participants
The practitioners use existing system tests for regression testing. They run the regression tests after modifications or bug fixing to check that the changes did not negatively affect the unchanged parts of the system. All the participants told us that they run a selected set of test cases while performing regression testing. However, the criteria for selecting the subset of test cases from the larger test suites vary among the practitioners. For instance, three of them (I-5, I-6, and I-7) select and prioritize test cases based on changes and their possible impact on other functionalities. In some cases, practitioners (I-1, I-8, and I-9) told us that they have a predefined set of test cases that they apply to test whether the basic functionality is working correctly after any system changes. Along with running the predefined set of test cases, they also run some sanity tests to ensure that other major functionalities are working correctly. Three participants (I-2, I-3, and I-4) told us that they prioritize the test cases based on the importance of the functionality. For instance, test cases that test the core functionalities have the highest priority. One participant (I-10) revealed that they prioritize the test cases based on robustness, and one participant (I-11) told us that they prioritize the test cases based on business impact.
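Two of the selection criteria described by the interviewees, selecting test cases by their relation to changes and ordering them by the importance of the functionality, can be combined in a few lines. The following Python sketch only illustrates the idea and is not a procedure reported by any participant; the function name `select_and_prioritize`, the module names, and the importance scores are hypothetical.

```python
def select_and_prioritize(tests, changed_modules):
    """Select tests touching changed modules, most important first.

    tests           -- dict: test name -> (covered modules, importance score)
    changed_modules -- modules modified in the current build
    """
    changed = set(changed_modules)
    # Keep only the tests that cover at least one changed module.
    selected = [
        name for name, (modules, _importance) in tests.items()
        if set(modules) & changed
    ]
    # Higher importance first; ties are broken by name for a stable order.
    return sorted(selected, key=lambda name: (-tests[name][1], name))


# Hypothetical test suite: covered modules and an importance score (3 = core).
tests = {
    "test_login": ({"auth"}, 3),
    "test_report": ({"reporting"}, 1),
    "test_checkout": ({"cart", "payment"}, 3),
    "test_profile": ({"auth", "profile"}, 2),
}

print(select_and_prioritize(tests, changed_modules={"auth", "payment"}))
```

Other criteria mentioned in the interviews, such as robustness or business impact, could replace the importance score without changing the structure of the sketch.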

Regression testing goals defined by the interview participants
To learn the practitioners' goals for regression testing, we asked: "What are the goals that you think are essential to achieve success in regression testing?" To grasp the practitioners' perception correctly, we often needed to rephrase this question from different angles. Table 8 presents the summary of regression testing goals identified from the interviews, whereas the practitioners' definitions of these goals are presented in the subsequent paragraphs.
TABLE 8 Regression testing goals: Interviews results. I-1 to I-11 are the practitioners' IDs and (✓) means that the goal was defined by the respective practitioner.
Of the 11, six practitioners defined G5: No or controlled fault slippage as their goal, with varying descriptions. The statements of the practitioners regarding this goal are:
I-1. "The goal is that no fault with priority 1 (high-risk faults) should slip through. We want to make sure that the customer should not find any such fault."
I-2. "All test cases in regression test pack should be executed with 100% pass, the goal is that customers should not find any fault."
I-4. "There should be no priority 1 or priority 2 defects in the system while releasing it to the client."
I-6. "The goal is that we should not let a fault slip through that can break the existing application."
I-9. "We try to avoid hotfixes after release, or at least we try to reduce the number of hotfixes."
I-11. "We try to maintain a 100% success rate, we do not want any fault slippage to our customer."
Four practitioners defined G6: Confidence as their goal. The practitioners want to be confident about the reliability and the reached quality of the product. The statements of the practitioners regarding confidence are as follows:
I-3. "The goal is to increase the confidence in the reliability of the system under test at a faster rate, and this could be only done with opting smarter approaches for regression testing."
I-4. "With the regression testing, I want to be confident that nothing should be broken in the system under test."
I-10. "We want to be confident that it is safe enough to release the product to the customer."
I-11. "The goal is to gain the customers' confidence and trust, and this is only possible if we are confident that we have tested enough and it is safe to release the product."
Three practitioners stated that G2: Early identification of critical faults is their goal. The practitioners' perspective in this regard is:
I-1. "We try to identify the high risk faults, early detection of critical faults is our goal."
I-3. "Early identification of critical bugs is our goal, we have to make it sure that there should be no sever faults in the key user functionalities."
Two practitioners stated that G7: Test suite maintenance is their goal. Test suite maintenance is listed as a regression testing challenge in the existing studies. 4,17,75,76 The practitioners' perspective regarding test suite maintenance is:
I-5. "To keep the test suite updated all the time so that it really helps with future releases. A well-maintained test suite is always an essential requirement for your success in regression testing."
I-7. "To make it sure that we should not miss any issues, we have to keep the test suite updated."
Two practitioners stated that G8: Team's awareness of changes and overall application knowledge has a significant impact on the success of regression testing. In our interview-based multicase study, 17 the practitioners highlighted it as a success criterion required to aid the achievement of various goals. The practitioners' perspective regarding this goal is:
I-5. "To keep the team educated always with the new changes and with the overall application knowledge is important."
I-8. "What are the fixes developers have made in the newer version? We need to learn that. We have to review the release note of the old version and get aware of the product. Having knowledge of such things is crucial for the success of regression testing."

Information needs and metrics defined by the interview participants
The next essential part of our investigation was to learn the practitioners' views on the information needs and metrics/measures used to achieve or evaluate success in regression testing. We asked a series of questions, for instance:
i. Do you measure or evaluate the goals?
ii. How do you measure?
iii. Which are the information needs necessary to achieve the goals?
iv. Which metrics do you use to evaluate the success goals?
While responding to the question, "Do you measure or evaluate the goals?," the response of the interview participants was a mix of yes and no. The majority of them responded, yes, we do measure; a couple of participants straightforwardly said no, we do not measure; and some of the participants told us that to some extent, they analyze the results.
In response to the second question, "How do you measure?," one of the participants revealed that they are using an agile-based tracking system to track the fulfillment of the goals. Some participants narrated that they make guesses based on their experience and product knowledge, whereas a couple of participants are using defect count as a measure to evaluate their goals. They have a defined threshold to decide if they can release the product. For instance, defect rate per unit time and the number of critical defects versus total defects are used to evaluate the success. Another measure that is being used is the ratio between the number of defects and test cases.
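The defect-based measures mentioned by the participants (defect rate per unit time, the share of critical defects among all defects, and the ratio of defects to test cases) and the use of a threshold for the release decision can be sketched as follows. This is an illustrative Python example under stated assumptions: the names `CycleStats`, `defect_measures`, and `can_release`, the sample numbers, and the threshold values are all hypothetical and do not come from the interviews.

```python
from dataclasses import dataclass


@dataclass
class CycleStats:
    """Raw counts from one regression cycle (hypothetical record)."""
    defects_found: int
    critical_defects: int
    test_cases_run: int
    duration_hours: float


def defect_measures(stats):
    """Compute the three defect-based measures named by the participants."""
    if stats.duration_hours <= 0 or stats.test_cases_run <= 0:
        raise ValueError("duration and number of test cases must be positive")
    critical_share = (
        stats.critical_defects / stats.defects_found
        if stats.defects_found else 0.0
    )
    return {
        "defects_per_hour": stats.defects_found / stats.duration_hours,
        "critical_share": critical_share,
        "defects_per_test_case": stats.defects_found / stats.test_cases_run,
    }


def can_release(measures, thresholds):
    """Release only if every thresholded measure is at or below its limit."""
    return all(measures[name] <= limit for name, limit in thresholds.items())


stats = CycleStats(defects_found=12, critical_defects=3,
                   test_cases_run=240, duration_hours=8.0)
measures = defect_measures(stats)

# Assumed limits; a team would set these for its own context.
thresholds = {"critical_share": 0.10, "defects_per_test_case": 0.10}
print(measures)
print(can_release(measures, thresholds))
```

With these sample numbers the share of critical defects exceeds its assumed limit, so the sketch reports that the release criteria are not met.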
From the responses of the participants regarding the questions, "Which are the information needs necessary to achieve the goals?" and

| Interview participants' responses on metrics identified from literature
In the last part of the interview, we presented the metrics that we found in the literature and asked the participants whether they recognize these metrics and what they think of their usefulness for measuring success. Six metrics were recognized by the interview participants (see Table 9). Three participants (I-3
I-5. In actual projects, we would not be using any of the metrics defined in the literature.
I-6. We are not familiar with the metrics given in the literature.
I-7. Some of the metrics presented in the literature may not be applicable in some areas. However, code coverage is a metric that is measurable.
I-9. We do not consider the metrics given in the literature, but somehow we follow these metrics as summary.
I-10. I am unable to answer this question because I am unaware of these metrics.
TABLE 9 Literature metrics of regression testing, recognized by the interview participants

| Online questionnaire
To illustrate the research perspective, we used results from the literature. To highlight the industry perspective, we interviewed practitioners representing nine companies from four different countries. We conducted the interviews with open questions about goals, information needs, and metrics to check for saturation, that is, whether further goals and metrics would emerge before surveying a larger set of people. The interviews yielded two new goals but nothing new regarding metrics; they also provided deeper qualitative insights. From the interview results, we observed that practitioners have a different perspective on regression testing goals and metrics. Although the practitioners have their own goals, to some extent they recognize the literature goals. However, almost half of them did not support the metrics/measures presented in the literature.
To gain insight from a larger set, we opted for an online questionnaire-based survey. To avoid any misinterpretations, we provided a brief description of each goal. We listed all goals that we found in the literature and interviews. We received 45 valid responses from practitioners working in different roles and having different levels of experience (Figure 1). The respondents work in different domains, including accounting and finance, automobile systems, business services, embedded systems, telecom, mobile applications, and medical devices. Figure 2 presents the details of the product domains in which the survey respondents work. Regression testing is highly important for 58% of the respondents, important for 20%, and moderately important for 22%. Among the respondents, 27 of 45 said that they implement selective regression testing (i.e., running a selected subset of test cases), whereas 18 of 45 stated that they implement a retest-all policy (i.e., running all test cases in the regression suite). Concerning product releases, the majority of respondents replied that they have multiple releases of their products every year. Only two of 45 respondents revealed that they have one release per year. Figure 3 presents the survey respondents' responses on regression testing goals, and an overall mapping of goals, information needs, and metrics is presented in Table 10.
FIGURE 1 Role and testing experience of the survey respondents
FIGURE 2 Software development domains on which survey respondents are working

Regression testing goals
Of the 45 respondents, 26 (58%) selected increasing test suite's rate of fault detection (G1) as their goal, 13 were neutral, and six disagreed. Thirty-nine (87%) agreed that early identification of critical faults (G2) is a goal for regression testing, four were neutral, and two disagreed. Thirty-seven (82%) agreed that the detection of faults related to changes (G3) is an essential goal for regression testing, two were neutral, and six disagreed. Only 18 (40%) chose coverage (G4) as their goal, 22 (49%) chose to remain neutral, and five disagreed. Thirty-five (78%) agreed that no or controlled fault slippage (G5) is their goal, eight opted to stay neutral, and only two disagreed. Thirty-five (78%) chose confidence (G6) as their goal, five remained neutral, and five disagreed. Besides these goals, 89% of the survey respondents suggested that test suite maintenance (G7) is an efficient way to contribute to the success of regression testing. Similarly, 80% emphasized that the team's awareness of changes and overall application knowledge (G8) are primary requirements for success in regression testing.
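The percentages above follow directly from the reported counts. As a small illustration of the descriptive statistics involved, the Python sketch below recomputes them for G1 to G6 and ranks the goals by agreement. The counts are taken from the text; the grouping into three categories, the function name `summarize`, and the variable names are assumptions made for this example. G7 and G8 are omitted because only their percentages are reported.

```python
TOTAL_RESPONDENTS = 45

# (agree, neutral, disagree) counts per goal, as reported in the text.
responses = {
    "G1": (26, 13, 6),
    "G2": (39, 4, 2),
    "G3": (37, 2, 6),
    "G4": (18, 22, 5),
    "G5": (35, 8, 2),
    "G6": (35, 5, 5),
}


def summarize(counts, total=TOTAL_RESPONDENTS):
    """Return rounded percentages for one goal, checking the counts add up."""
    if sum(counts) != total:
        raise ValueError(f"counts {counts} do not sum to {total}")
    labels = ("agree", "neutral", "disagree")
    return {label: round(100 * value / total)
            for label, value in zip(labels, counts)}


summary = {goal: summarize(counts) for goal, counts in responses.items()}

# Rank goals by the number of agreeing respondents (ties keep listing order).
ranking = sorted(responses, key=lambda goal: -responses[goal][0])

for goal in ranking:
    print(goal, summary[goal])
```

The ranking makes the contrast with the literature visible: G2, G3, G5, and G6 attract more agreement than G1, and coverage (G4) attracts the least.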

Information needs and metrics
In the survey questionnaire, we embedded the information needs and metrics with every goal. We also provided the free-text space to allow the respondents to state any goal, information need, or metric which we may not have listed in the questionnaire. The survey respondents did not mention any new goals. However, they listed a few information needs and metrics other than those we listed in the questionnaire.
Information needs: The survey respondents selected several information needs. The most mentioned information need is requirements information, which was listed against five different goals: 73% of the respondents mentioned it against G8, 69% against G7, 63% against G6, 42% against G5, and 5% against G1. Code changes was listed as an information need for two goals: 71% of the respondents mentioned it as an information need to achieve G7, and 41% think it is required to achieve G3. Past fault detection history is listed against G8 by 57% of the respondents. Similarly, fault dependence is listed against G2 by 37% of the respondents. Along with the listed information needs, the survey respondents stated their own information needs, including business impact against two goals (G2 and G4), domain knowledge against G8, and coverage of impacted modules against G3 and G5. Table 10 presents the details of the information needs along with the respective goals and metrics.
Metrics/measures: From the interviews, we learned that practitioners, at large, do not use the metrics defined in the literature to evaluate the success goals. They mainly make guesses based on their experience and knowledge. Only six interview participants endorsed a few metrics defined in the literature. Therefore, it was interesting to see whether the survey respondents recognize the metrics given in the literature. Using the results obtained from the literature, we listed a set of metrics for each goal. The respondents could select one, many, or none for the goals with which they agreed or strongly agreed. They were provided with free-text space to state their own choices if different from those provided. The majority of practitioners listed metrics/measures against each goal, with varying choices per goal. However, the majority preferred to choose from the list of given options. A few of the respondents provided metrics other than those in the given list. For instance, two respondents mentioned the pass percentage of the total number of scenarios executed as a metric to evaluate G6.
FIGURE 3 Regression testing goals: questionnaire results

| Using GQM to integrate research and practice perspectives
From the interview results (see Table 8), it is evident that the majority of the practitioners defined more than one goal for success in regression testing. The same trend was observed in the responses to the online questionnaire (see Figure 3). Although the authors of most techniques proposed in the literature focus on a single goal, they mention other goals. This suggests that a single goal alone cannot guarantee success. Finally, to be confident (G6) about the success in regression testing, testers want to ensure that no critical fault slips through (G5) to the customer. This highlights the need for a holistic map of goals and associated selection/prioritization strategies.
TABLE 10 GQM mapping of goals, information needs, and metrics: Survey
Using the GQM approach, we have created a model that maps goals, information needs, and metrics (see Figure 4). The model provides a combined view of the literature and survey findings, and it would be helpful for practitioners in adopting regression testing strategies suitable to their context. It will also help researchers to propose new techniques tailored to the industry's needs.
The model can be read through the following proposition: To achieve a goal "G," calculate metric "M" using relevant information needs "IN" and opt for the respective regression testing strategy.
Using the model, we can derive the following guidelines:
The performance of a test process could be gauged by the number of faults detected during the process. "Increasing test suite's rate of fault detection" is one of the regression testing goals identified in this study. The goal appeared in various studies, and the survey respondents also identified it. This goal corresponds to the effectiveness of regression testing and could be achieved by adding to the regression test suite those test cases that have more fault detection capability. Using fault detection history and requirements information could help in selecting the fault-revealing test cases. The benefit of increasing the test suite's rate of fault detection early in the testing process is quicker feedback on the system under test, an early start of debugging, and ultimately reduced testing time and cost. 36
Some faults are crucial and can break the product. Finding such faults early in the regression testing process is critical. "Early identification of critical faults" is part of the findings of this study. We found this goal in the literature, and the survey respondents also identified it. Critical faults could be of two types: (1) faults that affect the core functionality of the system under test and (2) leading faults, that is, faults that cause other faults to appear later in operation. Uncovering the critical faults requires identifying the modules that are badly affected and then prioritizing the test cases that cover the identified modules. 56 If a testing process fails to identify such faults early, there could be adverse outcomes. More precisely, the overall goal could be to identify more severe leading faults early in the testing process. 25
Early evaluation of changes could help identify critical faults, especially faults related to changes and bug fixing. "Detection of faults related to changes" is also identified as a regression testing goal in this study. Achieving this goal requires selecting the test cases based on the changes/bug fixes in the system. Detection of these faults is crucial, especially for scenarios like fixing critical bugs in an emergency and running tests under tight time schedules. 52
In selective regression testing, an important aspect that a testing practitioner considers is how much code would be covered by the selected test cases. 77 One of the interview participants stated that "Whenever we get a new build, we try to find the defects based on the changes and find the percentage of code covered against these newly found defects. So code coverage can also be considered as a part of success criteria. It is also imperative because it reduces the effort and cost of regression testing." Coverage alone cannot be the goal of a regression testing process, because maximizing coverage does not guarantee fault detection. Instead, it helps in minimizing the test execution time and cost. Coverage can be evaluated using code coverage metrics like method coverage, statement coverage, and branch coverage. The techniques using coverage-based metrics could additionally use program mutation to measure the fault-exposing potential.
While releasing a product to the customer, the team wants to ensure that the customer does not find any fault after release. In our previous study, 16 the practitioners labeled this goal as "no-fault slippage." We argue that setting a goal of no-fault slippage does not mean that there would be no fault slip through (FST). Therefore, it would be more appropriate to set a goal of controlling or minimizing the FST. The majority of interview participants and survey respondents highlighted that controlling fault slippage to the customer is one of the essential success goals of regression testing. Fault slippage is the primary reason for higher rework costs. Damm et al 79 introduced a metric called FST to determine the faults that would have been more cost-effective to find earlier in the testing process. Keeping the fault slippage rate as low as possible may help the managers decide about releasing the product, provided that they are confident that no known fault is going to slip through. 16,17 This study suggests that to control the fault slippage, the practitioners need to focus on the other goals (G1, G2, G3, and G4).
Practitioners frequently use the term confidence concerning their success in regression testing. They want to be confident about their regression testing process that they have uncovered and fixed all such bugs that can break the system under test. "Confidence" appeared as a regression testing goal in the studies conducted in the industry context. In a focus group workshop, 16  While interacting with the practitioners during this and our past studies, 6,16,17 we learned that practitioners do not evaluate the achievement of the goals as they do not have any mechanism to follow the goals' achievement. Instead, they rely on expert judgment to guess the achievement of their goals. However, making a judgment without a formal mechanism may negatively impact the outcomes. Moreover, without a formal mechanism, the practitioners may overlook some essential aspects while making assessments/judgments. 80 As a step forward, we have proposed a GQM model to guide the practitioners to follow the goals. However, better information maintenance strategies are required to ensure achieving/ evaluating regression testing goals. The practitioners are aware of this as they recognized that information maintenance is a challenge in the companies, and there is a desire to improve the information maintenance strategies. 17 We argue that the GQM could potentially be used in many organizations that conduct regression testing. What the organizations would have to do is to prioritize the goals for their specific context. They can add their goals that are not yet captured in the model. Further, the organizations have to choose the metrics they wish to use. One factor here is the cost of collecting the metrics, which may vary with the test framework used (e.g., lacking the ability to collect the measures automatically).
Thus, depending on the context, different measures would be chosen. The companies could use the method in Gencel et al. 81 Having a starting point (our model) will help them use the method and select relevant measures.
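To make the idea of context-dependent selection concrete, the following sketch represents a GQM model as goals, questions, and metrics, and picks metrics for the highest-priority goals within a collection-cost budget. It is only an illustration under stated assumptions: the class names, the `priority` and `collection_cost` fields, the greedy strategy, the example questions, and the "obsolete test cases" metric are hypothetical, and the procedure is not the method of Gencel et al. 81

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    collection_cost: int  # hypothetical relative effort (low = automatic)


@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)


@dataclass
class Goal:
    name: str
    priority: int  # 1 = most important in the organization's context
    questions: list = field(default_factory=list)


def select_metrics(goals, budget):
    """Greedily pick metrics for the highest-priority goals within a cost budget."""
    selected = []
    remaining = budget
    for goal in sorted(goals, key=lambda g: g.priority):
        for question in goal.questions:
            # Within a question, try the cheapest metrics first.
            for metric in sorted(question.metrics, key=lambda m: m.collection_cost):
                if metric.collection_cost <= remaining:
                    selected.append((goal.name, metric.name))
                    remaining -= metric.collection_cost
    return selected


# Illustrative model: priorities, questions, and costs are made up for the example.
goals = [
    Goal("Fault detection rate", 3,
         [Question("How early are faults detected?", [Metric("APFD", 4)])]),
    Goal("Controlled fault slippage", 1,
         [Question("How many faults reach the customer?",
                   [Metric("fault slippage rate", 2)])]),
    Goal("Test suite maintenance", 2,
         [Question("Is the test suite kept up to date?",
                   [Metric("obsolete test cases", 3)])]),
]
selected = select_metrics(goals, budget=5)
for goal_name, metric_name in selected:
    print(f"{goal_name}: {metric_name}")
```

With a budget of 5, the two highest-priority goals consume the whole budget, so the costlier metric of the lowest-priority goal is left out. A different context (other priorities or a test framework that collects measures automatically) would lead to a different selection.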
The validation of the GQM model is not part of this work; the model is a proposal based on the literature and survey findings. However, in the future, we aim to extend this proposal and validate it with industry practitioners. The evaluation will entail prioritizing the goals to decide on measures.
As the GQM model is not static, new goals and measures will be identified with further contexts and developments. We plan to create guidelines for updating and extending the GQM model with forthcoming studies. The current GQM model serves as a baseline for people to use. With further usage, it will become more complete and comprehensive. In the future, we would also like to investigate the importance of different goals, questions, and measures depending on context. This would allow practitioners to select the right metrics. We plan to follow the approach suggested by Gencel et al. 81

| SUMMARY AND CONCLUSION
The study explored the regression testing goals, information needs, and metrics from the research and practice perspectives. The quantitative and qualitative data were collected using a literature review, interviews, and an online questionnaire. The purpose was to present an integrated view of literature and industry perspectives on regression testing goals. To present the literature perspective of regression testing goals (RQ1), we conducted a literature review of 33 research papers. In addition, we looked at 11 systematic reviews published between 2017 and 2022. Except for a couple of studies explicitly focusing on regression testing goals, most of the studies discuss regression testing goals while proposing, evaluating, or reviewing regression testing techniques. From the literature, we found six regression testing goals, two of which were identified from the studies representing the practitioners' perspective. Most of the authors evaluate their techniques in terms of fault detection rate by using the APFD metric. The information needs mentioned to evaluate the fault detection rate are "requirement information," "test case execution/fault detection history," and "coverage-based information." Other goals mentioned in the literature are "early identification of critical faults," "detection of faults related to changes," and "coverage." The goals identified from the industry-related literature are "no or controlled fault slippage" and "confidence." A complete mapping of regression testing goals, information needs, and metrics found from the literature is presented in Table 6. To date, literature reviews on the topic of regression testing goals are lacking. This study provides a step forward in this context.
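For readers unfamiliar with the APFD metric mentioned above, the following minimal sketch computes it with the commonly used formula APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m is the number of faults, and TF_i is the position of the first test case in the prioritized order that detects fault i. The function name, parameter names, and example data are assumptions made for illustration, and the sketch assumes that every fault is detected by at least one test case.

```python
def apfd(first_detecting_positions, num_tests):
    """Average percentage of faults detected for one prioritized test order.

    first_detecting_positions: for each fault, the 1-based position of the
    first test case in the order that detects it.
    num_tests: total number of test cases in the order.
    """
    m = len(first_detecting_positions)
    n = num_tests
    if m == 0 or n == 0:
        raise ValueError("APFD needs at least one fault and one test case")
    if any(pos < 1 or pos > n for pos in first_detecting_positions):
        raise ValueError("positions must lie between 1 and num_tests")
    return 1 - sum(first_detecting_positions) / (n * m) + 1 / (2 * n)


# Illustrative data: 5 test cases and 4 faults, first detected by the
# tests at positions 1, 1, 2, and 4 of the prioritized order.
score = apfd([1, 1, 2, 4], num_tests=5)
print(f"APFD: {score:.2f}")
```

An order that detects the same faults later yields a lower value, which is why the metric is used to compare prioritization techniques. Computing it requires the fault detection history of the test cases, one of the information needs listed above.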
To present the practitioners' perspective (RQ2), we conducted a survey comprising 11 interviews and 45 responses to an online questionnaire. We observed that the interview participants have varying perspectives on regression testing goals. In the overall survey results, we learned that, besides recognizing the literature goals, the practitioners emphasize their own goals, including (i) test suite maintenance and (ii) the team's awareness of changes and overall application knowledge. They recognized only a few of the information needs and metrics identified from the literature. The practitioners also suggested some information needs and metrics, including domain knowledge, business impact, coverage of impacted modules, and pass percentage of executed scenarios. A complete list of goals, information needs, and corresponding metrics selected/defined by the survey respondents is presented in Tables 8 and 10.
To compare the research and practice perspectives on regression testing goals, we have created a GQM model (RQ3). The model presents an integrated view of the literature and the practitioners' perspectives. Researchers can utilize this model to align their research closer to the industry context while proposing new regression testing techniques. Similarly, the practitioners can utilize this model to better follow goal-based regression testing strategies. Based on the findings of our study, we suggest that researchers should consider multiobjective strategies while proposing and evaluating regression testing techniques. They need to incorporate no or controlled fault slippage (G6) as a primary goal of the proposed techniques. Doing so will give practitioners confidence that applying such techniques will help control fault slippage.
The results provide a basis for future research on the evaluation of regression testing, and the GQM model presented in this study is a step forward in this direction. Furthermore, this study's findings will help researchers propose new methods that align with the practitioners' regression testing goals, hence contributing to the adoption of research on regression testing in the industry. The identified goals and metrics will also help the practitioners to assess new techniques while adopting them. Because many of the metrics listed in this study are not yet used in the industry, the list gives practitioners an opportunity to try out new metrics. This study's overall contribution is to help reduce the gap between the research and practice of regression testing.

| FUTURE WORK
In the future, we aim to extend the GQM model and validate it with industry practitioners. We will also work on the possibility of using fault mutation strategies as an alternative measure of actual faults. Based on the findings, we plan to broaden the scope of this study and to work on guidelines to help practitioners in selective regression testing. The guidelines will be in the form of checklist-based models and closely associated with the GQM model proposed in this study. It will be an empirical study, and all the proposals will be validated with the help of practitioners.

DATA AVAILABILITY STATEMENT
We have provided the interview guide and a link to the online questionnaire in the paper. However, due to NDA, we can neither provide the original transcripts from the interviews nor share the data collected from the questionnaire. We have nevertheless provided sufficient detail in the manuscript. Researchers who want to replicate this study can benefit from the interview guide and online questionnaire. Furthermore, interested researchers can contact the corresponding author for further detail.

A.2 | Respondent background
In this section, the interviewers are interested to know about the participant's professional background, organizational role, and responsibilities.
Question 1: Could you please briefly describe your professional background?
∘ For how long have you been working with this organization?
∘ What is your role in the organization?
∘ For how long have you been in this role?
∘ What kind of products does your organization deal with?
Question 2: How will you define your expertise?
∘ Software engineering
∘ Software development
∘ Software testing
Question 3: Please specify about your current job.
∘ Your current team
∘ Your role in the team

A.3 | Interview part to explore the regression testing goals
We are heading to the core part of this interview, and we are interested to know about the practitioners' perspective on regression testing goals.
We will also be interested to know about the information needs and metrics used to measure these goals. Please feel free to add detail at any point of the interview that you think we missed asking about or forgot to describe.

Defining regression testing
We know the academic definition of regression testing, and we are interested in learning the perception of regression testing that prevails in practice.
Question 1: How do you perceive regression testing?
Question 2: How do you perform regression testing?
Question 3: Are you satisfied with the regression testing approaches used in your team/organization?

Success in regression testing
To determine the success of any activity, we measure it against the predefined goals, that is, whether the goals have been met or not.
Question 1: At your company/team, do you define success goals?
Question 2: What are the goals that you think are essential to achieve success in regression testing?
Question 2: Which are the information needs necessary to achieve the goals?
Question 2: Do you measure the success? Or do you measure or evaluate the goals?
Question 3: How do you measure?
Question 4: How will you determine that the desired goals have been achieved? Or which metrics do you use to evaluate the success goals?

Closing questions
We mentioned earlier that this research aims to compare the literature and practitioners' perspective on regression testing goals. Because you have given us a walkthrough of your regression testing process and your success goals and measures, we want to know your opinion on the literature findings. We have identified the following goals and measures from the literature.
Question 1: Which of these goals do you think are aligned to your perspectives?
Question 2: Which metrics do you think can be used in your environment?
Question 3: Do you want to share some more information that you think is important to consider that we may have missed?