Types of domain and task‐solving information in media scholars' data interaction

The purpose of this study is to examine what types of domain and task‐solving information media scholars need while interacting with research data to create new knowledge. The study is situated within information interaction research in information science. The approach is user‐oriented and qualitative. The research data consist of interviews of 25 media scholars about their interactions with research data. In the analysis, deductive and inductive approaches were combined to identify domain and task‐solving information types from the interview data. The results show that media scholars needed two domain information types and three task‐solving information types while interacting with research data. The domain information types were (1) earlier research information and (2) experience‐based domain information. The task‐solving information types were (1) information about methods and tools, (2) information about rules and norms, and (3) self‐created task‐supporting information. Of these, experience‐based domain information and self‐created task‐supporting information have been less considered in prior research on domain and task‐solving information. The findings of this study are useful for providing support for data interaction. Furthermore, the study sheds light on the concepts of domain information and task‐solving information in the context of interacting with research data.


1 | INTRODUCTION
Information use is not an end in itself but rather a vehicle to accomplish some underlying task. The present research takes a task-based approach and examines what types of information are needed in media scholars' interactions with their research data in knowledge creation. In particular, we focus on the domain and task-solving information needed during the task. These information types are often intangible and contribute to persons' tacit knowledge formation. Therefore, they are difficult to examine. In information science, these types of information are not as widely studied as so-called task information, which means topical or factual information that contributes to persons' propositional knowledge (i.e., knowledge of facts or truths; Lemos, 2007, p. 2). Specifically, domain and task-solving information have received limited attention in research on interactions with research data. For example, prior research on these information types has dealt with different domains (e.g., design-related; Zhang et al., 2020) or focused more narrowly on search tasks (e.g., Wildemuth, 2004; Zhang et al., 2013), research planning (Vakkari, 2000; Vakkari & Hakala, 2000), one information resource (Choi et al., 2022), or one research data type (Hemphill et al., 2021). Furthermore, some studies on the information needed have focused only on situations of reusing research data (Börjesson et al., 2022; Faniel et al., 2019). In this research, we approach interactions with research data more holistically. We interviewed 25 media scholars, who in total gathered and used several types of research data, to examine what types of domain and task-solving information they needed in their interactions with research data.
The key concepts in this research are (1) data interaction, (2) media studies, and (3) information needed in tasks, specifically domain information and task-solving information. First, in this research, we focus on the interaction between humans and research data, for which we use the term data interaction. Therefore, this research is positioned within the broader perspective of information interaction, which studies the interaction between humans and information (Fidel, 2012, p. 17). Furthermore, we define research data in a practical way, meaning any data that researchers use as inputs to create new knowledge, regardless of whether they prefer using the term research data, primary sources, or research materials. In information science, one way to approach data interaction is to study information activities. For example, previous studies examined the information activities of humanities scholars (Leigh et al., 2021) and historians (Korkeamäki & Kumpulainen, 2019; Late & Kumpulainen, 2022), who used archival and digitized archival materials to create new knowledge. However, our focus is not on the activities but on the types of information needed in data interaction. We thereby contribute to the groundwork for understanding what these information types are. This is important because the different types of information may have different uses in task completion and may need different kinds of support to acquire them (Ingwersen & Järvelin, 2005, pp. 76-78).
Second, the context of our study is media studies, a field that lies at the intersection of humanities and social sciences (Jensen, 2012). This allows us to study interactions with the various research data types used in the field, including newer data types such as social media data. Furthermore, media scholars conduct research within a media landscape that is constantly changing. The changes in the media landscape are often reflected in media scholars' research interests, and they also affect the kinds of methodological and ethical challenges media scholars face while interacting with research data (e.g., how to study ephemeral digital content that disappears shortly after its publication; Bainotti et al., 2021; Kelly, 2022). The various research data types used, the continuous changes in the media landscape, and the complexity of the knowledge-creation process make media studies a very fruitful domain for studying data interaction.
Third, this research is based on Byström's (1999, pp. 45-47) typology of information needed in tasks. The original context of Byström's work was municipal administration, where many of the tasks were routine but which also included more complex tasks. Likewise, in media studies contexts, tasks are often complex and information-intensive. Byström's (1999, pp. 45-47) typology of information consists of three information types. The first type, task information, is specific in nature in that it is needed only for the task at hand. We examined task information in our previous study (Korkeamäki et al., 2022) with a narrow focus on a specific task type, data gathering, in media scholars' knowledge creation. In this paper, we examine the two remaining information types: domain information and task-solving information. Domain information refers to information within the task domain, whereas task-solving information deals with how to perform the task (Byström, 1999, pp. 45-47). Both differ from task information in being more general in nature and needed in several tasks. Therefore, in this paper, we take a broader scope on media scholars' data interaction. We look at media scholars' research processes from research planning to data analysis to examine what domain and task-solving information they need while interacting with research data. We use the term "information" to mean information-as-process, where information concerns the process of becoming informed (Buckland, 1991, pp. 6, 107-109). Furthermore, we are interested in the types of domain and task-solving information needed, regardless of whether the information is new to the person who needs it or is in the form of the person's expertise or skills. Our study was guided by two research questions:
RQ1. What domain information types do media scholars need while interacting with research data to create new knowledge?
RQ2. What task-solving information types do media scholars need while interacting with research data to create new knowledge?

2 | LITERATURE REVIEW
Byström's (1999, pp. 45-47) typology of information needed in tasks, on which this study is based, was derived from research in expert systems development (cf., Barr & Feigenbaum, 1981) and information science (Byström & Järvelin, 1995). Although the context of Byström's (1999) work was municipal administration, where work tasks varied from routine to more complex, we considered the typology useful in the media studies context for three reasons. First, the typology consists of three conceptually different types of information (task information, domain information, and task-solving information), each of which has its own significance in completing the task (Byström, 1999, pp. 45-47). This makes the typology quite general, yet distinctive, increasing its applicability to other contexts. Second, Byström (1999, pp. 105-108) discovered that the more complex tasks in municipal administration required more (potentially, all three) types of information to complete. Potentially, all three types of information are also needed in the complex, information-intensive research tasks of media scholars. Third, the typology is suitable for studying information as process (Buckland, 1991, pp. 6, 107-109; Byström & Järvelin, 1995), which is our approach in this study.
Next, in accordance with our research questions, we review the literature regarding domain information and task-solving information. Task information, being inherently different from the other two, was studied in Korkeamäki et al. (2022) and is beyond the scope of this paper.

| Prior studies about domain information
Byström (1999, pp. 45-47, 137) defined domain information as general information within the task domain. More specifically, Vakkari and Hakala (2000) and Vakkari (2000) distinguished between types of information in an academic context that can be regarded as domain information: background information (used to orientate oneself to the research topic), theoretical information (theories, models, and conceptual frames), and empirical information (empirical research results). Zhang et al. (2020) found that university students needed domain information for their design-related creative projects, which included background information, theoretical information, and historical information related to the topic of their projects. Furthermore, Li et al. (2022) discovered that domain information for creative projects was sought using search engines, question-answering sites, and social media sites.
Prior studies have also focused on the domain knowledge or domain expertise that individuals have and how this affected their search tactics, search performance, or learning outcomes from the search tasks performed. Wildemuth (2004) argued that medical students' domain knowledge affected their search tactics when searching a factual database in microbiology. Zhang et al. (2013) studied the impact of domain knowledge on search performance and made a distinction between knowledge about the search topic (topic knowledge) and knowledge that is not about the search topic specifically but helps to understand it (background knowledge). The authors found that participants' topic knowledge correlated significantly with search performance in individual tasks, whereas background knowledge correlated significantly with search performance over all tasks. Liu and Zhang (2019) found that university students and postdoctoral researchers with high and low domain knowledge levels differed in how they selected and viewed documents from search results. Furthermore, O'Brien et al. (2020) studied how domain expertise affects learning outcomes from search tasks performed in a digital library. The authors did not find significant differences in learning outcomes between domain experts and non-experts.

| Prior studies about task-solving information
Byström (1999, pp. 45-47, 109) defined task-solving information as methodological or instructional information that helps in performing the task. Vakkari and Hakala (2000) found that university students needed methods information especially in the later stages of writing research proposals. Kern and Hienert (2018) found that social scientists needed information about research methods and tools (e.g., about instruments, data collection, or analysis methods). Hemphill et al. (2021) studied social media researchers' data management practices. The results showed that skills and knowledge about methods and tools (e.g., analytical skills or knowledge about web scraping) and research ethics (e.g., "understanding of privacy issues/ethics of social media data") were considered important while working with social media data. Zhang et al. (2020) noted that four of the information types identified in their study are comparable to Byström and Järvelin's (1995) problem-solving information: (i) instructions (e.g., process steps or demonstrations), (ii) tips, opinions, or recommendations, (iii) finished examples (e.g., to illustrate a technique), and (iv) information that inspires one to continue the project. A study by Choi et al. (2022) presents a typology of information sought from one procedural information resource in a specific professional field, that of intelligence analysts. The authors found five information types: background information, term definitions, procedure applicability, information about detailed steps with rationales, and advice from people.
Existing research has also studied procedural information sources, source modalities, and search behavior. Li et al. (2022) found that search engines, videos, Q&A sites, and social media sites were used as procedural information sources for creative projects. Pardi et al. (2019) examined participants' navigation and viewing behavior during procedural search tasks and their choices between search results of two modalities (website vs. video sources). The results suggested that searchers may have preferences for different procedural information source modalities. Search log analyses showed that people search for procedural information on the Web using natural language queries that begin with words like "how to" (Eickhoff et al., 2014; Völske et al., 2015). In their study about search tactics in solving how-to technical tasks, Rutter et al. (2019) found that participants were able to find procedural information that was relevant but not necessarily useful in solving the task. Furthermore, Urgo and Arguello (2022) discovered that performing search tasks with procedural objectives required more creating than tasks with factual and conceptual objectives.
Regarding system support for procedural information retrieval, Kelly et al. (2002) and Murdock et al. (2007) utilized characteristics typical of procedural documents. Yang and Nyberg (2015) proposed recommending subtask queries to users. Alemu and Huang (2020) constructed a procedural knowledge base containing task frames where tasks and related methods and actions are organized hierarchically.

| Data collection
We conducted a qualitative study. The data were collected using semi-structured and critical incident interviews between November 2019 and April 2020. All participants were adults and provided informed consent; thus, an ethical approval procedure was not required by the research organization. In Finland, an ethical review from a human sciences ethics committee is required when the research (1) deviates from the informed consent principle, (2) intervenes in participants' physical integrity, (3) involves participants under 15 years of age whose parent or carer is not asked for a separate consent or informed about the study, (4) exposes participants to exceptionally strong stimuli, (5) risks causing mental harm that exceeds the limits of normal daily life, or (6) puts the safety of participants, researchers, or those closest to them at risk (Finnish National Board on Research Integrity, 2019, sect. 4.2). None of these apply to this study.

| Recruitment and participants
We used purposive sampling to select participants. Specifically, we sought a maximum variation sample (Patton, 2002, pp. 234-235) with the following criteria. We looked for academic researchers from the fields of media, communication, and game studies who had an ongoing or recently concluded research project so that they still remembered the research process clearly. We also wanted to recruit participants from different academic career stages and from different universities to increase variability in their research experience, research interests, and research data.
Invitations were sent via email to potential participants in three universities in Finland. In addition, the invitation was presented face-to-face to one research group. We stopped collecting interviews after we had a sample that was diverse in terms of the sampling criteria (representatives from media, communication, and game studies; representatives from different career stages and universities; variability in their research experience, research interests, and research data) and was sufficient to identify patterns across the interviews (Patton, 2002, p. 235). Certain information began to repeat, which indicates data saturation. Furthermore, many of the participants were articulate and reflective, which increased the quality of the interviews and their information power (Malterud et al., 2016).
In Table 1, participants' background information is presented in a coarsened or categorized form to prevent identification. We recruited a total of 25 participants for this study from three universities. Sixteen participants had a doctorate; their career stages ranged from post-doctoral researchers, university researchers, or university lecturers to professors. Nine were doctoral researchers. Participants' research experience varied from under one to over 20 years. They positioned themselves in the fields of media, communication, or game studies, with several of them mentioning more specific fields or approaches (e.g., film history, film studies, journalism, social media research, visual research, audience research, critical research, humanistic research, feminism research, or political research). Participants' research data also varied. For anonymization purposes in Table 1, their research data are typified as either researcher-generated data that exist only because the researcher gathered them specifically for the research or naturally occurring data that were originally created for other than research purposes and became research data when the researcher gathered and used them as such (Lester & O'Reilly, 2019, pp. 97-122). Participants' generated data included survey, interview, and workshop data. Naturally occurring data included journalistic texts, political texts, monographs, social media data (e.g., posts on social media or online forums), TV programs, films, and related material (e.g., PR material).

| Semi-structured interviews
In designing the semi-structured interview questions, we utilized a task-based information interaction (TBII) model (Järvelin et al., 2015) that is individual-oriented in how it emphasizes cognitive and behavioral activities of individuals. The model presents five information activities that are included in task-based information interaction. We utilized four of them (planning and reflective assessment, searching, selecting, working with information items) in planning the interview questions, whereas the fifth activity (synthesizing and reporting) was outside the scope of our study because we were interested in the data interactions rather than writing, which does not include direct interactions with research data. We also had interview questions about the research community (working in a research group) and rules and norms (see Allen et al., 2011) from the perspective of individual researchers. Scholars follow certain field-specific scientific conventions such as research ethics and recommendations on where the research data should be stored. Because such rules and norms are an important part of doing research, they are important to consider when studying media scholars' data interaction.
The participants were instructed to choose an ongoing or recently concluded research project to be discussed in the interviews. The semi-structured interview guide (see Korkeamäki et al., 2022, appendix) had background questions, questions about participants' research topics and processes, questions about working in a research group (if applicable), questions about participants' research data (i.e., what were the data like), and data interaction (collecting, finding, selecting, analyzing, archiving, and managing research data), as well as questions about research ethics, ownership, and licensing related to the research data. Follow-up questions (e.g., for clarification or examples) were also asked (see Roulston, 2010, pp. 9-32). A total of 25 semi-structured interviews were carried out (15 face-to-face and 10 by phone). Switching to remote interviews was necessary because of the onset of the COVID-19 pandemic and to stay on schedule with the interviews. A phone was used instead of videoconferencing because at the beginning of the pandemic there was uncertainty about suitable videoconferencing tools that meet the EU's General Data Protection Regulation (GDPR) requirements and because we wanted to treat all participants the same for the remote semi-structured interviews. When we later learnt of suitable videoconferencing tools recommended by the research organization, we offered some participants the option to demonstrate their work in a critical incident interview via a video connection (only one participant chose this option). The semi-structured interviews were audio recorded (46 min-1 h 16 min each, 24 h 26 min in total) and later transcribed word for word (290 pages in total).

| Critical incident interviews
Participants were asked to demonstrate their work in a critical incident interview (Flanagan, 1954) that was carried out at the end of the semi-structured interview. More specifically, participants were asked to demonstrate how they had recently worked with their research data, for example, how they had searched, collected, analyzed, or worked in other ways with their research data. The demonstrations helped participants to talk about their interactions with their research data in greater detail. We did not seek saturation in the critical incident data; their purpose was to complement the semi-structured interviews. Only 12 participants took part in a critical incident interview (11 face-to-face and one via Microsoft Teams). Some chose not to participate in the critical incident interview (neither face-to-face nor online) because they found it difficult to demonstrate just one part of their research. Online participation may have been low because not all people were necessarily accustomed to using videoconferencing tools in the early stages of the pandemic, or because doing the online demonstration required switching from the phone (used for the remote semi-structured interviews) to a videoconferencing tool. Ten critical incident interviews were video-recorded (6-21 min each, 2 h 6 min in total) and later transcribed (27 pages in total). Two critical incident interviews were captured by taking photographs. Names of persons or organizations were removed from the data.

| Data analysis
We analyzed the data by using a combination of deductive (theory-driven) and inductive (data-driven) analysis. A deductive approach is a top-down process where a theory is applied to the data, whereas an inductive approach starts with the data to derive patterns (Ormston et al., 2014, pp. 6-7). We started the analysis with a deductive approach and then continued with an inductive approach.
In the deductive approach, we started with Byström's (1999, pp. 45-47, 109) definitions of domain and task-solving information. Byström defined domain information as general information within the task domain and task-solving information as methodological or instructional information that helps in the task performance. However, Byström studied them in a different context, which was municipal officials' work. Therefore, we started by considering what domain information and task-solving information mean in the context of media scholars' research work and how participants talked about them in relation to data interaction. We continued with subcoding (Miles et al., 2020, p. 72) to identify different domain and task-solving information types. We compared the similarities and differences of the articulations of participants that dealt with their views and experiences about what kind of domain and task-solving information they needed to create new knowledge. During subcoding, the analysis was mostly inductive, which means that we openly explored what domain and task-solving information types could be identified from the interviews. As an exception to this, the idea of looking at information needed in relation to rules and norms in data interaction was influenced by Allen et al. (2011), who saw rules and norms as one of the contexts of individuals' information activities. Although we did not analyze the contexts of information activities, we paid attention to information that participants needed about rules and norms in data interaction. We conceptualized information about rules and norms as part of task-solving information because participants talked about them in relation to how to perform the task (e.g., how to collect research data in ways that are ethical). Furthermore, the viewpoint for identifying the information types was that they were needed in the research process, regardless of whether participants referred to them as knowledge, expertise, skills, or new information.
Lastly, we used pattern coding (Miles et al., 2020, pp. 79-83) to identify a total of five information types, which are presented in the Results section. It involved going through the subcodes several times, reading the interview quotations attached to them, and reading the accompanying notes created during the analysis. We inductively looked for patterns across the subcodes where participants were essentially talking about needing the same type of information. We grouped the subcodes accordingly to reach a higher level of abstraction. A deductive approach was also present in the sense that we aimed to construct patterns that are inherently consistent with Byström's (1999, pp. 45-47) domain and task-solving information while also remaining sensitive to what domain and task-solving information might look like based on our research data in the media studies context.
The analysis process was iterative. Although the actual coding was done by one researcher (i.e., one researcher read the interview transcripts through several times and carried out the coding using ATLAS.ti software), the coding and analysis were discussed several times during the process in a group of three researchers to reach agreement. Example quotations (typical or showing variance) were selected and translated from Finnish to English to illustrate the findings.

| RESULTS
We identified two domain information types and three task-solving information types that participants needed while interacting with their research data. Domain information included (1) earlier research information and (2) experience-based domain information. Task-solving information included (1) information about methods and tools, (2) information about rules and norms, and (3) self-created task-supporting information. They are presented in Table 2 and described in more detail in the following sections.

| Domain information types
In this study, domain information means information within the research domain (i.e., within each participant's research field or research topic). We identified two domain information types: earlier research information and experience-based domain information. The former concerns information about earlier research that participants gained in academic contexts, for example, through academic studies or research experience. The latter refers to information that participants gained through leisure activities or non-academic work experience. Both contributed to the same purpose, participants' data interaction, but in differing ways.

| Experience-based domain information
Experience-based domain information is about understanding the world of the phenomenon under study because of personal experience or long-term interest in it. This type of information was gained in non-academic contexts. It included information gained through work experience that was external to academic work (e.g., working as a journalist in a media organization) and information gained through leisure activities (e.g., serious leisure, hobbies, or informal study).
Some participants talked about the connection between leisure activities and research work, saying that knowledge gained from leisure activities, personal media use, and experiences gave them new research ideas, helped them to understand the research topics better or look at the phenomena from different sides.

I think that, originally, this [research] topic came about because discussions related to this topic […] started to appear in some of my [social media] feeds […] (P19)
I have memories from this period that we are examining in our study. I have seen some of the programs as a child, so it does have some meaning to what kind of understanding one has of that object of study. (P25)
There were also examples regarding data gathering and analysis. Some said that having worked with people of a certain age or profession or engaging in serious leisure was useful in figuring out where to find potential interviewees or survey respondents.
Because I have worked in [a media organization], I have some idea where this age group, or people [who could be potential research participants], where they go and how to contact them. (P3)
Related to my free time […] through that I have found contacts and that is why [the organization] was selected. (P5)
Two participants referred to their personal interests and to their own social media use when describing how they decided to gather research data through a specific social media platform. One of them described making the choice as follows:
There was this [social media] group related to [an event], where I was also [a member], partly because of this research topic and my interest in it, but also because of myself, out of personal interest. […] So, I contacted the admin of the group and asked if it's ok to post this [interview] invitation to the group […] (P20)
For a participant who collected social media data, having observed the phenomenon over several years (in academic and non-academic contexts) was useful in evaluating the coverage of the research material "that it starts to represent the scene more widely" (P3). Another participant talked about conducting qualitative interviews and explained how having the same kind of serious leisure activities as the interviewees can be useful in the interview situation.
If you do interviews […] when you are in the same world with the interviewees, you gain the trust in a different way […] and you speak the same language with them. (P1)
Yet another participant explained how one's own knowledge and experiences about serious leisure activities can be helpful in analyzing research data related to the matter, while emphasizing the importance of being reflective on how this could affect the research.

When you look at [the research data] […] you recognize certain things you have experienced and seen in your own life […] and then you can think about it also from that point of view. (P4)

| Task-solving information types
In this study, task-solving information deals with how to conduct the research. Participants needed information about methods, tools, rules, and norms. They also created task-supporting information to monitor their research process and to help with their thinking process.

| Information about methods and tools
Information about methods and tools means information about suitable research methods and tools for data collection and analysis and information about how to use them. Several participants used familiar research methods and tools, meaning they had used them before in other research contexts.
But this was maybe the first real [research] project where I realized that, aha, now this [tool] proved to be useful […] (P19)
Participants also looked for new information about methods and tools. Some needed advice and guidance. For example, one participant expressed uncertainty about using specific analysis methods and tools and felt they had to "reinvent some wheels" because there was nobody to ask for advice. Information was also sought on research methods and tools used by others in similar research settings.

I tried to find if someone else had done something similar and I didn't really find anything […] I had to create the process myself […] (P7)
Choosing the tools also involved testing and comparing different options in terms of how they would fit the specific research purposes. For example, one participant tried writing about data items in a spreadsheet but was not yet sure if it was the best format for the purpose.
I tried to start writing about each [data item] to Excel, but I don't think it was that good of a format for it […]. I don't know yet how I will do it […] (P6)
Some used good enough options available because they were not sure whether more suitable tools existed or did not have access to them. A participant who was copy-pasting textual data items to self-created worksheets said that the way of working was not ideal because differentiating one's own analytical notes from direct quotations required a lot of manual formatting.
Information was also gained in collaboration. A participant described how the members of their research group gathered around the same table to search for suitable tools for data collection and to practice how to use them. From a workflow perspective, there was also the question of data format compatibility across work phases and suitable data storage options. Some were not sure how different file formats could be processed with the tools or had trouble opening old files created with an older version of the analysis software. Participants also had to reconcile the available data formats with their initial ideas about how to do things.
It's a bit open how [the analysis tool] works with the PDFs. If they are readable and [the analysis tool] can do those searches, it makes it a lot easier. But if not, it's going to be a bit trickier, or it will then limit some aspects of the analysis. (P14)
Data storage spaces were needed for backup copies and while actively working with the data, sometimes across organizations. Some were uncertain about the suitability (e.g., security) of specific storage spaces. One participant reflected on a better way to access and work with large video files stored on servers with a low bandwidth. Thus, functionality of the tools and data storage spaces in relation to the task was important.
Self-created codebooks and instructions contained information about how to collect or analyze the research data. For example, one participant said that a member of their research group, who was an expert in a specific research domain, wrote instructions for others about what to look for in the analysis. Version management and detailed notes about the decisions made were also important.
Furthermore, participants talked about skills needed in task performance and division of labor in research groups. A participant said that within the group, they combined the skills of qualitative and quantitative oriented researchers when collecting and analyzing the research data. Another participant said that the lack of technical skills needed was overcome by hiring a person who had the know-how required for the task.

| Information about rules and norms
First, participants' articulations were related to wider ethical discussions in scientific research. For example, the articulations concerned encountering differing opinions in a research community about ethical use of research data, and balancing between the freedom of science, the rights of social media users and the rights of social media companies when using social media data for research. The articulations also concerned changes in social media platforms and how they create new ethical dilemmas for using social media data for research purposes (e.g., which methods are allowed in gathering the data), and how the changing norms regarding the ethical use of social media data caused uncertainty about whether previously gathered data can be reused for further research purposes.
Second, participants needed information about ethical and data protection procedures in conducting the research. The ethical principles for research were needed early on. For example, the research group of one participant chose a topic and study design where the ethical review of the research plan was not necessary.
Ethical issues were on the table in that sense that we didn't want such topics that would have crossed the threshold of sensitive data, that one would have needed to ask a permission from the ethical board, and in that sense, it affected the data construction early on. (P17)
Some reflected on how to inform and contact potential research participants from an ethical point of view and how to anonymize research data for publications or archiving. Some participants who would have liked to archive the social media data they collected had not been able to do so because of difficulties in anonymization. Information about procedures also included data management guidelines and the EU's GDPR requirements (e.g., how to follow them if the research data are stored outside the EU/EEA).
Third, participants talked about rules related to different actors (actors meaning research organizations, archives or data owners). Research organizations' rules concerned user rights to data collection and analysis tools, for example. Some of those who collected and used archival data emphasized the requirement of following the rules of the archives. Participants also needed information regarding terms and conditions (e.g., who owns the data), copyrights and quotation rights (e.g., when using images from research data as illustrations). In action research, there was a need to negotiate the rights for using the materials that the study participants had produced during workshops.

| Self-created task-supporting information
Participants created information (e.g., log entries, memos, research diaries) in relation to their research data for the purpose of supporting the research process. First, information was created to monitor the research process (keeping track of the steps made, planning future steps). Sometimes, the notes made for one's own use were more detailed than what was reported in the research publication (e.g., taking screen captures of the analysis tool settings used).
There are certain kinds of choices [in the analysis tool settings] that are also pretty relevant to report or to keep them at least that you can later decide what is important to report. (P18)
Participants also documented data management activities to remember what work stage they were in. These included version management and log entries about how research data files had been modified in the stages of preprocessing and analysis. Discussions and decisions in research groups were documented in the form of notes, memos or correspondence (emails, instant messages or comments in a word processor) and used as ways to go back to what was agreed or discussed. Furthermore, to-do lists were used to plan future steps. Sometimes the notes were very on point and written into the document to which the note applied (e.g., making notes to statistical software outputs).
I've had these distributions and I've printed them on paper […] then I write notes on them, like 'do a cross-tabulation between those' […] (P10)
Second, information was created to help with one's thinking process. This included analytical notes such as preliminary ideas and observations regarding the research data. For a participant who studied films, notetaking involved making analytical connections between the films and research literature.

| DISCUSSION
In this study, we took a task-based approach to examine what types of information are needed in media scholars' interactions with their research data in knowledge creation. We had two research questions. We focused on domain information types (RQ1) and task-solving information types (RQ2) needed during the task.
For RQ1, we identified two domain information types. Earlier research information refers to information about what is known or discussed based on earlier research regarding the phenomenon under study. Experience-based domain information is about understanding the world of the phenomenon under study because of personal experience or long-term interest in it. Together, they show that domain information was obtained across academic and non-academic contexts. This is an important finding given that more holistic approaches have been called for, where information activities and needs are examined from the point of view of tasks without creating an artificial division between work and leisure contexts (Savolainen, 2023).
Of the two domain information types, earlier research information is more traditional in the sense that it is acquired within formal academic education and research work, and it is better supported by information systems and services. Acquiring experience-based domain information may require long-term commitment to and personal engagement in the activities in question. This information type could be supported by facilitating meetings between domain experts and discussions around shared themes and experiences. These kinds of interactions have also been discussed from the viewpoints of informal learning interactions (Miller, 2015), networking (Falciani-White, 2017), and interactions with information intermediaries (Pontis et al., 2017). Furthermore, the finding of experience-based domain information is related to the broader discussion of how researchers' own lived experiences are present in their knowledge-creation process and interactions with research data (Berg, 2008). This would be a topic for future research.
For RQ2, we identified three task-solving information types: (1) information about methods and tools for data collection and analysis, (2) information about rules and norms regarding research ethics, data protection procedures and rules related to different actors, and (3) self-created task-supporting information that is created and used to monitor the research process or to support one's thinking process.
First, regarding information about methods and tools, Hucka and Graham (2018) similarly reported that scientists and engineers sought information, for example, by searching the web, asking colleagues or searching scientific literature to find what software others used in similar contexts. Our research adds to this by showing that the need for information extended beyond finding specific tools. Information was also needed to help apply the methods and tools in one's own research. This aligns with procedure applicability in Choi et al. (2022) and suggests that topically relevant method or how-to information alone may not be helpful in solving the task, as Rutter et al. (2019) similarly observed in their study of solving everyday how-to technical tasks. Furthermore, although Kern and Hienert (2018) identified that social scientists needed information about methods and tools for data collection and analysis, their research methods (diary, questionnaires) were more structured and lacked qualitative depth.
Participants in this research also talked about using familiar methods and tools, combining the skills of research group members, or using good enough options available for data gathering and analysis. Similarly, Hemphill et al. (2021) concluded that social media researchers benefited from familiarity with different tools and technical skills when working with research data. On the other hand, unawareness of relevant information sources is a barrier to selecting sources and accessing information (Savolainen, 2015). Our study suggests that, similarly, unawareness of suitable methods and tools limits the options when selecting them. Furthermore, not knowing the characteristics of the tools and whether they are accessible across organizations may cause friction in the work process and may require workarounds to overcome it.
The second task-solving information type was information about rules and norms. The way participants talked about rules and norms indicated their use as task-solving information in research tasks. In contrast, Byström (1999, p. 46) and Byström and Järvelin (1995) categorized laws as domain information. Understood together, this suggests that in different tasks rules and norms could be regarded as either domain or task-solving information.
The changing regulatory landscape was also reflected in the articulations of participants. This was partly because the interviews were conducted at a time when the EU's GDPR (2016/679) had recently come into force, and the research organizations' guidelines about the GDPR had not all been updated. The requirements and guidance regarding data management plans had also been updated not long ago. In consequence, some participants discussed how data management requirements and guidance have changed since they gathered their research data or expressed uncertainty regarding suitable data storage spaces. Second, participants talked about their own reflections on what kind of research use is ethical. The Cambridge Analytica scandal (see, e.g., Venturini & Rogers, 2019) was mentioned as an example of how norms surrounding the access and use of social media data for research purposes have changed. These articulations indicate that, although ethical and data protection information are needed in planning the research, questions related to them may also arise in later stages.
The third task-solving information type was self-created task-supporting information. This type of information was created and used to monitor the research process and to support analytical and reflective thinking, and it was documented in different forms (in notes, research diaries and other task-supporting documents). This finding has implications for supporting data interaction and information sharing. First, researchers need data management skills (Poole & Garwood, 2020) and could also benefit from metacognitive support during task monitoring (see Järvelin et al., 2015). However, their analytical and reflective thinking during data interaction could also be better supported. This could be done by designing tools that support exploration, discovery, and insights during the data interaction. For example, Mosconi et al. (2023) designed a data story tool that supported not only data management but also analytical insights, conversations, and peer learning. Second, the various task-supporting documents created during data interaction could be important sources for sharing information with others, not just about how the research was conducted but also about the intentions and rationales behind it, which may be important information to those who consider reusing the data for their own research purposes (Yakel et al., 2022). The task-supporting documents could also contribute to the creation of paradata, "information about the means (procedures, tools, activities) by which a certain body of information came into being" (Sköld et al., 2022). Furthermore, Huvila and Sinnamon (2022) discovered that one barrier to sharing method information was the difficulty in describing and conceptualizing one's own research process to others. This further highlights the importance of supporting researchers' analytical and reflective thinking and the creation of task-supporting documents during data interaction. In future, it would be useful to examine how well the current tools for note-taking and other documentation are integrated into the research processes and whether the creation of task-supporting information could be better supported in ways that benefit the researchers themselves as well as data reusers. Taking better care of the task-supporting information created could also add value to the scientific work, for example, by turning the task-supporting documents into assets and providing links between them, research data and publications (see Pinel, 2021).
Regarding limitations of this study, it is possible that when media scholars talked about information they needed, they highlighted instances involving barriers in the research process. Difficulties are often easier to recall than straightforward workflows because dealing with them requires cognitive effort. Nevertheless, Bron et al. (2016) similarly reported that media scholars mentioned challenges related to methods, tools, rules and norms. In future, it would be useful to investigate what domain and task-solving information types are needed in different phases of the data interaction by studying the research processes as they progress.
When creating new research data services for researchers, it is important to understand holistically what kinds of information types are needed when working with research data. This is especially important in the field of media studies because the data used includes multiple types and formats. Further, this research brings new knowledge from the scholars' viewpoint and makes sense of their work with research data. Typically, in information studies information is regarded only as task information, or topical information about the subject matter of the task. However, understanding how domain information and task-solving information are present in the researchers' work is equally important, as they guide the task performance process and frame the whole research process.

| CONCLUSION
We examined what types of domain and task-solving information media scholars needed while interacting with research data to create new knowledge. We identified two domain information types: earlier research information and experience-based domain information. We also identified three task-solving information types: information about methods and tools, information about rules and norms and self-created task-supporting information. The results show that the range of domain and task-solving information types needed in data interaction is broader than what has often been the focus of research in information science. Hence, it is important to take into account researchers' point of view to understand their research processes holistically. The research deepens understanding of domain and task-solving information and provides suggestions of how the results could be used for supporting researchers' data interaction.
It is not just writing about the film or […] what you see in the film […] I also like to do a lot of notes when I read […] I might write the name of a film in a note and [how what was read] fits to this and that [film] sequence […] (P24)
Participants also wrote reflective notes about their research and their position in relation to the research. These notes helped reporting the study in a transparent way and worked as a reminder about useful observations or questions.
I have written in a research diary […] I think a lot about my position and reflect on what this topic evokes in me and why this interests me. […] [the writing] is actually a tool for that when I need to think about the ethics and my own position and […] the transparency […] then I have it stored somewhere. (P3)