“How much change do you get from 40$?” – Analyzing and addressing failed questions on social Q&A


Abstract

Online question-answering (Q&A) services are becoming increasingly popular among information seekers. While online Q&A services encompass both virtual reference services (VRS) and social Q&A (SQA), SQA services such as Yahoo! Answers and WikiAnswers have experienced more success in reaching the masses and leveraging their participation. However, the large volume of content on some of the more popular SQA sites means that participants cannot answer some posted questions adequately, or even at all. To reduce this latter category of questions that never receive an answer, the current paper explores reasons why fact-based questions fail on a specific Q&A service. For this exploration and analysis, thousands of failed questions were collected from Yahoo! Answers, from which only fact-based, information-seeking questions were retained, while opinion/advice-seeking questions were discarded. A typology was then created, using a grounded theory approach, to code the reasons for failure of these questions. Using this typology, suggestions are proposed for how the questions could be restructured or redirected to another Q&A service (possibly a VRS) so that users would have a better chance of receiving an answer.

INTRODUCTION

The popularity of seeking information via online resources has radically increased since the invention and widespread adoption of the Internet and World Wide Web (Marchionini, 1995). This information-seeking process often begins with submitting a few keywords to a search engine. In more recent years, however, an increasing number of people pose information requests as natural language questions to a community of fellow information seekers/providers in social Q&A (SQA) sites (Shah, Oh, & Oh, 2009). This approach has two immediate benefits. First, information seekers do not have to formulate a search strategy. Second, provided the answerer(s) understand the meaning of the question, they can provide the seeker with information personalized to his or her need. A clear distinction can be made between different types of SQA sites based on the designated roles of community members and how the question-answering process is structured. In addition to SQA platforms, libraries have also transformed in response to user needs in online environments by offering a range of virtual reference services (VRS). While SQA and VRS have evolved separately, it is often argued that there is a need to understand their similarities and differences and, most importantly, the lessons that could be learned from each to help inform the other, thus providing better service to end-users (Kitzie & Shah, 2011).

One drawback to using SQA services is that there is no guarantee that a question will be answered, unlike VRS, in which trained librarians have the capacity to elicit necessary information from the patron during the reference interview and to address questions that require research expertise. In these cases, the implications of comparing these services become clear. If a question fails within one platform, perhaps it could be referred to another, more suitable one so the asker has a better chance of receiving an answer. At the very least, research examining why some questions fail while others succeed could assist askers in structuring a question appropriate to their Q&A platform of choice; therefore, this paper will develop archetypes for failed questions, defined here as questions that do not receive an answer, in order to determine their commonalities.

First a brief outline of SQA services will be provided, followed by a description of how data were obtained for this study and how the resultant typology used to classify failed information-seeking questions was developed. Then the method through which questions were coded using the developed typology will be outlined. Finally, a discussion of findings and possible solutions to address failed information-seeking questions, including a focus on bridging Q&A services as one possible solution, will conclude the paper.

BACKGROUND

SQA services are community-based and purposefully designed to support people who wish to ask and answer questions while interacting with one another online. Unlike VRS, where professional librarians and staff answer questions or provide resources in response to user needs, SQA services crowdsource answers from the general public. People pose questions to SQA services and expect to receive answers from someone who knows something related to the questions, allowing everyone to benefit from the collective wisdom of many. In essence, SQA enables people to collaborate by sharing and distributing information among fellow users and by making the entire process and product publicly available. This platform encourages users to participate in various support activities beyond asking and answering, including commenting on questions and answers, rating the quality of answers, and voting on the best answers. Within the past few years, various types of SQA services have been introduced to the public, and researchers have begun to show interest in the information-seeking behaviors manifested within these contexts. The most popular examples of SQA include Yahoo! Answers (http://answers.yahoo.com/) and AnswerBag (http://www.answerbag.com/). The advantages of this approach are the low cost (most services are free), quick turnaround due to large community participation, and easy build-up of social capital.1 On the other hand, there is typically no guarantee regarding the quality of the answers; the asker is simply relying on the wisdom of the crowd.

While services such as Yahoo! Answers offer peer-to-peer question answering that constitutes a one-to-one relationship between the question and each individual answer, there are also services that facilitate discussion among peers for questioning and answering. Such services allow a group of users to participate in solving a problem posed by a user of that community. This approach encourages peers to hold a discussion among themselves, often redefining the information provided, rather than simply trying to answer the original question. WikiAnswers (wiki.answers.com) is a good example of this approach. The advantage of this approach is that the asker often gets more comprehensive information that includes not only the answer, but also the opinions and views of the participants. In a way, this approach can be seen as a hybrid of Yahoo! Answers and Wikipedia. The disadvantage is that not many questions require discussion-based answers. In addition, with peer-to-peer services such as Yahoo! Answers, users can go back and forth between asking questions and leaving answers or comments, thus inducing an implicit discussion (Gazan, 2010).

LITERATURE REVIEW

A significant body of literature exists to categorize types of questions addressed by different online Q&A services. Harper, Moy, and Konstan (2009) distinguish informational questions from conversational questions in order to investigate their level of archival value by exploring the use of machine learning techniques to automatically classify questions. They argue that informational questions are more likely to solicit information that the asker may learn or use, whereas conversational questions are meant to stimulate discussion in order to obtain other people's opinions or to perform acts of self-expression. Kim, Oh, and Oh (2007) investigated criteria that questioners may employ in selecting the best answer to a given question. They also studied how the types of questions that users ask correlate to these criteria within Yahoo! Answers, finding that affective characteristics, such as answerer politeness, tend to matter more for conversational questions, while traditional relevance theory-based characteristics, such as quality and topicality, apply more to informational questions (Kim, Oh, & Oh, 2007). Their study of 465 queries found opinion-based questions (39%) to be most frequent, followed by information-seeking questions (35%) and suggestion-based questions (23%). This finding indicates that conversational questions that seek opinions or suggestions are generated more often than informational questions in Yahoo! Answers. This observation is also reflected in the data examined within this paper, which indicates that approximately a third of the total failed questions sampled were informational in nature. This finding may hold for other SQA services, and the generalizability of these results should be examined in future work.

Morris, Teevan, and Panovich (2010) focus on online social networking tools such as Twitter and Facebook to investigate both the frequency of types of questions asked and respondents' motivations for using their social networks to post them. Their analysis of 249 questions found that the most popular questions generated on the social networking sites are recommendation and opinion-seeking questions, followed by factual knowledge-seeking, rhetorical, invitation, favor, and social connection questions. Factual knowledge-seeking questions were characterized as informational questions, while the other types (i.e., recommendation and opinion-seeking, rhetorical, invitation, favor, and social connection questions) are more likely to be characterized as conversational questions. Overall, it appears that online social networking tools and SQA services foster a sense of community manifested by self-expression, leading to a prominence of conversational questions within the respective platforms.

Unlike SQA platforms, VRS tend to elicit a high number of fact-based, informational questions, or subject-related questions, since most users want to receive guidance on searching and information related to the librarians' expertise, rather than their personal opinions or thoughts. A study by Arnold and Kaske (2005) found that 41% of questions from chat reference transactions conducted in an academic library were related to library policies or procedural information, followed by subject search (23%), holding or known-item search (16%), ready reference (14%), and directional questions (6%). This finding illustrates that most questions posed in VRS are related to information seeking, and are not opinion-based or advice-based questions.

Based on findings from past literature, it appears that a dichotomy of question types exists within online Q&A services based on the platform; SQA tends to facilitate the dissemination of conversational questions, while VRS trends toward informational questions. It also is interesting to note that some studies have found that VRS questions consist of about one-third ready-reference and another third address such matters as library procedures, suggesting that VRS librarians with subject expertise might have the capacity to undertake more complex information-based questions if given the opportunity (Connaway & Radford, 2011).

METHOD

This section describes how a large corpus of SQA questions was obtained from Yahoo! Answers, and the process used to first identify information-seeking questions (as opposed to opinion/advice-seeking questions) and second to identify a subset of these that failed to obtain any answers. In addition, the development of the taxonomy of reasons for these failures and subsequent coding procedures will be described.

Data collection from Yahoo! Answers

The Yahoo! Search API (Application Programming Interface) for Yahoo! Answers was used to collect data. This collection process queried each of the 25 Yahoo! Answers categories for unresolved questions posted from November 2011 to March 2012. Since an unresolved question can have an answer, or even multiple answers, that simply have not satisfied the end-user, unresolved status by itself does not signify a failed question. The research team selected questions that had been posted for longer periods of time (about 2-3 months), reflecting the researchers' hypothesis that if a question was posted a long time ago and was still not resolved, then it was likely to have failed. This also reflects findings from a concordant study of self-reported SQA users, the majority of whom reported waiting no longer than a week (n=84; after a day = 42%, after a week = 48%) for a satisfactory answer (Kitzie, Shah, & Choi, 2012).

A total of 13,867 such questions were collected, and those that had received answers were eliminated. The remaining 4,638 questions (about 33%) were those with zero answers and thus were considered to have “failed.”
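The selection logic just described can be sketched as follows. This is a minimal illustration in Python, assuming the unresolved questions have already been retrieved into simple records; the field names and collection date below are hypothetical, not the study's actual data structures (the Yahoo! Answers API used for collection has since been retired).

```python
from datetime import datetime, timedelta

# Hypothetical record format for a collected question; the real API fields differed.
# Each record: {"id": str, "posted": datetime, "num_answers": int}

def select_failed_questions(questions, collected_at, min_age_days=60):
    """Keep only questions that are old enough (about 2-3 months) and have zero answers."""
    failed = []
    for q in questions:
        old_enough = (collected_at - q["posted"]) >= timedelta(days=min_age_days)
        if old_enough and q["num_answers"] == 0:
            failed.append(q)
    return failed

# Toy usage: only question "1" is both old enough and unanswered.
if __name__ == "__main__":
    collected_at = datetime(2012, 3, 31)
    sample = [
        {"id": "1", "posted": datetime(2011, 12, 1), "num_answers": 0},
        {"id": "2", "posted": datetime(2012, 3, 20), "num_answers": 0},   # too recent
        {"id": "3", "posted": datetime(2011, 11, 15), "num_answers": 2},  # already answered
    ]
    print([q["id"] for q in select_failed_questions(sample, collected_at)])  # -> ['1']
```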

Identifying information-seeking questions

A review of several opinion- and/or advice-seeking questions from the dataset revealed that such questions are often inherently difficult to answer (e.g., “Is happiness a choice?”). Since the end goal of the research here was to be able to address failed questions, a decision was made to consider only questions of an information-seeking nature. Information-seeking questions were chosen for analysis since they were judged to be the kind of questions that consistently had the potential to receive a relevant answer. What follows is an explanation of how question types were identified and information-seeking questions were selected.

Yahoo! Answers question-answer sets generated from November 2011 to March 2012 were examined and those determined to have failed (i.e., those generating no answers) were coded as belonging to four types, building upon the work of Harper et al. (2009) as shown in Table 1.

Table 1. Types of failed questions from Yahoo! Answers.
Question Type | Example
Information-Seeking | How much is a large shamrock shake?
Advice-Seeking | Would an electric supercharger work on a 1.8L engine and provide noticeable differences?
Opinion-Seeking | Nokia N9 or Motorola Droid Razr?
Self-Expression | Bureaucracy is a necessary evil. Discuss?

Most of the resultant failed questions were found to be advice- and/or opinion-seeking, with only about 5% of questions coded as information-seeking. This process ultimately resulted in 200 questions. The distribution of these questions and of the overall data collection along the categories provided by Yahoo!2 can be seen in Table 2.

CODING

One trained graduate student assisted the authors in identifying the final set of 200 failed information-seeking questions out of all types of failed questions extracted from Yahoo! Answers. The coding scheme and resultant definitions described in the following section were then developed and used to train two coders to classify questions according to the resultant typology. Also, since qualitative content analysis allows the researcher to assign a unit of text to more than one category (Tesch, 1990), the potential for each question to be assigned more than one code existed. Therefore, the coders were instructed to focus not only on the main attribute of failure for each question, but also on the other, minor attributes that might slightly influence why some information-seeking questions failed.

Both coders first analyzed five questions together in order to refine a congruent approach to treating each question, and later coded 20% of the overall sample separately to ensure consistency. At this stage, the coders coded 40 questions, with an initial agreement rate for the main attribute of 72.5%. After the first round of coding, the coders discussed the observed mismatches between characterizations in order to further increase consistency before moving forward. During the second round of coding, the coders separately coded the remaining questions, with a resulting agreement that was still below 80%. This led to another discussion to assess the clarity and comprehensiveness of the rules and instructions developed in the typology. When the researchers examined the assigned codes side by side, it was observed that the main issue during the second round of coding was that the coders tended to split between the “ambiguity” category and the “too complex, overly broad” category; for this reason, the researchers revisited the coding scheme to explicitly establish how to differentiate between these categories and reviewed it with the coders. The coders then recoded all of the questions that contained the issue mentioned above, and the researchers derived the results of the inter-coder agreement analysis (Kappa = 0.913, p < 0.001) for the main attribute of why some information-seeking questions failed to receive any answers. The Cohen's kappa reported here is especially strong considering the exploratory nature of this work.
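As a point of reference, Cohen's kappa can be computed as in the following sketch. The labels are toy data for illustration only, not the study's actual coding results, and scikit-learn is assumed to be available.

```python
from sklearn.metrics import cohen_kappa_score

# Toy example: main-attribute codes assigned by two coders to ten questions.
# Labels stand in for typology categories (e.g., "ambiguity", "complex", "relatedness").
coder_a = ["complex", "ambiguity", "complex", "relatedness", "complex",
           "lack_info", "prank", "complex", "ambiguity", "relatedness"]
coder_b = ["complex", "ambiguity", "complex", "relatedness", "complex",
           "lack_info", "prank", "complex", "complex", "relatedness"]

# Kappa corrects raw agreement (here 9/10) for the agreement expected by chance.
kappa = cohen_kappa_score(coder_a, coder_b)
print(round(kappa, 3))
```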

DEVELOPING A TYPOLOGY FOR CODING FAILED QUESTIONS

Content analysis of 200 questions from Yahoo! Answers was conducted by a team of two coders in order to develop a typology for determining why informational questions failed to get answers. No past work identified in the literature review has attempted to characterize failed informational questions within an SQA context. For this reason, the analysis followed the constant comparative method (Glaser & Strauss, 1967) to identify the characteristics of failed questions, based on deductions informed by related literature, prior expertise, and simple human judgment, and to develop a concordant typology based on their theoretical relevance. All 200 questions underwent repeated and close readings in order to identify as many characteristics of failed questions as possible (Burnard, 1991; Burnard, 1996; Hsieh & Shannon, 2005). After reaching saturation on an initial list of characteristics, the list was sorted into broader, higher-order categories (Burnard, 1991; Downe-Wamboldt, 1992; Dey, 1993). Table 3 lists the four main categories and nine subcategories of failed informational questions with definitions.

Table 3 depicts each of the four major categories along with descriptions of their sub-categories and examples from the Yahoo! Answers set of 200 failed questions. Note that within Yahoo! Answers there are two components of a question: the title (required) and the content (optional). The content component is intended for the asker to post additional contextual information relevant to the title question to assist potential respondents.

Table 2. Questions by Yahoo! Answers categories.
Category | Questions with #Ans=0 | Questions with #Ans>0 | Info-seeking questions
Arts & Humanities | 131 | 209 | 15
Beauty & Style | 287 | 51 | 6
Business & Finance | 124 | 251 | 23
Cars & Transportation | 58 | 408 | 12
Computers & Internet | 223 | 133 | 11
Consumer Electronics | 196 | 170 | 7
Dining Out | 119 | 461 | 16
Education & Reference | 242 | 60 | 13
Entertainment & Music | 212 | 103 | 3
Environment | 65 | 658 | 12
Family & Relationships | 197 | 120 | 0
Food & Drink | 53 | 616 | 8
Games & Recreation | 209 | 110 | 8
Health | 138 | 52 | 0
Home & Garden | 123 | 1160 | 8
Local Businesses | 310 | 540 | 20
News & Events | 68 | 1063 | 9
Pets | 128 | 495 | 4
Politics & Government | 265 | 254 | 6
Pregnancy & Parenting | 196 | 323 | 0
Science & Mathematics | 313 | 220 | 6
Social Science | 265 | 174 | 1
Society & Culture | 392 | 213 | 1
Sports | 231 | 458 | 8
Travel | 93 | 927 | 3
Total | 4638 | 9229 | 200

1. Unclear

This category encompasses questions that lack semantic and/or structural clarity. The main criteria for a question to be labeled unclear are that the asker does not provide enough content, a coherent structure, and/or context for the person answering to correctly interpret the asker's information need.

Ambiguity

Questions coded into the ambiguity sub-category do not contain a coherent and/or clear manifestation of the asker's information need. These questions may also lack information; however, in some cases too much information hampers the implied directive of the question.

The following example3 illustrates how ambiguity negatively affects understanding the asker's information need:

How much change do you get from 40$?

This question falls into the ambiguity category since it can be interpreted multiple ways. Although the question is coherent, the asker's intended information-seeking goal remains unclear. Specifically, the change could be in various denominations of US currency, and the reader has not been given information as to how much, or whether any, money will be deducted from the $40 before the change is given. This category differs from, but is not mutually exclusive with, the lack of information category since, as the example shows, the question clearly lacks information necessary to inform an answer. However, this lack of information is a result of the ambiguity of the question, owing to the muddled nature of the asker's intent, which signifies the importance of including secondary effects within the coding scheme. The next section provides further discussion of the non-mutual exclusivity of these categories.

Lack of information

This sub-category entails questions that lack enough semantic clues to inform the reader of the asker's information need, as can be seen in this sample question:

How much would transmission swap cost?

In order to answer this question, the reader requires additional information, such as the type of machine on which a transmission swap needs to be performed. This question also contains structural incongruities, which might impair the reader's ability to interpret its meaning.

Poor syntax

A question containing poor syntax has enough structural errors to mask the asker's intent, even if the actual meaning has been included. The following example illustrates how ill-formed syntax, grammatical errors, or Internet slang discourage the interpretation of what the asker intends to ask:

How much % of computer user in India?

Even though readers receive some informative factors (e.g., amount, subject, location) that may indicate what the asker is attempting to ask, the way these factors are combined obscures the asker's information-seeking goal, making the question cumbersome to interpret. This type of failure signifies that when questions are ill formed and/or contain grammatical errors, it is more difficult for potential respondents to interpret the asker's intent and provide a response, even though some informative factors are provided.

2. Complex

Complex questions differ from unclear questions in that they provide enough information and are structurally sound enough to equip the reader to answer the question. However, they contain and/or demand an excessive amount of information to be exchanged, which can deter a response due to the implied amount of effort required to provide a quality answer.

Table 3. Typology for failed informational questions.

Too complex, and/or overly broad

Unlike questions labeled “broad” within the ambiguity category, questions that are too complex or overly broad contain information that can lead to the creation of a semantically sound answer. This sub-category encompasses questions that indicate to potential respondents that excessive effort is required in order to craft a relevant answer. Alternatively, the question could be less demanding but so specialized that only a small number of people would be able to answer it, so the probability of receiving an answer is low. An example of a question that is too complicated and covers topics that are overly broad is:

Title: “What were the effects of slavery as an institution in Frederick Douglas Narrative of the life of frederick dou?” Content: “I understand what the Narrative of Frederick Douglas was about but to me it seemed more like he was just telling his life story rather than showing the effects of slavery.”

While the question makes semantic sense, it implies that the answer requires extensive research since the topic, Slavery, is such a vast one.4 Such an implied cognitive load/demand5 could discourage a reader from providing an answer.

Excessive information

This sub-category includes questions that differ from those coded as too complex and overly broad in that they make an unnecessary cognitive demand on the potential respondent: the reader must identify the relevant information necessary for answering the question before he/she can even determine whether it can be answered. A question could certainly contain excessive information and also be too complex and overly broad, which would require cognitive effort both to process and to address the question.

This example illustrates how an excessive amount of information in the question might cause potential respondents to lose attention or interest in the question:

Title: “What is this book called?

Content: “I remember some details about this book, though they're quite random and I'm not sure if this book is a series or not. This is what I remember: There was definetly vampires involved, they were considered friends to people and were celebrities and stuff. But there were also vampire hunters, who saw through the act. Vampire lovers wore a necklace with a bat on it. I'm pretty sure the main girl, who is a vampire hunter, tries to rescue her sister who was kidnapped by vampires. They were betrayed by their dad, but later on when the dad is on tv saying that he is looking for the main girl, he is using morse code by tapping his foot and telling her not to come home because the vampires put him up to that. That's all I remember, what book is this??

The protracted length of this question requires the reader to devote an uncommon amount of time and attention to addressing it, which might discourage a response. In addition, the random assortment of facts interspersed throughout this example can become cognitively overwhelming for the reader to process, as he/she must attempt to remember each individual piece of information and combine the pieces in a coherent manner to fully understand the information request.

3. Inappropriate

The content of inappropriate questions deviates from social norms and/or expectations inherent within the SQA community or society as a whole. In many cases, the implied objective of the question is not to fulfill an information need, but rather to elicit a reaction.

Socially awkward

Socially awkward questions are those where the aim of the asker may or may not be to fulfill a specific information need. In either case, the content of such questions does not adhere to the social norms fostered both by society and within the SQA community itself, often revolving around taboo and/or overly personal topics. An example of a socially awkward question can be seen in the following:

How much would blacksingles.xxx fetch for?

Although this question might reflect a genuine information need, the subject of the question, relating to a website extension (i.e., .xxx) for adult content, deters those who might otherwise respond, since answering could be perceived as condoning a violation of the standards entrenched in the publicly open environment of SQA.

Prank

While prank type questions also can be considered too personal or taboo, these questions differ in the aim of the asker, which is not to fulfill an information need, but rather to elicit a reaction from the community. To elicit such a reaction, the asker will purposefully broach a personal or taboo subject and/or make a joke.

The following example represents a typical question in this classification:

What color are your underwear?

This question does not seem to represent a true information need and also refers to a personal subject.

Sloths

Another form of question that may not violate a societal norm, but rather violates norms established within the SQA community, is the homework question. Gazan (2007) divided SQA users into two groups: seekers, who are active participants in the community, and sloths, who only post questions and expect answers. He predictably found that seekers were more likely to receive answers than sloths. This finding also suggests the importance of network ties in influencing whether a question fails, which should be explored in future study. An example of a question posed by a “sloth” can be seen in the following:

Title: “Question on Scent Post Studies?

Content: “Scent post studies

  • A.provide precise estimates of animal population trends
  • B.provide information on types of posts used by wildlife
  • C.should rarely be used because of the assumptions needed and false leads that they may provide.
  • D.provide useful information when used with other techniques”

This appears to be a verbatim homework question, in which the asker has put no effort into even beginning to find the answer independently. Gazan (2007) suggests that this perceived lack of effort is noticed by the community and influences the decision not to respond. Even though homework-related questions do not comprise a significant information-seeking question type in the sampled dataset, observations of this failure during coding were prevalent enough to suggest that further study should be conducted to determine the correlation between homework-based questions and non-response.

4. Multiple-Questions

This category differs from the complex category in that it deals with multiple questions posed as a single information need. These multiple questions can confuse the reader, since it might be difficult to identify the asker's information-seeking goal.

Relatedness

This sub-category concerns the relationship between a question's title and its content. Specifically, these questions exhibit a mismatch between the clarifying questions provided within the content and the question asked in the title. The following example illustrates how people may be confused when addressing content that contains more than one related question:

Title: “What is Warm Mix Asphalt?

Content: “I recently took a pavements class, and we visited an asphalt batch plant where the contractor kept touting the benefits of warm mix asphalt. It seems like he was talking about a pavement mix that had larger amounts of recycled product (RAP) that could be placed and made at a lower temperature. How long has industry been using this product successfully? Does it last as long as new pavement that is placed at higher temperature and contains less receycled material?

This example represents multiple information needs articulated under the guise of one question, “What is Warm Mix Asphalt?” Rather than clarify the question in the content copy, the asker instead deviates from the stated aim as indicated by the title by asking additional questions within the content section. This may confuse the reader as to which question to answer and/or overwhelm him/her due to the implied effort he/she must make to answer the question.

Un-relatedness

The un-relatedness category was assigned to multiple, unrelated questions as seen in the following example:

Title: “What does desktop replacement mean?

Content: “Does it mean changing from a desktop to laptop or does it mean desktop to more efficient desktops and/or laptops. In other words do you go from desktop to mobile?”

These questions only appear to be minimally related, which can confuse the reader by masking the intended information need of the asker.

FINDINGS

The main characteristics of the 200 failed questions were spread across the categories, with significant concentrations in the too complex, overly broad sub-category (n=68, 34%), followed by lack of information (n=28, 14%), relatedness (n=26, 13%), and ambiguity (n=21, 10.5%), while socially awkward (n=7, 3.5%), excessive information (n=4, 2%), and poor syntax (n=2, 1%) were less often the primary influence on failure within this sample. Table 4 presents the overall results of coding for the primary characteristics identified as reasons for failed questions.

The results indicate that a significant proportion of the failed questions in the sample were too complex and/or overly broad (n=68, 34%). This signifies that a lack of perceived effort on the asker's part to craft a coherent question may cause difficulties in its subsequent interpretation. Moreover, questions in this category often involve topics that are too complex and/or specialized, which few people could address. In terms of complexity, some questions require the people answering them to have professional, technical, or educational knowledge, information, or experience. Thus, it was found that these complicated questions might have fewer chances of soliciting any answers to satisfy the asker's information needs. In terms of narrowness, questions in this category sometimes ask about topics that are too narrow, relating to specific persons, materials, conditions, interests, or groups. These findings indicate that many failed questions actually contained enough information to be answered, but the implied amount of effort the reader must make to answer the question may have been perceived as too great. This finding suggests that the failure rate of fact-based questions could be reduced by suggesting that the asker revise his/her question, perhaps by breaking it up into a series of simpler questions or by narrowing the scope of the original question. Another suggestion would be to redirect the asker to a VR service, where a librarian with more comprehensive expertise would be able to address the complexity of the full question. Further study would have to be performed to determine whether this result is generalizable to all Yahoo! Answers failed questions.

Table 4. Characteristics (or attributes) of failed questions.
Category | Number of questions (%)
Unclear |
  Ambiguity | 21 (10.5%)
  Lack of information | 28 (14.0%)
  Poor syntax | 2 (1.0%)
Complex |
  Too complex, overly broad | 68 (34.0%)
  Excessive information | 4 (2.0%)
Inappropriate |
  Socially awkward | 7 (3.5%)
  Prank | 12 (6.0%)
  Sloths | 1 (0.5%)
Multiple questions |
  Relatedness | 26 (13.0%)
  Un-relatedness | 17 (8.5%)
Other | 14 (7.0%)
Total | 200 (100.0%)

The second most significant attribute of failure is lack of information (n=28, 14%). It is unsurprising that people may have a hard time interpreting questions without proper information, since inadequate information increases the chance of potential respondents misinterpreting the asker's intent. Thus, a lack of information when asking questions does not strategically facilitate what the asker needs, since it does not properly communicate his/her true information-seeking goal; in addition, questions lacking information often discourage responses, as they can be perceived by potential respondents as being too complicated to address. One way in which these questions could be addressed is through encouraging micro-collaborations or discussion-based collaboration (e.g., the WikiAnswers model), in which potential respondents could identify the missing information necessary to answer the question, soliciting the asker for further feedback and clarification. This could also assist in maintaining community participation by providing the asker with timely feedback, rather than an extended period of waiting with the potential to receive an inadequate answer to a question that has not been fully fleshed out.

The third most significant attribute of failed questions is that multiple related questions are posed within the body of a single question (n=26, 13%), which may cause confusion regarding the asker's intended information-seeking goal. Unlike lack of information, multiple related questions represent the asker's desire to clarify what he/she is looking for in order to satisfy his/her information need. However, it seems that asking more than one question reduces the likelihood that people will respond, since they must address each question and attempt to translate all of the questions into a single information need; therefore, even if all of the questions are somehow related and intended to provide enough information to explicate the asker's information need, multiple questions may conversely impair this understanding. A simple way for an SQA platform to address this issue mechanically, as sketched below, would be to enable machine identification of any content containing multiple questions and to generate system feedback encouraging the user to reformulate the post so that it contains only one question. However, further research should be performed to determine whether the majority of information-seeking questions containing multiple questions fail, or whether it is the other characteristics of such questions identified within the typology that lead to failure, especially since the observed failure rate of un-related multiple questions was so low within the sample.
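A rough sketch of such a check appears below; the heuristic (counting sentence-like segments that end with a question mark in the title and content fields) is an illustrative assumption, not a validated detector from this study.

```python
import re

def count_questions(text):
    """Crude heuristic: count sentence-like segments that end with a question mark."""
    segments = re.split(r"(?<=[.?!])\s+", text.strip())
    return sum(1 for s in segments if s.endswith("?"))

def reformulation_prompt(title, content):
    """Return a feedback message if the post appears to bundle multiple questions."""
    total = count_questions(title) + count_questions(content)
    if total > 1:
        return ("Your post seems to contain several questions. "
                "Consider asking one question at a time so that answerers "
                "can focus on a single information need.")
    return None

# Example with a title/content pair resembling the Warm Mix Asphalt question above:
title = "What is Warm Mix Asphalt?"
content = ("How long has industry been using this product successfully? "
           "Does it last as long as new pavement placed at a higher temperature?")
print(reformulation_prompt(title, content))
```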

The fourth significant attribute of failed questions is ambiguity (n=21, 10.5%); questions that are too vague or too broad may cause misunderstanding regarding their meaning and/or foster multiple interpretations. These questions reveal that the lack of a coherent and/or clear manifestation of the asker's information need discourages responses, as people's murky understanding of what the asker is looking for impairs interpretation of the question. These questions would best be addressed in a reference environment, where librarians would be able to conduct a proper reference interview to assist the asker in translating an inchoate, unexpressed need into a conceptualized query (Taylor, 1968).

Compared to the four significant attributes mentioned above, other attributes (e.g., poor syntax, excessive information, socially awkward, prank, and un-related multiple questions) were shown to have a less significant effect. However, there are possible inferences to be drawn from these minor attributes of failure among information-seeking questions. First, we previously described inappropriate questions as those that deviate from social norms and/or expectations within SQA. Inappropriateness when asking questions on SQA is one of the less significant factors affecting failure to get answers, which implicitly suggests that users may be aware of which questioning behaviors are socially appropriate and normative on SQA. Second, we found that excessive information was also a less significant factor in failing to get responses. Morris et al. (2010) found that question length affects response helpfulness; questions with fewer sentences received more useful responses than questions with many sentences. However, our finding suggests that longer question phrasing containing excessive information does not hamper responses as much as expected, even though we argued that excessive information might impose an overloaded cognitive demand on respondents and thereby discourage potential respondents from providing answers. Last, it is interesting that poor syntax is the least frequently found attribute of failure among information-seeking questions. This result may reflect that people are familiar with Internet slang or shorthand expressions and/or are more tolerant of minor mistakes in writing. Another possible reason is that all questions were coded for their primary attribute: although many of the questions contain grammatical and/or spelling errors, the presence of these errors does not usually inhibit an answerer's understanding of the question.

Since final inter-coder reliability was high for the main attribute (90.5% agreement), and given the exploratory nature of the approach, secondary factors were not included in the initial coding scheme. However, it is important to note that many questions contain multiple attributes that might influence failed-question status. Consider the following question:

Where in their room Joy's bed or Mara's bed or Patricia's bed?

The example shows that failure could be attributed to the fact that the question is ambiguous, since Joy, Mara, and Patricia could each refer to multiple people. It also lacks information, which might be due to poor syntax, since a subject for the interrogative “Where” is not specified. Questions such as this were also likely to cause disagreement among the coders as to which code should be assigned. This multidimensionality inherent to some of the questions is to be expected, particularly since the categories are not mutually exclusive. However, disagreements over the prominence of attributes accounted for less than 10% of the coded questions. This signifies that the coders mostly identified the same predominant attribute of failure for each question in the sample.

In addition, some of the 200 questions did not fit any of the codes defined within the typology (n=14, 7%). These questions may have remained unanswered for other, non-textual reasons, such as their position within a ranked list of questions or the asker's decision to remove and/or close the question before getting a response. Alternatively, these questions might have contained other attributes that also contributed to their status as failed questions. For example:

What is a drop-in shelter?

This question intuitively appears to be easily answerable; however, it still failed. Whether non-textual features or secondary factors, such as the possibility that no users who viewed the question knew what a drop-in shelter is, caused this question to fail is an issue to be explored in future work.

CONCLUSION

Even though the study presented here provided insights into why some questions fail to receive responses on Yahoo! Answers, some questions remain to be addressed. For example, the findings on analyzing and addressing failed questions on social Q&A were derived only from information-seeking questions, so it is unknown whether these findings are generalizable to other question types (e.g., advice-seeking, opinion-seeking, etc.). However, understanding why questions fail could be the first step toward helping information seekers revise their questions and/or having such questions answered. Identifying what the problem is with a given question could allow systems to fix it, either by requesting that the asker revise the question, by expanding the question, and/or by diverting it to an appropriate person (subject expert) or service. Since there is little in the literature addressing failed questions in VRS or SQA, this typology could be used to better understand why some questions fail. Testing the typology presented here with VR questions could help experts (librarians) better assist end-users by identifying when it is appropriate to clarify questions and how to do so.

One application of the exploratory study described here is to inform the work of librarians who must field complex and ambiguous reference questions on a daily basis. They employ a variety of techniques in order to translate the user's initial statement of an information need into a strong research query that returns relevant results. To accomplish this, librarians in face-to-face and virtual environments rely on a process of clarifying or negotiating the reference question (Ross, Nilsen, & Radford, 2009). This negotiation process is largely absent in SQA; however, a modification to these systems could perhaps be designed to help compensate for this absence or, minimally, to provide feedback that allows the user to construct (or reconstruct) a better question.

From the above exploration, the guiding hallmark of a “good” question appears to be one that is clear and complete enough to elicit a satisfactory answer relevant to the asker's information need; therefore, this research has the potential to assist reference librarians and other information professionals. If a list of characteristics of failed questions can be further developed and refined, methods to address them also might be suggested, so that reference librarians operating in virtual environments will be provided with a new vantage point for helping online users who may struggle to formulate appropriate and understandable questions.

Another application could be to incorporate relevance feedback within the SQA platform. Relevance feedback represents a self-directed “question negotiation” of sorts between the system and the user. Presented with a series of documents, the user is asked to identify which are most pertinent to addressing his/her information need, and, based on the characteristics of these documents extracted by the system, it returns similar results (Rocchio, 1971). Further development of the typology created within this paper could lead to an operational model defining textual attributes of a question correlated with its primary characteristic within the typology. From there, the system could pinpoint these characteristics and provide the user with iterative feedback on how to reformulate his/her question so as to eliminate the attributes that contribute to creating a failed question. Similar to relevance feedback for answers, the system could then provide a list of suggestions for reformulating the question, from which the user could choose, as a means by which to dynamically reinterpret the question and solicit feedback from the SQA community.
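For concreteness, the classic Rocchio (1971) update underlying this style of relevance feedback can be sketched as follows. The vector representation and the weights alpha, beta, and gamma are standard textbook defaults, not values proposed in this paper; an analogous loop over question attributes, rather than document term weights, is what the proposed reformulation feedback would require.

```python
import numpy as np

def rocchio_update(query_vec, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward documents judged relevant and away from non-relevant ones."""
    q = alpha * np.asarray(query_vec, dtype=float)
    if relevant:
        q = q + beta * np.mean(np.asarray(relevant, dtype=float), axis=0)
    if nonrelevant:
        q = q - gamma * np.mean(np.asarray(nonrelevant, dtype=float), axis=0)
    return q

# Toy term-weight vectors (e.g., tf-idf values over a three-term vocabulary).
query = [1.0, 0.0, 0.5]
relevant_docs = [[0.9, 0.1, 0.4], [0.8, 0.0, 0.6]]
nonrelevant_docs = [[0.0, 1.0, 0.2]]
print(rocchio_update(query, relevant_docs, nonrelevant_docs))
```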

Finally, an application of the work reported here could be found in enabling question routing. If one could identify not only the failure of a question but also the reason(s) behind the failure, and come up with suggestions to address it, another service could potentially be identified that may be more appropriate for dealing with such questions. For instance, if a question on Yahoo! Answers fails due to missing information about its context and usage, one could divert that question to an appropriate VRS, where a trained information professional (e.g., a reference librarian) could help the asker revise the question to fill in the missing pieces and thus receive a quality answer. This work may help in creating synergistic solutions through collaborations among various online Q&A services from both the SQA and VRS domains.
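A hedged sketch of what such routing could look like, keyed to the typology's primary categories, is given below; the category-to-action mapping is illustrative only and was not validated in this study.

```python
# Hypothetical mapping from a failed question's primary category to a suggested action.
ROUTING_RULES = {
    "too_complex_overly_broad": "suggest splitting into simpler questions, or refer to VRS",
    "lack_of_information": "prompt the asker for missing context before re-posting",
    "ambiguity": "refer to VRS for a reference interview",
    "multiple_questions": "ask the asker to focus on a single question",
    "poor_syntax": "offer an automated rewording suggestion",
}

def route_failed_question(primary_category):
    """Return a suggested next step for a failed question, with a community fallback."""
    return ROUTING_RULES.get(primary_category, "leave in SQA and resurface to the community")

print(route_failed_question("ambiguity"))  # -> "refer to VRS for a reference interview"
```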

Acknowledgements

The work reported here was done under the research project “Cyber Synergy: Seeking Sustainability through Collaboration between Virtual Reference and Social Q&A Sites,” funded by the Institute of Museum and Library Services (IMLS), Rutgers, The State University of New Jersey, and OCLC, Inc.

Footnotes

1. Social capital refers to the value of social networks, bonding similar people and bridging between diverse people with norms of reciprocity (Dekker & Uslaner, 2001; Uslaner, 2001).

2. See http://answers.yahoo.com/dir/index to browse these 25 categories.

3. All examples are verbatim; errors in grammar and spelling have not been corrected.

4. Note the presence of inappropriate syntax in the title, which, while secondary to the main category (complex), still manages to muddle one's interpretation of the question's meaning.

5. Cognitive load in this context is defined as the processing and storage requirement imposed on one's short-term memory (Sweller, 1988). It is based on Miller's (1956) work, which showed that short-term memory is severely limited and that the goal of an instruction method or an interface should be to minimize demands on it.
