Evaluating the impact of storytelling in Facebook advertisements on wildlife conservation engagement: Lessons and challenges

Social media ad stories are widely used to grow engagement in wildlife conservation. Yet the benefits of different types of story character and content are unclear. In four video stories, we explored the impact of varying the type of protagonist species (Elephant and WildDog) and content about the role of humans in causing wildlife loss (Elephant + HumanAction and WildDog + HumanAction) using Facebook A/B split tests. Counter to prior perceptions that traditional charismatic flagships are more appealing, stories featuring wild dogs, with and without human-caused harm, elicited higher traffic to a conservation organization's donation website. Only one donation was received, from a user who saw the Elephant video. These results show that storytelling in social media ads, by choosing character and content, can help raise engagement. Yet the failure to raise funds, and the limitations arising from Facebook's opaque algorithms, underscore the need for greater experimentation to build knowledge about how to convert engagement into donations.


| INTRODUCTION
Conservation organizations increasingly use storytelling in social media advertisements to increase public engagement and donations to wildlife conservation (Toivonen et al., 2019;Waters & Jones, 2011). An estimated 3 billion people actively use social media and Facebook is the largest platform in the world at 2.4 billion users (Ortiz-Ospina, 2019). On these platforms, stories can be shared with target audiences through videos, photos, and articles, more rapidly and at a lower cost than traditional media. Effective stories can drive traffic to conservation websites, as users can actively engage with ad stories by clicking on web links, apart from liking and sharing them across friend networks.
So, how can organizations craft and evaluate effective social media ad stories? Which story characters and content are more beneficial? In seeking to answer these questions, this article connects literatures from storytelling for science and fundraising, conservation marketing, and environmental economics. While informational messages emphasize facts and evidence (Katz, 2013), stories follow a particular narrative structure that describes cause-and-effect relationships between events that take place over time and that impact characters in each context (Dahlstrom, 2014). Storytelling, therefore, is the act of crafting stories by choosing narrative elements like characters (e.g., victims and perpetrators) and content (e.g., causal explanations and the sequence of events in a plot). Stories can be easier to comprehend, more engaging, and can provoke stronger emotional responses (Bruner, 1991; Green & Brock, 2000; Merchant, Ford, & Sargeant, 2010). Yet storytelling is an under-utilized tool to engage nonexpert audiences with complex environmental challenges like wildlife conservation and climate change (Dahlstrom, 2014; Moezzi, Janda, & Rotmann, 2017; Mitchell & Clark, 2021).
A small literature examines whether storytelling in charitable giving appeals and green product advertisements increases donation and purchase intentions, respectively (Merchant et al., 2010; Moezzi et al., 2017). Another related strand of literature examines whether movies affect the demand for wildlife (Megias, Anderson, Smith, & Veríssimo, 2017) and pro-environmental behavior like buying carbon offsets (Jacobsen, 2011). To our knowledge, whether narrative elements in stories, like characters and causal information, impact social media engagement and online donations has not been explored.
There is less research on the behavioral impact of different types of causes of wildlife loss. The evidence available from environmental economics and psychology shows that people have a higher stated willingness to pay for environmental problems caused by humans rather than natural events (Brown, Peterson, Brodersen, Ford, & Bell, 2005; Bulte, Gerking, List, & De Zeeuw, 2005; Kahneman, Ritov, Jacowitz, & Grant, 1993). This tendency has been termed the "outrage effect" because people report feeling more upset about human-caused ecological degradation (Kahneman et al., 1993). Related literature finds that belief in human-caused climate change predicts pro-environmental intentions (Milfont, Wilson, & Sibley, 2017). In the wildlife conservation context, Shreedhar and Mourato (2019) found participants donated more after watching short video stories causally linking wildlife loss to human action like poaching and habitat loss, compared to a control group omitting this causal information. They found that including human-caused harm in videos removed differences in donations to lions, a traditional charismatic flagship, compared to bats. Another study shows that media narratives attributing the cause of COVID-19 to the human destruction of nature increased support for pro-wildlife policies (Shreedhar & Mourato, 2020).
Taken together, these studies suggest that traditional charismatic flagships and human-caused harm to wildlife can increase willingness to support wildlife conservation (e.g., through policy support and intentions) and donations. While constituting an important evidence base, past studies are largely based on observational data from websites, administrative records and existing outreach like magazines (Clucas et al., 2008; Metrick & Weitzman, 1998; Verissimo et al., 2011); stated responses to contingent valuation and questionnaire-based surveys or discrete choice experiments (Bulte et al., 2005; Christie et al., 2006; Kontoleon & Swanson, 2003; Macdonald et al., 2015); or controlled laboratory and online experiments using participant panels who assent to participate in research (Shreedhar & Mourato, 2019; Thomas-Walters & Raihani, 2017). These studies do not focus on social media engagement via increasing traffic to a conservation organization's website (e.g., link clicks) or online donations, nor on the cost-effectiveness of different options. Recent evidence shows behaviors elicited in one context (e.g., in a research lab) need not predict choices in another (e.g., outside the laboratory) (Galizzi & Navarro-Martínez, 2018). The impact of stories featuring traditional flagship species and human action on wildlife conservation engagement and donations in the social media context remains unclear.
This article evaluates the impact of the type of story character and content in Facebook ads on driving traffic via link clicks to a donation website, and donations thereafter. In a pre-registered study conducted in collaboration with the African Wildlife Foundation (AWF), up to 123,563 users in the United States were exposed to one of four videos that varied the protagonist species and human-caused harm to wildlife: a traditional charismatic flagship African Elephant (Elephant), African Wild dog (WildDog), Elephant and human action (Elephant + HumanAction), and Wild dog and human action (WildDog + HumanAction). To the best of our knowledge, this is the first effort to compare how character and content in ad stories impact social media engagement and donations, and the cost-effectiveness of different ads. It adds to the scarce literature evaluating the behavioral impact of conservation campaigns (Batavia et al., 2018; Nelson, Schlüter, & Vance, 2018; Reddy et al., 2017; Veríssimo & Wan, 2019).
Based on the past literature (e.g., Shreedhar & Mourato, 2019), we explore the following: first, stories featuring a traditional charismatic flagship elicit more social media engagement and donations (Elephant > WildDog); second, stories stating that human actions led to the loss of wildlife increase engagement and donations (Elephant + HumanAction > Elephant and WildDog + HumanAction > WildDog), i.e., there is an outrage effect; and lastly, engagement and donations are similar between videos featuring a traditional charismatic flagship and another species when videos include human-caused harm to wildlife (Elephant + HumanAction = WildDog + HumanAction).

| METHODS
The aim was to examine which video story performed best by increasing traffic to AWF's donation webpage (by clicking the link on the post) and donations afterwards. Our benchmark research design was a 2 × 2 between-subjects randomized controlled experiment to randomly vary story character and content: protagonist species (Elephant, Wild dog) and human action (No, Yes). The study was pre-registered on the Open Science Framework (link).
To implement the experiment in a way that was closest to this benchmark, we ran A/B split tests in Facebook's Ads Manager via AWF's account. We chose Facebook since the organization wanted to reach a wider audience on this platform (around 72% of Americans are Facebook users; Ortiz-Ospina, 2019). The A/B split test functionality randomly assigns one ad from an ad set (of two or more options) to a user's feed. This allows organizations to pre-test their online campaigns and to optimize advertising expenditures by selecting the ad that elicits the highest click-through rate at the lowest cost. Facebook A/B split tests are like randomized controlled experiments because there is random exposure to manipulated variables at the user level. Therefore, the main advantages of this approach are that a large number of users are randomly exposed to ecologically valid ad stories in a real-world social media setting, and that the outcomes measured are actual behaviors (Orazi & Johnston, 2020).
We now describe the A/B split test design (see Supporting Information for more details). Since we wanted to drive people to AWF's donation webpage by clicking through a web link, we chose "traffic" as the campaign objective and specified the link type as "webpage" (Supporting Information Figure A3). We selected "creative" to enable A/B split testing of video content through four ads varying the two variables to be tested, that is, protagonist species and human-caused harm. Facebook's algorithms divided the target audience into "random, non-overlapping, and statistically comparable groups" and randomly exposed users to one video (Facebook, 2021a). We selected the target audience to be U.S. residents, of all genders, aged 25 years and over who were interested in wildlife, and the language as English. This audience was selected as the organization preferred not to solicit donations from those below 25, given their previous experience that older people were more engaged and likely to donate.
To ensure comparable sample size across treatments, we chose to split the advertisement budget equally across all four videos over the campaign duration by not selecting "Campaign Budget Optimization." As Facebook's algorithms default to spending more on promoting better performing advertisements, deselecting this option helps ensure an even split of the sample of users between each ad. We selected "link clicks" under the optimize ad delivery option, so Facebook could "deliver the ad to the people who are more likely to click on them." We selected the recommended time frame of seven days, which has also been employed by prior research (e.g., Matz, Kosinski, Nave, & Stillwell, 2018). We implemented the campaign from April 3 to 10, 2019. We followed a recommended budget allocation that guarantees a total reach of at least 100,000 users (Orazi & Johnston, 2020). Finally, since Facebook's power calculations "suggest the likelihood of a causal result" (based on the campaign objectives, budget and other [unspecified] factors specific to the test), we chose to have over 95% power to maximize the likelihood of a causal result (Facebook, 2021b).

| Storytelling through videos
We presented the treatments as videos because there is potential to scale their use, and few studies currently evaluate how video story content affects engagement and interest despite their rising popularity online (Knoll, 2016;Waters & Jones, 2011). Videos are estimated to constitute 82% of mobile traffic by 2020 (Facebook, 2021a). They can elicit more attention and emotional engagement compared to photos or text (Gross & Levenson, 1995;Teixeira, Wedel, & Pieters, 2012).
Each video story featured either the elephant or wild dog as the main character. We chose these species based on AWF's fundraising experience, prior research and because they are threatened by common types of human-caused harm. We wanted to check the benefits of continuing to rely on elephants in the context of Facebook advertisements. Elephants are AWF's iconic flagship species (e.g., it is in their logo) and are widely used in online and offline communications like emails and newsletters. We also wanted to trial how to raise the popularity of wild dogs since this species had received the lowest donations on the AWF website. Past research also consistently finds that elephants are one of the most charismatic flagships among western audiences (Albert et al., 2018; Clucas et al., 2008; Macdonald et al., 2015; Verissimo et al., 2011). For instance, in Macdonald et al. (2015)'s cross-country survey (including the United States), participants ranked elephants (wild dogs) as the second (46th) out of 100 mammalian species. Common types of human activity, namely unsustainable farming and encroachment into wild spaces, threaten both species in the same region (Blanc, 2008; Woodroffe & Sillero-Zubiri, 2020).
Thus, choosing these species enabled us to hold the human cause and region constant across the treatments while still testing the benefit of featuring traditional charismatic flagships and the outrage effect. However, there are differences in other attributes between both species which can also impact engagement. For instance, elephants are herbivores and are classified as vulnerable by the IUCN, whereas wild dogs are carnivores and are classified as endangered. We do not control these differences in the videos but assume that prior knowledge about endangerment and preferences over species and their attributes are distributed in a similar way across users in all treatments. That said, we cannot rule out the possibility that differences between outcomes reflect the influence of such factors as well. Figure 1 presents a snapshot of the elephant stories (see Supporting Information Figure A5 for wild dog stories). We selected the first photo to feature one elephant calf and wild dog pup since prior work shows that people pay more attention to baby animal pictures (Nittono, Fukushima, Yano, & Moriya, 2012;Yoshikawa, Nittono, & Masaki, 2020). In the Elephant and WildDog treatments, the second photo featured a small group of animals. In the Elephant + HumanAction and WildDog + HumanAction treatments, the second photo featured one animal in focus set against the backdrop of a road (and other out-of-focus animals) with one line of text to indicate how human activity harms the species. We selected a relatively neutral image of the animal against the backdrop of a road to avoid negative emotional reactions to more distressing and shocking alternatives. The last photo consisted of a single adult animal with the donation page link (and was the same in both the Elephant and Wild Dog videos). This photo was also used on AWF's donation webpages (explained below). 
Since the text in the post (Supporting Information Figure A4) and first (and last) photos in each video were identical, differences in outcomes due to the outrage effect should emerge only if participants were exposed to the second photo in the human action videos.
We followed Facebook's recommendations to capture user attention when designing both the video and the post (Facebook IQ, 2016). For instance, all videos were composed using a controlled sequence of three photos and text, so it was easy to understand the story even without any sound. The text in each frame averaged around 10 simple words to mitigate any information overload. All videos were around 15 s with the same background music. Barring the videos, the ad posts were identical across treatments, that is, all posts were titled with the organization logo and name, and a succinct caption stating the species was threatened and to please donate.
Finally, we included a "Donate now" call-to-action button with a unique web link in each post. If clicked, the web link took users to one of four separate donation pages each corresponding to one treatment group (Supporting Information Figure A3). The webpages were set up by AWF for the purpose and duration of the experiment to link donations to treatment groups. The donation pages were identical across treatments, except that Elephant + HumanAction and WildDog + HumanAction had photos of a single individual set against the road, whereas the Elephant and Wild Dog did not, in line with the videos. While an alternative option would have been to enable users to donate via Facebook, we wanted to trial this format to evaluate how to increase both traffic and donations to AWF's website.

| Outcomes, data, and analyses
We first examine Facebook's reach metric, which is the estimated count of individual users who saw the advertisement at least once on their Facebook newsfeed. It affects other metrics including link clicks. The "impressions" metric estimates how many total times the advertisement was shown to users since individual users could see a post multiple times. Past research has shown people may be more likely to engage if they see posts more often (Doughty, Wright, Veríssimo, Lee, & Milner-Gulland, 2020;Schmidt & Eisend, 2015).
Our main engagement outcomes are the total number of "link clicks" on the ad post (which takes users to AWF's donation webpage; henceforth "clicks") and any donations made on the corresponding webpage. Total link clicks are the count of clicks on links within the ad that lead to a specified webpage off-Facebook (excluding clicks on content or links in other parts of a post).
Facebook considers total link clicks as the main engagement metric since it drives traffic to websites where individuals can then perform a valued action such as donating, and it is an easily comparable gauge of how much interest a post generates in a target audience. We also present data on unique link clicks, which is the estimated number of people clicking the link.
FIGURE 1 Examples of ad stories: African Elephant treatment videos by frames
FIGURE 2 Estimated total reach and impressions by treatment groups
We then consider the cost per click metric, which is estimated by dividing the total amount spent by total clicks. This cost-effectiveness metric enables us to compare performance across videos, and is the basis for Facebook's recommendation of the best performing ad.
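As a minimal sketch of the cost per click calculation, assuming the usual definition (amount spent divided by link clicks), the snippet below computes the metric for each ad and picks the ad with the lowest cost. The spend and click figures are hypothetical placeholders, not the campaign data, and `cost_per_click` and `winning_ad` are illustrative names rather than Ads Manager functions.

```python
def cost_per_click(amount_spent, link_clicks):
    """Cost per click = total amount spent / total link clicks."""
    if link_clicks <= 0:
        raise ValueError("link_clicks must be positive")
    return amount_spent / link_clicks


def winning_ad(ads):
    """Return the name of the ad with the lowest cost per click."""
    return min(ads, key=lambda name: cost_per_click(*ads[name]))


# Hypothetical (spend in USD, link clicks) per ad; NOT the study's data.
example_ads = {
    "Ad A": (500.0, 2000),
    "Ad B": (500.0, 2500),
}

cpc = {name: cost_per_click(*values) for name, values in example_ads.items()}
best = winning_ad(example_ads)
```

With an equal budget per ad, as in an even split test, the ad with the most clicks necessarily has the lowest cost per click, so the two rankings coincide.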
In addition, we explore video engagement across groups by considering the number of 3-s video views and average video watch time across groups. Three-second video views are the estimated number of times that each video was played for 3 s or more. The average watch time is the estimated percentage of the video watched.
FIGURE 3 Estimated number of total link clicks, unique link clicks, and video views that were 3 s and over by treatment groups
FIGURE 4 Estimated number of total link clicks by treatment group, age, and gender
Lastly, we make a note of post shares and reactions (via emojis and likes), since they are other active online behaviors that can be useful to understand video engagement and how users publicly endorse videos to their networks (Supporting Information Figure A4). Facebook's outcome metrics are estimated either at the aggregated treatment group level, or disaggregated by age and gender groups, through proprietary sampling techniques. Their sampling approach looks at a representative portion of the data and provides similar results to those using the entire sample with high accuracy (Facebook, 2021a). Data further disaggregated at the individual level (and associated statistical measures like standard deviation) and linked to user characteristics are unavailable. Thus, we do not have a conventional sample size estimate. Facebook's reach metric comes closest, but it is not a straightforward count since it is estimated from algorithms sampling the data, as are other outcomes like unique link clicks, impressions, and video engagement statistics. Since the data are estimated at an aggregated level, we cannot use inferential statistical tests that are typically used to analyze individual-level experimental data. Keeping this in mind, our analyses rest on visually presenting the data, comparing descriptive statistics of the outcomes across treatments, and outlining Facebook's test result of the "winning ad," that is, the advertisement with the lowest cost per click. In addition, we check the average effect of each treatment group on the main outcomes by running logistic regressions based on simulated individual-level data.
We created this data set by coding for the treatment group dummy and outcomes based on the estimated reach across groups (Orazi & Johnston, 2020). Data analyses were conducted in Stata.
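The simulated-data step can be illustrated with a short sketch. It assumes one row per reached user and a 0/1 click outcome, and it uses hypothetical counts rather than the study's estimates. For a logistic regression with a single treatment dummy, the fitted coefficient equals the log odds ratio of the corresponding 2 × 2 table, so the sketch computes that odds ratio and a Wald 95% confidence interval directly instead of calling a statistics package. It is not a reproduction of the Stata analysis, and the function names are illustrative.

```python
import math


def simulate_rows(reach, clicks):
    """Expand aggregated counts into one 0/1 outcome per reached user."""
    if not 0 <= clicks <= reach:
        raise ValueError("clicks must lie between 0 and reach")
    return [1] * clicks + [0] * (reach - clicks)


def odds_ratio_with_ci(reference, treatment, z=1.96):
    """Odds ratio of clicking (treatment vs. reference) with a Wald 95% CI.

    For a logistic regression with a single treatment dummy, the fitted
    coefficient equals the log odds ratio of the 2 x 2 table computed here.
    """
    a, b = sum(treatment), len(treatment) - sum(treatment)
    c, d = sum(reference), len(reference) - sum(reference)
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical aggregated counts (reach, clicks); NOT the study's data.
reference_rows = simulate_rows(reach=1000, clicks=50)
treatment_rows = simulate_rows(reach=1000, clicks=80)

odds_ratio, ci_low, ci_high = odds_ratio_with_ci(reference_rows, treatment_rows)
```

Because the rows are rebuilt from estimated aggregates, the resulting standard errors treat Facebook's estimates as exact counts, which is one reason to read such intervals cautiously.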

| RESULTS
Facebook estimated total reach and impressions to be 123,563 and 191,969 respectively. The proportion of reach was 41.6% for women, 12.5% for those aged 25-34 years, 14.3% for 35-44 years, 19.2% for 45-54 years, 27.7% for 55-65 years, and 26.3% for 65+ years. Figure 2 presents reach by treatment. The wild dog videos with and without human action obtained a higher reach than the elephant videos. Both elephant and wild dog videos with human action received a lower reach than videos without human action.
Disappointingly, only a single donation was received, from a user exposed to the elephant video. Hence, the rest of this section focuses on differences in clicks. Across all treatment posts, there were 9,181 clicks, and 5,355 of these were unique link clicks from different individual Facebook users. Of the total clicks, 44.7% were from women; 9.2% from those aged 25-34 years, 10.4% from 35-44 years, 17.5% from 45-54 years, 29.3% from 55-65 years, and 33.6% from 65 years and over. Figure 3 presents total and unique link clicks by treatment. A larger share of total (unique) clicks was observed in the WildDog group compared to the Elephant group: 27.7% (27.8%) versus 21.8% (21%). Videos with the human cause of wildlife loss also elicited more clicks: the WildDog + HumanAction and Elephant + HumanAction groups obtained 22.6% (23.2%) and 27.7% (28%) of total (unique) clicks. There was also a larger share of video views that were 3 s and more in the wild dog and human action groups: 30.2% (29.1%) for WildDog (WildDog + HumanAction) compared to 19.9% (20.8%) for Elephant (Elephant + HumanAction). Similarly, average watch time was estimated to be 35.9% and 32.9% in the WildDog and WildDog + HumanAction groups respectively, compared to 29.8% in the Elephant and Elephant + HumanAction groups.
TABLE 1 Logistic regression on simulated data: Total clicks, unique clicks, and video views for 3 s and more. [Table body not recovered; columns: treatment groups, with coefficient and odds ratio for each outcome variable.]

[...] 1.198]). The difference between Elephant and Elephant + HumanAction was positive but not statistically significant (p = .101). Wald tests show that the difference between the Elephant + HumanAction and WildDog coefficients was also not statistically significant (Wald chi2 = 0.03, p = .873). However, WildDog + HumanAction elicited more clicks than Elephant + HumanAction (Wald chi2 = 4.72, p = .0299) and WildDog videos (Wald chi2 = 4.06, p = .044). These results are qualitatively similar for unique link clicks and video views that were 3 s and over, barring the finding that Elephant + HumanAction was higher than Elephant videos for unique link clicks (OR = 1.128, p = .004, 95% CI [1.039, 1.224]) and for video views of 3 s and more (OR = 1.06, p = .029, 95% CI [1.006, 1.118]). The results on all three outcomes are also similar when controlling for age and gender (Supporting Information Tables A4-A6). The exception is that for total link clicks, the Elephant + HumanAction coefficient is positive and significant (OR = 1.113, p = .010, 95% CI [1.025, 1.207]) compared to Elephant videos. Users aged 45 years and over were more likely to click links and watch the videos than those aged 25-34. In sum, these results lend further support to findings from the descriptive analyses that users were more likely to click on wild dog videos compared to elephants, especially if the former had content on human-caused harm.
The average cost per click is lower for the WildDog + HumanAction and WildDog groups at $0.19 and $0.20 per click, respectively, followed by Elephant + HumanAction and Elephant groups at $0.25 and $0.24. Facebook selected WildDog + HumanAction as the "winning ad" since it was the most cost-effective at eliciting clicks.
We now turn to differences in engagement by age and gender. Figure 4 shows that in all age groups, wild dog videos elicited more total clicks amongst men compared to the elephant videos. Men also clicked more when exposed to wild dog videos compared to women across age groups (but especially those below 55 years). However, the number of clicks is similar between WildDog and WildDog + HumanAction videos amongst men. On the other hand, older women (especially those over 45 years) exposed to any elephant videos clicked more than women exposed to WildDog videos. These women were also more responsive to content about the human-caused harm when they featured elephants, since Elephant + HumanAction videos elicited higher clicks than Elephant videos. Similar patterns are observed when we consider unique link clicks and video views 3 s and over (Supporting Information Figures A8 and A9). These results are further supported by logistic regressions based on the simulated data (Supporting Information Tables A7-A9).
FIGURE 5 Estimated percentage of the video watched by treatment, age, and gender
We do not have information about what proportion of users within each group saw either the entire video or specifically content about human action (which would have occurred around 33% into the video). We do, however, have the estimated average watch time by treatment, age, and gender (Figure 5). It is evident that those over 55 years had longer watch times on average (typically over 33%) and those in the 25-34 age group, the lowest (below 30%). Males in all age groups watched wild dog videos longer than elephant videos.
Lastly, the reaction and share engagement metrics present a more mixed picture (Supporting Information Figure A4). The Elephant group elicited more reactions (226 vs. 142 emojis) and 25 more shares than the WildDog group. The Elephant + HumanAction group elicited more emojis (but not more shares) than the Elephant group. On the other hand, the WildDog group elicited more reactions than the WildDog + HumanAction group (142 vs. 122 emojis) but marginally fewer shares. All posts got a similar number of comments. These differences are minor considering the number of users reached.

| DISCUSSION
This article presents findings from an effort to evaluate the behavioral impact of storytelling in Facebook advertisements on social media engagement and online donations. Specifically, we explored the effect of varying the type of story character (elephants, a traditional charismatic flagship, compared to wild dogs), and content (human-caused harm to wildlife). Both protagonist characters and content about cause-effect relationships are crucial narrative elements of storytelling. Facebook's winning ad was WildDog + HumanAction, followed closely by WildDog.
Some of the strengths of this study are that we tried to evaluate the impact of novel, ecologically valid story characters, and content in a real-life social media context on revealed engagement behaviors like link clicks using Facebook A/B split tests. It was pre-registered and codesigned with a wildlife conservation organization. Yet there are also several challenges given the failure to raise donations, and limitations stemming from experimental design choices and the lack of transparency associated with Facebook's algorithms. To overcome these limitations, we distill several lessons that can be helpful to guide conservation scientists and practitioners in the future.
First, organizations can leverage storytelling by carefully choosing the protagonist species, evaluating how they perform, and feeding back the results into their communications strategies. Both wild dog stories outperformed elephants, a traditional charismatic flagship, by obtaining more clicks. Like this study, Veríssimo et al. (2017) also found benefits from placing less popular species rather than charismatic flagships at the top of conservation organization websites. Taken together, these results suggest that relying solely on traditional flagships may be less fruitful in online and social media contexts. While there may be several reasons for this, one explanation is that elephant stories elicited lower responses because of "flagship fatigue," that is, traditional flagships may have been less attention grabbing since they are very familiar (Bowen-Jones & Entwistle, 2002). Since web platforms are often designed to capture user attention through promoting novel stimuli (Lorenz-Spreen, Lewandowsky, Sunstein, & Hertwig, 2020), traditional flagships may not always be eye-catching and consequently be less promoted. To this point, we found that impressions were higher for all wild dog videos, indicating that they were shown more often on people's feeds (because they were stopping to see the ad in the first place), which in turn could have led to more link clicks. A positive implication of this result is that organizations could diversify the range of species used online, for instance by also selecting high potential "Cinderella species" (Smith et al., 2012) alongside traditional flagships. Not only could this approach lead to greater engagement, but it also has the potential to simultaneously address concerns that solely focusing on traditional flagships may lead to unintended consequences like creating the public perception that other species are unimportant (Douglas & Winkel, 2014).
Apart from the type of species, scientists and organizations may benefit from evaluating how to best portray them. We used different types of images in the stories including those featuring single versus many individuals to paint a realistic portrait of each species. Past studies, however, have pointed to an identifiable victim effect, that is, tendency to donate more when informational appeals feature a single identified human victim (Jenni & Loewenstein, 1997). There is mixed evidence about nonhuman victims: Thomas-Walters and Raihani (2017) do not find significant differences in actual donations to one versus many animals whereas Markowitz et al. (2013) found only people who do not self-identify as "environmentalists" state that they give less to appeals with many animals. We used a photo with multiple individuals in the second frame in videos without human-caused harm, and photos with one in-focus individual with other outout-focus individuals in the human-caused harm videos. Therefore, it is possible that the differences that we observe across stories may be partly driven by the number of individuals in the photos, apart from content about humans. Future studies could systematically examine the relative benefit of stories featuring one identifiable protagonist throughout the story by developing its character is greater detail (e.g., by naming and anthropomorphizing them, describing their personal and social lives), and then requesting donations for that individual. This approach is also supported by studies from impact philanthropy, which find that people are willing to sponsor one child though school rather than donate school supplies to the entire class (Amos, Holmes, & Allred, 2015;Duncan, 2004). They can also test for the identifiable victim effect amongst nonhuman victims.
Second, apart from the protagonist species, story content about the causal explanation for their plight can also be important. Clarifying human-caused harm can be beneficial, especially when added to wild dog videos, possibly because this content elicits greater outrage and makes beliefs about anthropogenic environmental change salient (Kahneman et al., 1993; Milfont et al., 2017; Shreedhar & Mourato, 2020). Like in our stories, such causal explanations typically appear in the middle of the narrative arc in popular stories (Boyd, Blackburn, & Pennebaker, 2020). However, placing such information later in video ads could mean that a smaller proportion of people (most likely older users) watch this content. Indeed, since we observe that older users watched videos for longer, it is likely that they may have driven differences in outcomes due to human-caused harm. So, experimenting with ways to bring such information up front (e.g., through changing the plot and visual imagery), along with the length of the video itself to appeal to more people (including younger users), is a promising line for future research on storytelling.
More broadly, the question of how best to trade off narrative elements like the suitable timing and sequencing of events, and overall video length, is especially important going forward. The pace of new (and personalized) information available to users on social media platforms can lead to information overload, which can crowd out the attention and time that people choose to allocate to each piece of content (Lorenz-Spreen et al., 2020). Users make quick decisions about whether to continue watching the ad, click the link, or scroll past. Furthermore, what an optimal story is for Facebook may differ from what is best for YouTube (which has longer videos), or for Instagram and TikTok (which have short videos with different formats and attract younger users). What an effective story is, therefore, can depend on the purpose, design and audience of a given social media platform.
A possible downside of attributing wildlife loss to human action is that people could systematically avoid such stories (and implicitly the cause and organization) because they induce negative emotions (Golman, Hagmann, & Loewenstein, 2017). If avoidance is along political group lines or prior beliefs, an unintended consequence is that such stories could contribute to polarization, since ads will be optimized by appearing more frequently to those who engage and who already believe in anthropogenic environmental change. That said, recent studies show most Americans (including Republicans) believe that human-caused climate change is happening (Ehret, Van Boven, & Sherman, 2018). Organizations could consider incorporating strategies like group consensus messages (van der Linden, Leiserowitz, Feinberg, & Maibach, 2015) or leveraging superordinate identities (Iyengar, Lelkes, Levendusky, Malhotra, & Westwood, 2019) to address risks of polarization.
A third and related lesson is that some stories may be more appealing to certain population sub-groups, so organizations can tailor types of character and content to specific groups to raise engagement. For instance, we found that men were more engaged with the wild dog videos. This could be because wild dog videos may have catalyzed norms and values linking masculinity and dogs, like past ads commonly featuring dogs as male companions in outdoor settings (Hirschman, 2003). Future research and campaigns can test these aspects more systematically by matching species and story content to target groups based on different dimensions such as demographic profile, preferences, and activities online, and even attributes of their social networks, while being careful to not exacerbate group polarization in ways suggested above. Along these lines, for example, Matz et al. (2018) found that persuasive appeals that were matched to people's extraversion or openness-to-experience level resulted in more clicks and purchases than their unpersonalized counterpart.
The fourth lesson is that clicks on Facebook need not equal donations off-Facebook. Some likely reasons are the lack of repeated exposure from multiple channels (e.g., Facebook, Instagram, and Google) and "friction" costs, that is, the additional time and effort to donate after being directed to AWF's donation page. To the last point, once users clicked the "Donate now" button, they would have been taken to AWF's donation webpage on a different tab in their browser. There, they would have had to choose an amount from a pre-defined list or enter an amount of their choosing, along with their payment details, and then click pay. 4 Another reason could be the campaign objective and resultant sample of users itself. More specifically, we designed the campaign to deliver ads to Facebook users prone to clicking links and it is possible that these users are not prone to donating.
To address these challenges, conservation organizations could specify one target behavior of the Facebook advertisements campaign (e.g., to increase traffic or donations) and then examine which tools, settings and audiences are most appropriate within a given budget. For instance, if the main goal is donations (rather than traffic), then the ad campaign could be designed keeping in mind the need to reduce friction costs by using the donation options already offered on the platform (e.g., Facebook Fundraisers). Alternatively, organizations could trial Facebook "conversion lift tests" with "pixels," a type of split test using cookies to link individuals' actions taken on a website off-Facebook (typically purchases) back to ads. When using such tools, however, organizations should be transparent about the goal of the ad and the privacy implications, and adhere to ethical design standards (e.g., having explicit statements with opt-in options following General Data Protection Regulation [GDPR] to approve cookies).
A fifth lesson is that the A/B split test functionality on the Facebook ad campaign platform can be useful to evaluate which stories can appeal to different users because it is designed to be a randomized control experiment (Matz et al., 2018; Orazi & Johnston, 2020). Therefore, advertisers can select the most cost-effective ad depending on their campaign's goals (e.g., the lowest cost per click). That said, the findings may be very sensitive to the campaign's objective and settings. For instance, although we set the campaign budget to be equally apportioned across ads to ensure equal sample sizes across groups as recommended by Facebook, we found that reach still differed across groups. This is likely because we set the ad delivery to be optimized amongst those most likely to click links. It is possible that the differences in reach would have been even higher in the absence of equally dividing the ad budget. Future studies can trial Orazi and Johnston (2020)'s suggestion to set delivery optimization to "Reach" in addition to equally dividing the budget across ad sets.
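As a rough illustration of how such a comparison might be made from aggregate split-test output, the sketch below computes cost per click for each ad and runs a two-proportion z-test on click rates (link clicks divided by reach). The `AdResult` class, its field names, and all numbers are hypothetical placeholders rather than the study's data or Facebook's reporting schema, and the test assumes that users reached are independent draws, which delivery optimization may violate.

```python
from dataclasses import dataclass
from math import erfc, sqrt


@dataclass(frozen=True)
class AdResult:
    """Aggregate outcomes for one ad in a split test (hypothetical fields)."""
    name: str
    spend: float       # total spend on the ad, in a single currency
    reach: int         # unique users who saw the ad
    link_clicks: int   # clicks through to the external website

    @property
    def cost_per_click(self) -> float:
        return self.spend / self.link_clicks

    @property
    def click_rate(self) -> float:
        return self.link_clicks / self.reach


def two_proportion_z_test(a: AdResult, b: AdResult) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in click rates."""
    pooled = (a.link_clicks + b.link_clicks) / (a.reach + b.reach)
    se = sqrt(pooled * (1 - pooled) * (1 / a.reach + 1 / b.reach))
    z = (a.click_rate - b.click_rate) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Placeholder numbers for illustration only; NOT the study's results.
ads = [
    AdResult("ad_a", spend=100.0, reach=10_000, link_clicks=200),
    AdResult("ad_b", spend=100.0, reach=12_000, link_clicks=300),
]
cheapest = min(ads, key=lambda ad: ad.cost_per_click)
z, p_value = two_proportion_z_test(ads[0], ads[1])
print(f"lowest cost per click: {cheapest.name} ({cheapest.cost_per_click:.2f})")
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Because the budget is equal in this example but reach is not, the two metrics need not rank ads identically, which is one reason to decide on the target metric before the campaign runs.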
Although A/B split tests can reveal which ads are more effective at a point in time, they may not always reveal the complete picture about why patterns emerge. It is difficult to empirically rule out the possibility that other types of content simultaneously promoted in users' news feeds (e.g., product placement ads) interacted with the videos. This may be the case if confounding content was promoted based on a common underlying characteristic (e.g., following environmental organizations) that also drives video engagement and link clicks.
That said, A/B split test groups ought to be comparable to each other, at the very least based on audience characteristics that we specified (i.e., U.S. residents aged 25 years and over who speak English and have interest in wildlife). While some argue that the split testing "largely eliminates the influence that optimization algorithms have on the delivery of test ads" and that it is a "robust way to run experimental designs in a naturalistic, online field setting" (Orazi & Johnston, 2020, p. 190), we cannot empirically verify that there was balance on these and other relevant characteristics across test groups, a strategy typically employed in randomized control field experiments. Other papers have also debated these and other methodological limitations and evolution of Facebook's testing procedures (Chawla & Chodak, 2021; Eckles, Gordon, & Johnson, 2018; Gordon, Zettelmeyer, Bhargava, & Chapsky, 2019; Matz et al., 2018).
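If group-level breakdowns of user characteristics were available for each test group, a balance check of the kind used in field experiments could be sketched as follows. This is a minimal example under stated assumptions: the characteristic names and shares are hypothetical placeholders (not data from this study), and the 0.1 cut-off is a common rule of thumb rather than a fixed standard.

```python
from math import sqrt


def standardized_difference(p_a: float, p_b: float) -> float:
    """Standardized difference between two proportions (a balance diagnostic)."""
    pooled_sd = sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    return (p_a - p_b) / pooled_sd


# Hypothetical share of each test group with a given characteristic.
# These are placeholder values, NOT data from the study.
group_shares = {
    "aged_55_plus": (0.40, 0.42),
    "male": (0.35, 0.45),
}

THRESHOLD = 0.1  # rule-of-thumb cut-off; an assumption, not a fixed standard

# Keep only the characteristics whose imbalance exceeds the threshold.
imbalanced = {
    name: round(standardized_difference(p_a, p_b), 3)
    for name, (p_a, p_b) in group_shares.items()
    if abs(standardized_difference(p_a, p_b)) > THRESHOLD
}
print(imbalanced)
```

The standardized difference is used here instead of a significance test because it does not grow mechanically with sample size, which matters when each group contains many thousands of users.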
Researchers will have to check for the latest types of tests and procedures (including available user attributes that may be useful for targeting ads) since Facebook constantly modifies its platform and policies. Details about outcomes, the user sample, and split testing procedures and policies can be inadequate and dynamic, as is the range of confounding factors arising from Facebook's murky algorithms. Such information can be instrumental in understanding why and when story character and content are more impactful. Yet this information is not currently available for organizations on a tight budget and researchers working outside Facebook. Such procedural changes can also have implications for how externally valid findings are.
Collaborations between researchers, conservation organizations, creative storytellers, and social media platforms in accordance with transparent ethical and privacy standards would be valuable to move forward and keep pace with the latest changes. Such collaborations can enable the design of story stimuli that are more compelling, theory-driven, and ecologically valid, and the adoption of appropriate and transparent split testing procedures to make engagement and fundraising efforts more effective. Ensuring that such collaborations combine stringent ethical standards and regulatory oversight with a clear code of practice for both researchers and social media platforms is key. One such example is the FORGOOD framework, an appraisal tool for behavior change interventions which is based on seven ethical dimensions: Fairness, Openness, Respect, Goals, Opinions, Options, and Delegation (Lades & Delaney, 2020). In the meantime, however, complementary investigations via experiments and quasi-experimental techniques remain necessary to unpack these aspects from a scientific perspective and mobilize much-needed resources for wildlife conservation.

ACKNOWLEDGMENTS
My sincere gratitude to Davide Onate, Philip Muruthi, and Fiesta Warinwa from The African Wildlife Foundation for supporting this project, and a very special thanks to Gayane Margayan and Brett Nolan for co-designing the stimuli and implementing the campaign on Facebook. Many thanks for comments and discussions to the article's editor, the three reviewers, Mark Schwartz, Susana Mourato, John Vucetich, Hunter Doughty, Thomas Leeper, and Michael Bode. This project was funded by the London School of Economics and Political Science Knowledge Exchange and Impact grant.

DATA AVAILABILITY STATEMENT
The experimental materials, data and code are available on the Open Science Framework at this link: https://osf.io/rczt9/?view_only=9ba9939e9693430ba9621e052a549909.

ETHICS STATEMENT
This study complies with LSE's ethics review.

ORCID
Ganga Shreedhar https://orcid.org/0000-0003-2517-2485

ENDNOTES
1 Preregistration is the practice of documenting your research plan and hypotheses at the beginning of your study and storing that plan in a read-only public repository to improve quality and transparency (learn more here).
2 We did not analyze differences across emoji types as people use emojis in subjective ways.
3 We took several measures to increase the transparency and rigor of this study through pre-registration plans. However, as noted by Nosek (2018) "even the best laid plans are difficult to achieve" and deviations from data collection and analysis plans are common. We detail deviations from the pre-registration plan in the "Transparent Changes" document on OSF.