Overcoming unreported violence using place‐based ambulance data: The case for mapping hotspots based on health data for crime prevention initiatives

A key concern in crime analysis is the “hidden crime” problem. Crime events unaccounted for in police records limit the external validity of official statistics and, more importantly, hinder the ability of the police to manage crime and utilize their resources effectively. The problem is exacerbated in proactive initiatives aimed at curbing violence through hotspot policing, where inaccuracies and imprecision, or, worse, no data at all, diminish prevention efforts. Previous studies have sought to overcome the data problem by juxtaposing police records with ambulance data on assault callouts and have found profound disparities. Specifically, researchers matched “crime hotspots” with “ambulance hotspots” (rather than individual events) because patient confidentiality considerations have prevented health professionals from sharing subject‐level data with the police. However, health services can safely share spatial data on wider areas that do not disclose personal information. We build on this line of inquiry by analyzing data from the Thames Valley, United Kingdom, and observing spatial hotspots of different sizes. The results demonstrate that while the police and ambulance services attend to the same communities and similar types of facilities, the police are “blinded” to the location of nearly 8 out of 10 assaults. The incongruency is shown even with severe assaults, but to a lesser extent. We then simulate the reduction in injuries if the police had access to health data at different spatial levels and show that even under the most conservative set of assumptions, such an approach can prevent between 113 and 116 violent injuries each year that might otherwise require hospitalization.


| INTRODUCTION
Violence has risen in most Western societies since the end of the COVID-19 lockdown regimes (Bates, 2022). However, the cause of this increase remains analytically challenging, given a variety of contributing factors. Chief among the "known unknowns" is the lack of valid police data on the hidden figures of crime: not all crime is reported to the police, with the scope of unreported incidents equal to up to half of all injuries involving violence (Wu et al., 2019). At least some of the unaccounted incidents are reported to social agencies, such as schools, social services, and health professionals like emergency departments, paramedics, and community health providers (Ariel et al., 2015; Boyle et al., 2013; Hibdon et al., 2021; Sutherland et al., 2021). Thus, to better understand the recent spike in violence, and crime patterns more broadly, police records must be supplemented with additional data sources.
Previous studies have assessed the similarity of crime "hotspots" generated by either police records or ambulance callout data. This methodological approach is required because subject-level data cannot be shared between health professionals and the police due to patient confidentiality restrictions. However, information can be shared at more expansive spatial levels that retain the anonymity of the victims but remain valuable for policing efforts.
We build on this growing body of literature on interagency data sharing and add three layers of analysis. First, we juxtapose the data at different spatial layers to further explicate the degree of incongruency between the two datasets. Second, we observe the type of facility where these incidents occur to better understand the situational circumstances responsible for crime concentrations and reportage patterns.
Finally, we simulate the potential for violence prevention through data sharing at the hotspot level based on the data made available to us by the ambulance services and the available literature on the magnitude of the effect of hotspot policing in reducing crime and, particularly, violence.

| The problem of underreporting crime to the police
A sizeable proportion of violent crime does not appear in official crime statistics (Mayhew & Van Dijk, 1997).
When it comes to the police, crime is underreported, undermanaged, or both (Ariel et al., 2016). The "dark figures" are difficult to estimate, but, based on comparisons between national household surveys and police records, underreported crime can be more than 50% for some violence (Ariel & Bland, 2019). Improving reportage is thus of utmost importance, and improving crime reporting behavior is a top priority for policing organizations worldwide (McKee et al., 2022). However, practically speaking, supplementary datasets are required for scholars and practitioners to validly and more accurately observe crime patterns. Some crimes will likely be reported to health professionals, schoolteachers, social workers, and community pastors, but not the police. There is a range of reasons for alternative reporting of crime events, including lack of trust, not wanting to trouble the police with what the victim construes as "low level" offending, not knowing the incident was a crime, unwillingness to get the offender in trouble, or fear of retaliation (Langton et al., 2012; Moreau, 2020; Xie & Baumer, 2019). Consequently, some victims may share details about the incident with "alternative sources." Some of these services maintain datasets of these reports, while others, like pastors and rabbis, attempt to resolve the problems using their unique sets of skills: mental health professionals can aid victims in the aftermath of their victimization, schoolteachers can capitalize on the harmful incident to educate pupils on the wrongfulness of harm, and social workers can support the victim and the offender by directing them to community services. However, none of these groups is obligated to maintain a dataset of the events.
The efficacy and cost-effectiveness of supplementary interventions aside, many have called for more robust data sharing between the police and these alternative bodies (Balfour et al., 2022; Murray et al., 2021; Schmit et al., 2019).
The police remain the primary social institution that deals with crime, so more should be done to bring crimes to their attention. While data protection guidelines and patient confidentiality remain critical tenets of health and social services, the need to share information with the police is equally important if a society is interested in bringing offenders to justice and preventing further criminal behavior (Florence et al., 2011; Kennedy et al., 2023). As importantly, patient confidentiality can be preserved when aggregated or non-disclosed information is shared with law enforcement. For example, many important variables can be shared without disclosing the identity of the victim: time, space, characteristics of the offense, the characteristics of the offender, the setting of the offense, weapons used, the facility where the event occurred, and the postcode/street where the event occurred. These and many other factors that disclose the identity of neither the complainant nor the transgressors can be (and are) used for crime prevention initiatives, disorder management, and law enforcement more broadly (see review in Pickering & Fox, 2022; Taylor et al., 2016).

| The use of health services data with police
Here we make a case for more information sharing between health service workers and police, which is a growing area of research. Local accident and emergency (A&E) attendance, ambulance service callout, and hospital admissions data can be used strategically to monitor trends and target violence within local areas (Upton et al., 2012; Wood et al., 2014). Shepherd and Warburton (2004) highlighted the effectiveness of anonymized information sharing and the combined use of health service, police, and local government partnership data for preventing violence-related injury. At one emergency hospital in North-West England, Quigg et al. (2011) found that approximately 25% of assaults reported to hospital staff had not been recorded by the police. Similar conclusions were reached by Gray and Walker (2009) in their examination of stabbing and gunshot incidents in South Yorkshire, UK, as well as by a RAND Europe study of the value of ambulance data in injury surveillance in West Midlands, UK. The latter found that between 66% and 90% of ambulance incidents did not appear in police data (Sutherland et al., 2017). This underrecording in police data is crucial as it indicates the potential use of health datasets to form a more comprehensive analytical picture of crime and aid in improving ancillary harm prevention practices.

| The use of ambulance callout data
A significant proportion of assault victims are not recorded by the emergency department but are recorded by the ambulance service (Taylor et al., 2016). Ariel et al. (2015) found that 30% of ambulance callouts for assaults in Cambridge, UK, did not result in hospital transfers and therefore did not include A&E data. Relatedly, Downing et al. (2005) analyzed data from the ambulance and emergency services department in West Midlands, UK, and determined ambulance data to be a significant indicator of violence, even more than hospital records: 15.8% of ambulance assault records were not linked to a hospital ambulance arrival and thus did not result in the victim being transferred to a hospital. Finally, Quigg et al. (2011) amassed over 30,000 violence-related calls from the Northwest ambulance service in the UK. Of the violence-related callouts, 41% were not referred to an A&E for further treatment. Thus, ambulance data provide important information about assaults that are not seen by other healthcare professions (see also Carter & Benger, 2008; Young & Douglass, 2003).
Some studies have also observed the utility of incorporating ambulance data in testing policing interventions, though, as far as we can tell, there are no rigorous tests of law enforcement experiments that use health data to formulate policing tactics. Masho et al. (2014) used ambulance surveillance data to analyze the number of violence-related "pick-ups" around convenience stores. These stores were then subject to restrictions around alcohol sales, and ambulance data were used to assess the effectiveness of these restrictions. The researchers found a reduction in violence-related pick-ups (from 19.6 to 0.0 per 1000) in the intervention group compared to the control group (from 7.4 to 3.3). However, these data are restricted to specific locations and only within the unique settings of alcohol restriction rather than violence per se.

| The juxtaposition of "crime hotspots" and "ambulance callout hotspots"
Criminological studies of spatial crime patterns have determined that crime is not randomly dispersed throughout a large geographical area but is concentrated in some areas while absent in others (Chainey et al., 2008). Moreover, crime events cluster in a few places and are similarly committed by and within specific demographics and place types (Bergman & Andershed, 2009; Jennings & Reingle, 2012). In one study, for example, Sherman et al. (1989) identified that 3.3% of all places generated over 50% of calls for police attendance. This precise identification of clustered crime events allows academics and practitioners to utilize the concept of hotspot policing to design and implement an evidence-based operational reaction to an identified problem (Sherman et al., 2014). For example, in a study of drug arrests in New Jersey, Weisburd and Mazerolle (2000) found that approximately half of all drug arrests and 40% of disorder-related arrests were generated at a mere 4% of all street intersections. In another study of crime in UK underground train stations, Ariel (2011) established that the same train stations repeatedly accounted for the highest crime levels over 5 years. This study offers meaningful evidence that crime is concentrated at certain facility types that are publicly accessible. Ariel et al. (2016) conducted a 12-month comparison of calls received by both ambulance and police services regarding violence in Peterborough, UK, to determine the nature of spatial-temporal crossover between these data sources: "crime hotspots" were found to be correlated with ambulance calls. However, the authors argued that scholarly attention should not be focused on the overlap between the data sources but rather on the extent of the underlap. Indeed, one-fifth of ambulance hotspots were not recognized in the police data.
Despite the accumulated evidence, some fundamental questions remain largely unanswered. First, what is the degree of overlap between violence reported to the police and injuries based on ambulance callouts at different spatial measurement levels? Second, how much violence is "hidden" from police records but appears in health services datasets, and vice versa, when observing the data at certain types of facilities? Third, what are the implications of these findings for crime prevention, given the dark figures of the crime problem? This article attempts to address these issues through observational data and simulations of the potential benefits of this approach to inform future field experiments.

| Setting
The study area is the Thames Valley, a jurisdiction in the United Kingdom covering the counties of Berkshire, Buckinghamshire, and Oxfordshire. The geographical area is a blend of rural and urban zones, with the Thames Valley Police responsible for one of the largest territories in England, covering 2200 square miles (5700 km²).

| Police-recorded data
This study utilized 12 months of police-recorded violence data on injury crimes (n = 12,654). Violence with injury is a Home Office counting rule category comprising 13 sub-classifications, and the analyses include only those crimes categorized as assault occasioning actual bodily harm, grievous bodily harm, and attempted murder. Violence without injury is excluded because violent crimes where no injury is caused are unlikely to require an ambulance crew.
These data contain information on the date and time of the offense as well as the exact location as represented by spatial coordinates (Easting and Northing). Within the British national grid reference system, Easting and Northing are geographic Cartesian coordinates for a specific point. Easting refers to the eastward-measured distance (often referred to as the x-coordinate), while Northing refers to the northward-measured distance (the y-coordinate). The coordinate pair is commonly measured in meters from a horizontal datum. The Eastings are the coordinates that stretch along the x-axis on the map, while Northings stretch along the y-axis. These data were geocoded using ArcMap.

| Ambulance data
The study uses population-level data on all ambulance callouts where any derivative of assault was mentioned in the electronic patient record during the same time period from South Central Ambulance Service (n = 1244). South Central Ambulance Service (SCAS) is the authority responsible for providing National Health Service (NHS) ambulance services in the counties of Buckinghamshire, Oxfordshire, Berkshire, and Hampshire. These data contain information on the date and time of the offense as well as the exact location represented by the Easting and Northing, which enables us to juxtapose the two datasets.
SCAS provides traditional 999 emergency services, non-emergency patient transport services, NHS 111 services and logistics, and commercial and training services to a population of over 4 million people. As with Thames Valley Police, all calls to SCAS are handled by an operator who asks a range of questions to determine the response required. Data are collected on the computer-aided dispatch (CAD) system, and an electronic patient record (ePR) is created upon attendance. The ePR, which contains significantly more detail than dispatch data about the nature of the injury sustained and, more importantly, how the injury was sustained, is replicated and stored in a reporting database.
The collected data from SCAS included assault data "only" based on the information within the electronic patient record. Due to data limitations, the assault data were extracted using a free text search function and limited to a period of 12 months. Prior to the last 12 months, the data existed on paper records only, and consequently, automated bulk data mining was not possible. The free text search utilized words associated with assaults and violence, including all derivatives of the words "assault" and "violence" in addition to words such as "punch," "stab," and "fight." The internal validity of using a free text search method for this data extraction may be of concern and is worth highlighting as a limitation. Information on the nature of the injury sustained and the name of the hospital personnel was also recorded.
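The free text search described above can be illustrated with a minimal sketch. The stem list, the regular expression, and the sample record texts are illustrative assumptions, not the query SCAS actually ran; the sketch also shows why internal validity is a concern, since a stem such as "stab" also matches the unrelated word "stable."

```python
import re

# Hypothetical stems based on the words named in the text; the real
# SCAS search terms are not given in full, so this list is an assumption.
STEMS = ["assault", "violen", "punch", "stab", "fight"]

# Match any stem at the start of a word, followed by any word characters,
# so that derivatives such as "assaulted" or "stabbing" are captured.
PATTERN = re.compile(r"\b(?:" + "|".join(STEMS) + r")\w*", re.IGNORECASE)


def mentions_assault(free_text: str) -> bool:
    """Return True when any stem or one of its derivatives appears."""
    return PATTERN.search(free_text) is not None


# Invented example records, for illustration only.
records = [
    "Pt states he was assaulted outside a pub",
    "Fall from ladder, wrist pain",
    "Stabbing to left arm, bleeding controlled",
    "Patient stable on arrival",  # false positive: "stable" starts with "stab"
]
flagged = [text for text in records if mentions_assault(text)]
```

In this sketch, three of the four records are flagged, one of them wrongly, which is the kind of error a manual review of flagged records would need to catch.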

| Analysis
We achieve our research aims by using several methodological techniques. The police data were imported into GIS software (ArcMap) using the Eastings and Northings to create a pin map displaying the location of the crime. A boundary layer file was added featuring middle layer super output areas (MSOA), and a spatial join was conducted to count the number of crimes falling within the boundary of each MSOA. Ambulance data were plotted similarly, and a table was created displaying both the count of police and ambulance data for each MSOA code in the Thames Valley. The same process was carried out using the 12 Local Police Area (LPA) boundaries for the ambulance data only because police data already contained details of the LPA where the crime occurred. Finally, a mapping layer was created using the "fishnet" functionality in ArcMap to allow a count of police and ambulance data in a 100 × 100 m grid for the whole of the Thames Valley.
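For readers without ArcMap, the grid count can be approximated in a few lines. This is a minimal sketch, assuming cells aligned to the coordinate origin rather than to the fishnet extent used in the study; the function names and the coordinates are hypothetical.

```python
from collections import Counter

CELL_SIZE_M = 100  # edge length of each square grid cell, in meters


def grid_cell(easting, northing, cell=CELL_SIZE_M):
    """Map an Easting/Northing pair to the (column, row) index of its cell."""
    return (int(easting // cell), int(northing // cell))


def count_by_cell(points):
    """Count how many points fall in each grid cell."""
    return Counter(grid_cell(e, n) for e, n in points)


# Invented coordinates, for illustration only.
police = [(451230.0, 206410.0), (451280.0, 206455.0), (463900.0, 181020.0)]
ambulance = [(451205.0, 206499.0), (470010.0, 190500.0)]

police_counts = count_by_cell(police)
ambulance_counts = count_by_cell(ambulance)

# One row per cell that appears in either dataset: (cell, police n, ambulance n).
cells = sorted(set(police_counts) | set(ambulance_counts))
table = [(c, police_counts[c], ambulance_counts[c]) for c in cells]
```

The resulting table has the same shape as the per-area counts described above, with a zero wherever one service recorded nothing in a cell.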
The data were plotted as a scatter graph, and a linear trend line was added. The R² was calculated to determine the relationship between ambulance and police data at the MSOA, LPA, and 100 × 100 m grid levels. We hypothesized that this method would not only determine the strength of the overall correlation but indicate the areas of the Thames Valley where both agencies were called. These data were also analyzed and coded to attribute each event to a facility type. It is envisaged that the facility-level correlation analysis may indicate location types where ambulance crews are called relative to police data.
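The correlation step can be sketched as follows. For a linear trend line fitted to two variables, R² equals the square of Pearson's r, so both quantities come from the same calculation. The per-area counts below are invented for illustration and are not the study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical counts per area: police-recorded crimes and ambulance callouts.
police_per_area = [120, 85, 40, 210, 60]
ambulance_per_area = [14, 9, 3, 25, 4]

r = pearson_r(police_per_area, ambulance_per_area)
r_squared = r ** 2  # equals R² of the simple linear trend line
```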
The analysis of the overlap between police-recorded violence and ambulance callouts to people requiring treatment for violent injury was carried out using a sensitivity analysis. The matching variables consisted of time and location but used three spatial-temporal levels: (1) 20 m and 60 min, (2) 100 m and 120 min, and (3) 100 m and 720 min. At each sensitivity level of distance and time, a count of ambulance records was collected where the sum of the differences between the SCAS and Thames Valley Police (TVP) Eastings and between the SCAS and TVP Northings was less than the specified distance and the time difference was within the time parameter.
Using the sum of the coordinate differences is crucial because it removes the need for mapping software, which some police forces lack. Additionally, incident data were processed to ensure that events that occurred within the time parameters but across different dates were captured, thus avoiding the problem of an incident recorded at 23:59 on Saturday in one dataset and at 00:01 on Sunday in the other. Furthermore, we calculated the percentage of all ambulance incidents that matched the police data, which can be performed for each spatial-temporal level to ascertain the extent of unreported violence (defined from this point on as "ambulance only"). In addition to the spatial-temporal matching of all ambulance callouts to violent injury reports, it is possible to explore the severity of unmatched ambulance data by isolating the records where the patient sustained assault-related injuries deemed severe enough to warrant conveyance to an accident and emergency department.
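The matching rule can be sketched without any mapping software. This is a minimal illustration, assuming that a match requires both the distance and the time condition to hold and that each record carries a full timestamp; the field names and sample records are hypothetical.

```python
from datetime import datetime, timedelta

# The three sensitivity levels from the text: (meters, minutes).
LEVELS = [(20, 60), (100, 120), (100, 720)]


def is_match(amb, pol, max_dist_m, max_minutes):
    """True when the summed coordinate differences and the time gap are both within limits."""
    dist = abs(amb["easting"] - pol["easting"]) + abs(amb["northing"] - pol["northing"])
    # Subtracting full datetimes handles incidents either side of midnight.
    gap = abs(amb["time"] - pol["time"])
    return dist < max_dist_m and gap <= timedelta(minutes=max_minutes)


def ambulance_only_share(ambulance, police, max_dist_m, max_minutes):
    """Fraction of ambulance records with no matching police record."""
    unmatched = sum(
        1 for a in ambulance
        if not any(is_match(a, p, max_dist_m, max_minutes) for p in police)
    )
    return unmatched / len(ambulance)


# Invented records, for illustration only.
ambulance = [
    {"easting": 451200, "northing": 206400, "time": datetime(2016, 7, 2, 23, 59)},
    {"easting": 470000, "northing": 190500, "time": datetime(2016, 7, 3, 14, 0)},
]
police = [
    {"easting": 451210, "northing": 206405, "time": datetime(2016, 7, 3, 0, 1)},
]

shares = [ambulance_only_share(ambulance, police, d, m) for d, m in LEVELS]
```

Here the first ambulance record matches the police record at every level despite the change of date, while the second has no counterpart and counts as "ambulance only."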

| Simulation of the benefits of utilizing ambulance data in hotspot policing
We are unaware of any partnerships in which the police use data from external agencies, such as health services, to create crime hotspot maps and then employ a preventative strategy to "cool down" these areas. We are unaware of any experiments that have evaluated this method either (however, cf. Ariel et al., 2017, and more recently Ariel, 2023; Ariel, Harinam et al., 2023). To develop testable hypotheses for a future test, we simulate the potential effect of this approach using the data from this study and the available evidence on the impact of hotspot policing.
We consider two possible scenarios. Scenario 1 includes hotspot policing strategies without ambulance data, while Scenario 2 includes hotspot policing strategies with ambulance data. In both scenarios, we assume that typical resources will be available to manage 50% of street violence hotspots, which research indicates range from 1% to 14% of hotspots (Weisburd, 2015). Therefore, we assume three possible distributions: 5%, 10%, and 15% of hotspots that account for 50% of police-reported violence. We consider the 50% threshold as multiple studies in criminology have used this benchmark (e.g., Sherman et al., 1989).
We then extrapolate a set of evidence-based expectations from a recent agent-based modeling study on robbery (a primary violence crime category). Weisburd et al. (2017) conducted a simulation with two tests; in each test, they assumed either low or high police intensity in the hotspots based on the readily available resources in a typical police department (see comparable simulations in Malleson (2011) for burglary and a review of this approach in Groff et al. (2019)). They also conducted analyses at units of different spatial sizes: borough, beat, and hotspot, which correspond to our three units: local policing area, middle layer super output area, and hotspots of approximately 100 m. However, we will focus on the hotspots.
The first hypothesis examines whether or not violence rates are lower across geographic units when police patrols are implemented as opposed to when there is no police patrol. The second hypothesis examines whether violence rates are lower across all geographic units of analysis when police patrols are utilized as opposed to random police patrols, that is, the certainty of police initiatives. We assume that the police can focus on hotspots that experience about half of all violence-related injuries (Lee et al., 2017; Sherman et al., 1989; Weisburd, 2015) and consider all possible situations, ranging from 1% (n = 5 incidents) to 100% (n = 499 incidents) of unaccounted violence, concentrating on the hotspots identified by ambulance callout data. We then estimate the treatment effect by drawing on Weisburd et al.'s (2017) policing scenarios and the available resources, comparing the two possible control conditions mentioned above.
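The arithmetic behind this range can be sketched as follows. The 499 hotspot incidents and the 1% to 100% range come from the text; the effect size used below is a placeholder assumption for illustration only and is not one of the treatment effects taken from Weisburd et al. (2017).

```python
HOTSPOT_INCIDENTS = 499  # unaccounted incidents assumed to fall within hotspots


def targeted_incidents(share, hotspot_incidents=HOTSPOT_INCIDENTS):
    """Number of hotspot incidents covered when police target a given share."""
    return round(share * hotspot_incidents)


def prevented_injuries(share, effect, hotspot_incidents=HOTSPOT_INCIDENTS):
    """Expected injuries prevented for a given coverage share and effect size."""
    return targeted_incidents(share, hotspot_incidents) * effect


# Placeholder effect size (proportional reduction); an assumption, not a
# value reported in the source.
HYPOTHETICAL_EFFECT = 0.20

# Coverage from 1% to 100% of the unaccounted hotspot violence.
scenarios = {
    pct: prevented_injuries(pct / 100, HYPOTHETICAL_EFFECT)
    for pct in range(1, 101)
}
```

Swapping the placeholder for each published effect estimate, and for each control condition, would give the family of estimates the simulation compares.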

| MSOA and LPA levels
In large spatial areas, strong correlations were detected between police and ambulance assault data at the MSOA level (r = 0.717) and the wider LPA level (r = 0.797). As shown in Figures 1 and 2, the relationship indicates that police and ambulance services are being called to violent incidents in similar geographical census areas. While there is a strong relationship, there are outliers: two MSOAs in Milton Keynes and one in Oxford.

| 100 × 100 m grids
On the other hand, there is a far weaker relationship at the 100 × 100 m grid level between police-recorded violent crime and ambulance assault data (r = 0.019). This result is not surprising given the small geographical grid size and the stochastic spatial nature of violence (see Deckard & Schnell, 2022; Harinam et al., 2022), particularly as the ratio of ambulance callouts to police records is approximately 1:10 (Figure 3).

FIGURE 1 Correlation between police-recorded violent crime and ambulance callouts to violent injury in the middle layer super output areas (MSOA) in the Thames Valley (1/07/16-30/06/17).

| Facility level
The results show a strong relationship (r = 0.893) at the individual address level, which indicates that the police and ambulance services attend similar facilities. However, the results (Figure 4) identify an interesting outlier: "pub/restaurant and entertainment venues," which experience far less recorded crime compared to ambulance callouts.

| Individual incident level
The overlaying of data reveals that a considerable proportion of ambulance callouts to individuals who have suffered violence-related injuries is not visible in police-recorded crime data. A multi-stage sensitivity analysis was designed using three spatial-temporal levels. At each sensitivity level of distance and time, a count of ambulance records was collected where the sum of the differences between the ambulance callout and police Eastings and Northings was less than the specified distance. Table 1 shows the number of matching and non-matching ambulance records within the police dataset at each sensitivity level: 85.4% (n = 1062) of ambulance callouts to people who have [...] where police focus on only 5% of the distribution (for example, a single hotspot with 25 assaults per year), a concentrated effort can prevent 19 injuries.

| DISCUSSION
The findings indicate that ambulance callouts to violent injuries and police-recorded violent crime are generated from the same large geographical locations, either entire police areas or at smaller MSOA levels. Furthermore, it is evident that both emergency services tend to be attracted to comparable "facilities," while the designation of "pub/restaurant and entertainment venues" as an anomaly carries significant implications for policing strategies meant to address the night-time economy and oversee licensed establishments. However, the fact that the police and ambulance services assist the same communities masks an ecological fallacy: most ambulance callouts to violent injuries are not visible within police-recorded crime within 100 m and 2 h of each other. This finding indicates that a high proportion of callouts to violence-related injuries are not the same incidents as those recorded in police crime data. Thus, ambulance callouts are often made to assaults that are, straightforwardly speaking, unknown to the police.

TABLE 2 Sensitivity analysis: matched and unmatched ambulance records conveyed to hospitals in police data.

a Based on Weisburd (2015), Sherman et al. (1989), and other studies, we assume that half of the incidents will occur within hotspots, that is, 500 "potentially preventable" violence-related injuries out of about 999 incidents unaccounted for by police records (82.2%), will happen within 100-m areas (total n = 1244 incidents).
b "Roughly half of the police officers in each beat … are assigned 100% of their time to the top five hotspots in each beat" (Weisburd et al., 2017, p. 154), compared to nil control conditions.
c "One-third of police officers are assigned to spend approximately 50% of their time at the top five hotspots in each beat" (Weisburd et al., 2017, p. 154), compared to nil control conditions.
d Same as (b) above but compared to random patrols in control conditions.
e Same as (c) above but compared to random patrols in control conditions.
f Anticipated effect of city-wide hotspot policing (based on agent-based modeling by Weisburd et al., 2017).
Our simulations allow us to make inferences pertaining to the impact of directing police patrols towards areas of high activity as identified by ambulance data in the Thames Valley region.As shown in Table 3, directing police attention towards hotspots will result in the targeting of roughly 50% of all injuries related to violence (Lee et al., 2017;Sherman et al., 1989;Weisburd, 2015; for other crime categories, see Ariel, Sutherland, et al., 2023).
Based on the evidence reported by Ariel et al. (2017) that approximately 30% of the unreported incidents of violence will be prevented, our simulations indicate that a total of 113-116 injuries could be avoided through police proactive initiatives. In a hypothetical scenario characterized by a more conservative approach, wherein law enforcement agencies concentrate their efforts on a mere 5% of the overall distribution of events (that is, a single hotspot), a concentrated endeavor aimed at prevention has the potential to avert a total of 19 injuries.

| Policy implications
Police-recorded data constitute the predominant method for analyzing crime, an approach limited by the willingness of members of the public to disclose their victimization (see Clark et al., 2022; McKee et al., 2022) or the willingness of the police to focus more intently on victims (Lay et al., 2023). Consequently, supplementary or complementary datasets are necessary (see discussion in Ariel, 2023; and Loewenstein et al., 2023). With the increased availability and utilization of health data, police forces can develop interagency data sharing in order to comprehend where and when violence occurs in their communities (Sutherland et al., 2021). Using ambulance data may improve the analytical approach of multi-agency violence reduction strategies by assisting police forces in identifying violent incidents that are not reported as crimes.
Both organizations would benefit from an understanding of the level of violence and the demand created by different areas and locales (e.g., Buil-Gil et al., 2022; Walsh & Smyth, 2022). However, the police are unlikely to obtain access to subject-level information from the ambulance services due to patient confidentiality concerns, so alternative means of sharing data ought to be explored. We demonstrate that data can be shared at certain spatial levels without de-anonymizing the records; ambulance services can share data on injuries at the hotspot level, continuously and in a timely manner, with the police, who in turn can try to "cool them down." For example, proactive police patrols could be conducted to deter offenders from assaulting victims at these locations, based on assaults previously reported to the ambulance services at these locations.
While no field experiments using this method exist, we can speculate on some of the benefits of sharing hotspot-level data between the two emergency services. Using the 100-m distance and 12-h time windows, we show that there is only a 23.3% overlap between ambulance and police data; therefore, using more than two-thirds of the data, new crime hotspot maps can be created. This strategy would ensure that police are regularly informed of patterns of violence, according to victimization data that are often unreported to the police but are collated by the ambulance services. Indeed, directing policing activity towards targeted hotspots reduces crime rates (see Ariel et al., 2020; Braga et al., 2019), and we can speculate that more violence can then be predicted and subsequently prevented using this supplementary data source. We call for future causal research to test this hypothesis more robustly.
In conclusion, this article demonstrates that hidden crimes can be uncovered by sharing information, such as ambulance calls and injuries, not at the level of the individual incident but rather at an aggregated or less granular spatial level in order to keep victims' identities anonymous. This study highlights the potential for health

FIGURE 2 Correlation between police-recorded violent crime and ambulance callouts to violent injury in the local policing areas (LPA) in the Thames Valley (1/07/16-30/06/17).
FIGURE 3 Correlation between police-recorded violent crime and ambulance callouts to violent injury at 100 × 100 m grid level in the Thames Valley (1/07/16-30/06/17).
Estimated effect of hotspot policing directed by ambulance hotspot maps on violence.