'Scared Straight' and other juvenile awareness programs for preventing juvenile delinquency

  • Review
  • Intervention


Abstract

Background

'Scared Straight' and other similar programs involve organized visits to prison by juvenile delinquents or children at risk for criminal behavior. Programs are designed to deter participants from future offending through firsthand observation of prison life and interaction with adult inmates. These programs remain in use despite research questioning their effectiveness. This is an update of a 2002 review.

Objectives

To assess the effects of programs comprising organized visits to prisons by juvenile delinquents (officially adjudicated, that is, convicted by a juvenile court) or pre-delinquents (children in trouble but not officially adjudicated as delinquents), aimed at deterring them from delinquency.

Search methods

To update this review, we searched 22 electronic databases, including CENTRAL, MEDLINE, PsycINFO, and Criminal Justice Abstracts, in December 2011. In addition, we searched clinical trials registries, consulted experts, conducted Google Scholar searches, and followed up on all relevant citations.

Selection criteria

We included studies that tested programs involving organized visits by delinquents or children at risk for delinquency to penal institutions such as prisons or reformatories. Studies with overlapping samples of juveniles and young adults (for example, ages 14 to 20 years) were included. We only considered studies that assigned participants to conditions randomly or quasi-randomly (that is, by odd/even assignment to conditions). Each study had to have a no-treatment control condition and at least one outcome measure of 'post-visit' criminal behavior.

Data collection and analysis

The search methods for the original review generated 487 citations, most of which had abstracts. The lead review author screened these citations, determining that 30 were evaluation reports. Two review authors independently examined these citations and agreed that 11 were potential randomized trials. All reports were obtained. Upon inspection of the full-text reports, two review authors independently agreed to exclude two studies, resulting in nine randomized trials. The lead review author extracted data from each of the nine study reports using a specially designed instrument. In cases in which outcome information was missing from the original reports, we made attempts via correspondence to retrieve the data for the analysis from the original investigators. Outcome data were independently checked by a second review author (CTP).

In this review, we report the results of each of the nine trials narratively. We conducted two meta-analyses of seven studies that provided postintervention offending rates using official data. Information from other sources (for example, self-report) was either missing from some studies or lacked critical details (for example, standard deviations). We examined the immediate post-treatment effects (that is, 'first-effects') by computing odds ratios (OR) from the proportions of each group reoffending, and used both fixed-effect and random-effects models in our analyses.
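
The computation described above can be sketched in Python as follows. This is a minimal illustration, not the review's actual analysis: the trial counts are hypothetical placeholders rather than data from the included studies, the function names are invented for this example, and the DerSimonian-Laird estimator of between-study variance is an assumption, since the text does not say which random-effects estimator was used.

```python
import math

def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Return (log OR, variance) for one trial's 2x2 table.

    Assumes no zero cells; a continuity correction would be needed otherwise.
    """
    a, b = events_t, n_t - events_t   # treatment: reoffended / did not
    c, d = events_c, n_c - events_c   # control: reoffended / did not
    log_or = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d   # Woolf's variance of the log OR
    return log_or, var

def dersimonian_laird_tau2(effects):
    """Between-study variance (assumed estimator: DerSimonian-Laird)."""
    w = [1 / v for _, v in effects]
    fixed = sum(wi * y for wi, (y, _) in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, (y, _) in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(effects) - 1)) / c)

def pool(effects, tau2=0.0):
    """Inverse-variance pooled log OR with 95% CI; tau2=0 is fixed-effect."""
    w = [1 / (v + tau2) for _, v in effects]
    est = sum(wi * y for wi, (y, _) in zip(w, effects)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return est, est - 1.96 * se, est + 1.96 * se

# HYPOTHETICAL counts: (reoffenders, n) in program group, then control group
trials = [(20, 50, 12, 50), (15, 40, 10, 42), (30, 80, 22, 78)]
effects = [log_odds_ratio(*t) for t in trials]

fixed = pool(effects)
random_ = pool(effects, dersimonian_laird_tau2(effects))
for label, (est, lo, hi) in (("fixed-effect", fixed), ("random-effects", random_)):
    print(f"{label}: OR {math.exp(est):.2f}, "
          f"95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f}")
```

An OR above 1 here means higher odds of reoffending in the program group than in the control group, which is the direction of a harmful effect.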

Main results

We have included nine studies in this review. All were part of the original systematic review; no new trials meeting eligibility criteria were identified through our updated searches. The studies were conducted in eight different states of the USA, during the years 1967 to 1992. Nearly 1000 (946) juveniles or young adults of different races participated, almost all males. The average age of the participants in each study ranged from 15 to 17 years.

Meta-analyses of seven studies show the intervention to be more harmful than doing nothing. The fixed-effect OR for the first post-treatment effect on officially measured criminal behavior indicated a negative program effect (OR 1.68, 95% confidence interval (CI) 1.20 to 2.36), and the result was nearly identical under the random-effects model (OR 1.72, 95% CI 1.13 to 2.62). Sensitivity analyses (random-effects) showed that the findings were robust when removing one study with an inadequate randomization strategy (OR 1.47, 95% CI 1.03 to 2.11), one study with high attrition (OR 1.96, 95% CI 1.25 to 3.08), or both (OR 1.68, 95% CI 1.10 to 2.58).
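
As a rough reading aid, the reported intervals can be converted back into approximate standard errors and z statistics on the log-odds scale. The sketch below assumes the intervals are symmetric on the log scale with a critical value of 1.96; because the published figures are rounded, the results are approximations only, and the helper name is hypothetical.

```python
import math

def z_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Recover SE(log OR) from a reported 95% CI, then the z statistic."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(odds_ratio) / se, se

# Pooled estimates as reported in the text (rounded values)
reported = {
    "fixed-effect": (1.68, 1.20, 2.36),
    "random-effects": (1.72, 1.13, 2.62),
}

for label, (or_, lo, hi) in reported.items():
    z, se = z_from_ci(or_, lo, hi)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    print(f"{label}: SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Both intervals exclude 1, so under either model the pooled estimate points toward higher odds of offending in the program group.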

Authors' conclusions

We conclude that programs such as 'Scared Straight' increase delinquency relative to doing nothing at all to similar youths. Given these results, we cannot recommend this program as a crime prevention strategy. Agencies that permit such programs, therefore, must rigorously evaluate them, to ensure that they do not cause more harm than good to the very citizens they pledge to protect.

Résumé scientifique

Programme « Dissuasion par la peur » et autres programmes de sensibilisation destinés aux jeunes dans la prévention de la délinquance juvénile

Contexte

Le programme « Dissuasion par la peur » et d'autres programmes similaires consistent à inviter des délinquants juvéniles ou des enfants susceptibles de sombrer dans la délinquance à visiter des établissements pénitentiaires. Ces programmes visent à décourager toute récidive des participants en leur permettant d'avoir une première expérience de la vie en prison et d'interagir avec des détenus adultes. Ces programmes sont toujours d'actualité, malgré des recherches mettant en cause leur efficacité. Ceci est une mise à jour d'une revue publiée en 2002.

Objectifs

Évaluer les effets de programmes consistant à inviter des délinquants juvéniles (officiellement jugés, c'est-à-dire condamnés par un tribunal pour enfants) ou des prédélinquants (des enfants en difficulté, mais qui n'ont pas été officiellement jugés comme des délinquants) à visiter des établissements pénitentiaires afin de les dissuader de toute récidive.

Stratégie de recherche documentaire

En décembre 2011, nous avons mis à jour cette revue en effectuant des recherches dans 22 bases de données électroniques, notamment CENTRAL, MEDLINE, PsycINFO et Criminal Justice Abstracts. Nous avons également effectué des recherches dans des registres d'essais cliniques, consultés des experts, réalisés des recherches dans Google Scholar, ainsi qu'un suivi de toutes les références pertinentes.

Critères de sélection

Nous avons inclus des études examinant des programmes dans lesquels des délinquants ou des enfants exposés à la délinquance étaient invités à visiter des institutions pénitentiaires ou de réinsertion. Les études composées d'échantillons regroupant des adolescents et des jeunes adultes (par exemple : âgés de 14 à 20 ans) étaient incluses. Nous avons uniquement pris en compte celles ayant attribué ces conditions de manière aléatoire ou quasi aléatoire (c'est-à-dire, une attribution alternée des conditions) aux participants. Chaque étude devait disposer d'une condition de contrôle sans traitement et d'au moins une mesure de résultat du comportement criminel « post-visite ».

Recueil et analyse des données

Les méthodes de recherche de la revue d'origine ont généré 487 références, dont la majorité contenait des résumés. L'auteur principal de cette revue a analysé ces références dont 30 étaient des rapports d'évaluation. Deux auteurs de la revue ont indépendamment examiné ces références et convenu que 11 étaient des essais randomisés potentiels. Tous les rapports ont été obtenus. En inspectant les textes intégraux de ces rapports, deux auteurs de la revue ont indépendamment convenu d'exclure deux études. Par conséquent, neuf essais randomisés ont été pris en compte. L'auteur principal de la revue a extrait des données de chacun de ces neuf rapports d'étude à l'aide d'un instrument spécialement conçu à cet effet. Dans les cas où des informations sur des résultats venaient à manquer dans les rapports d'origine, nous avons essayé, par correspondance, d'obtenir des données d'analyse auprès des chercheurs d'origine. Le deuxième auteur de la revue (CTP) a indépendamment vérifié les données de résultats.

Dans cette revue, nous avons rapporté de manière narrative les résultats de chacun des neuf essais. Nous avons réalisé deux méta-analyses de sept études, indiquant des taux de récidive suite à l'intervention, à l'aide des données officielles. Des informations provenant d'autres sources, (par exemple : auto-signalées) ne figuraient pas dans certaines études ou des informations importantes étaient omises (par exemple : des écarts-types). Nous avons examiné les effets immédiats suite au traitement (c'est-à-dire, les « effets premiers »), en calculant les odds ratios (OR) des données sur les proportions de chaque groupe récidiviste, et supposé l'utilisation de modèles à effets fixes et aléatoires dans nos analyses.

Résultats principaux

Nous avons inclus neuf études dans cette revue. Toutes faisaient partie de la revue systématique d'origine ; aucun nouvel essai répondant aux critères d'éligibilité n'a été identifié dans nos recherches mises à jour. Ces études ont été réalisées dans huit États différents des États-Unis, entre 1967 et 1992. Presque 1 000 (946) adolescents ou jeunes adultes d'origines ethniques différentes ont participé, la plupart étant des garçons. L'âge moyen des participants de chaque étude variait de 15 à 17 ans.

Des méta-analyses composées de sept études montrent que l'intervention est plus dangereuse qu'inefficace. L'OR (à effets fixes) des effets concernant l'effet premier à la fin du traitement sur un comportement criminel officiellement mesuré indiquait un effet négatif du programme (OR 1,68, intervalle de confiance (IC) à 95 % 1,20 à 2,36) et quasi identique quelle que soit la stratégie méta-analytique utilisée (OR à effets aléatoires 1,72, IC à 95 % 1,13 à 2,62). Des analyses de sensibilité (à effets aléatoires) montraient que ces résultats étaient fiables, même en supprimant une étude utilisant une stratégie de randomisation inadaptée (OR 1,47, IC à 95 % 1,03 à 2,11) ou en supprimant une étude avec une attrition élevée (OR 1,96, IC à 95 % 1,25 à 3,08) ou les deux (OR 1,68, IC à 95 % 1,10 à 2,58).

Conclusions des auteurs

Nous concluons que les programmes, comme celui de « Dissuasion par la peur », augmentent la délinquance par rapport à l'absence d'intervention auprès de jeunes ayant un profil similaire. Au vu de ces résultats, nous ne pouvons recommander ce programme comme stratégie de prévention de la délinquance. Les agences autorisant ces programmes doivent donc rigoureusement les évaluer afin de s'assurer qu'ils ne sont pas plus dangereux qu'efficaces pour les citoyens mêmes qu'elles s'engagent à protéger.

Plain language summary

'Scared straight' and other juvenile awareness programs for preventing juvenile delinquency

Programs such as 'Scared Straight' involve organized visits to prison facilities by juvenile delinquents or children at risk for becoming delinquent. The programs are designed to deter participants from future offending by providing firsthand observations of prison life and interaction with adult inmates. This review, which is an update of one published in 2002, includes nine studies that involved 946 teenagers, almost all males. The studies were conducted in different parts of the USA and involved young people of different races whose average age ranged from 15 to 17 years. Results indicate that not only do these programs fail to deter crime, but they actually lead to more offending behavior. The intervention raises the odds of offending by a ratio of between 1.6 to 1 and 1.7 to 1. Government officials permitting this program need to adopt rigorous evaluation efforts to ensure that they are not causing more harm to the very citizens they pledge to protect.

Résumé simplifié

Programme « Dissuasion par la peur » et autres programmes de sensibilisation destinés aux jeunes dans la prévention de la délinquance juvénile

Les programmes, comme « Dissuasion par la peur », consistent à inviter des délinquants juvéniles ou des enfants susceptibles de sombrer dans la délinquance à visiter des établissements pénitentiaires. Ces programmes visent à décourager toute récidive des participants en leur permettant d'avoir une première expérience de la vie en prison et d'interagir avec des détenus adultes. La présente revue, qui est la mise à jour d'une revue publiée en 2002, se compose de neuf études impliquant 946 adolescents, en majorité des garçons. Ces études ont été réalisées dans différentes régions des États-Unis et impliquaient des adolescents de différentes origines ethniques et dont l'âge moyen variait de 15 à 17 ans. Les résultats obtenus indiquaient que non seulement ces programmes n'avaient aucun effet dissuasif, mais aussi qu'ils aggravaient la criminalité. Ces interventions augmentent les risques de récidives de 1,6 à 1 et de 1,7 à 1. Par conséquent, les responsables gouvernementaux à l'initiative de ces programmes doivent rigoureusement les évaluer pour ne pas compromettre davantage la sécurité des citoyens mêmes qu'ils se sont engagés à protéger.

Notes de traduction

Traduit par: French Cochrane Centre 17th May, 2013
Traduction financée par: Pour la France : Ministère de la Santé. Pour le Canada : Instituts de recherche en santé du Canada, ministère de la Santé du Québec, Fonds de recherche de Québec-Santé et Institut national d'excellence en santé et en services sociaux.

Резюме на простом языке

"Scared Straight” – “воспитание испугом" и другие подростковые просветительские программы для предотвращения преступности среди несовершеннолетних

Такие программы, как "Scared Straight” – “воспитание испугом" подразумевают визиты несовершеннолетних подростков и детей, имеющих риск стать правонарушителями (преступниками), в места лишения свободы (тюрьмы). Программы предназначены для удерживания участников от будущих правонарушений путем непосредственного наблюдения за тюремной жизнью и взаимодействия со взрослыми заключенными. Этот обзор является обновлением опубликованного ранее в 2002 году, включает девять исследований, вовлекших 946 подростков, практически все из которых были лицами мужского пола. Исследования проводились в разных частях США с вовлечением молодых людей различных рас, средний возраст которых колебался от 15 до 17 лет. Результаты демонстрируют, что эти программы не только не в состоянии сдерживать преступность, но они также могут привести к росту преступного поведения. Это вмешательство увеличивает шансы правонарушений в пределах между 1,6 - 1 и 1,7 - 1. Правительственным чиновникам, одобряющим подобные программы, необходимо прилагать усилия по более строгой оценке, чтобы гарантировать, что подобные программы не станут причиной вреда гражданам, которых они обязаны защищать.

Заметки по переводу

Перевод: Гамирова Римма Габдульбаровна. Редактирование: Зиганшина Лилия Евгеньевна. Координация проекта по переводу на русский язык: Казанский федеральный университет. По вопросам, связанным с этим переводом, пожалуйста, свяжитесь с нами по адресу: lezign@gmail.com

Background

Description of the condition

Juvenile delinquency, also known as juvenile offending or youth crime, is illegal behavior committed by someone before becoming an adult. The second United Nations Congress on the Prevention of Crime and Treatment of the Offender recommended that the meaning of the term juvenile delinquency should be restricted as far as possible to violations of the criminal law (Kvaraceus 1964). Juveniles are considered to be those persons who have yet to reach age 18 years. Although laws vary across nations, juvenile delinquents would therefore be those who have been found guilty (adjudicated) of committing a law violation before they are 18 years of age. A significant percentage of violent and nonviolent offenses are committed by juveniles. For example, in the USA, 15% of all persons arrested by the police for illegal behavior in 2008 were juveniles (US Census 2012). Besides the problem of youth crime itself, offending as a juvenile is a risk factor for later involvement with the criminal justice system as an adult (McCord 2001). Thus, governments everywhere are looking for effective interventions to address juvenile delinquency. 'Scared Straight' and similar programs have been used in various places in the world, and offer a low-cost and easy-to-implement strategy to prevent juvenile delinquency.

Description of the intervention

The basic component of programs such as Scared Straight is organized visits to prison facilities by juvenile delinquents or children at risk for becoming delinquent. Nearly all of these interventions have the juveniles interact with inmates confined in the facility. The most famous of these, 'Scared Straight' in New Jersey (USA), included confrontational 'rap' sessions in which adult inmates shared graphic stories about prison life with the juveniles. Other programs have included less confrontational and more educational sessions, in which inmates shared their life stories and described the choices they made that ultimately led to imprisonment. In the Texas Face-to-Face program, juveniles spent one day living as an adult prisoner and the intervention also included a counseling component.

The most well-known version of the Scared Straight-type programs was initiated in the 1970s, when inmates serving life sentences at a New Jersey prison began a program to 'scare' or deter at-risk or delinquent children from a future life of crime. It featured as its main component an aggressive presentation by inmates to juveniles visiting the prison facility. The presentation depicted life in adult prisons, and often included exaggerated stories of rape and murder (Finckenauer 1982). A television documentary on the program, aired in 1979, reported that 16 of the 17 delinquents remained law-abiding for three months after attending Scared Straight, and claimed a 94% success rate (Finckenauer 1982). Other data provided in the film indicated success rates that varied between 80% and 90% (Finckenauer 1982). The program received considerable and favorable media attention and was soon replicated in over 30 states nationwide, resulting in special Congressional hearings on the program and the film by the US House Subcommittee on Human Resources (US HCEL 1979).

Scared Straight and other 'kids visit prison' programs are also used in other nations. Examples include the 'day in prison' or 'day in gaol' in Australia (O'Malley 1993), 'day visits' in the UK (Lloyd 1995) and the 'Ullersmo Project' in Norway (Storvoll 1998). Hall 1999 reports positively on a program in Germany designed to deter young offenders with ties to Neo-Nazi and other organized hate groups. Scared Straight has also been tried in Canada (O'Malley 1993). In 1999, 'Scared Straight: 20 Years Later' (UPN 1999; 'Kids and Crooks') was shown on US television and claimed similar results to the 1979 film. This later film reports that 10 of the 12 juveniles attending the program remained offense-free during the three-month follow-up (Muhammed 1999). As in the 1979 television program, no data on a control or comparison group of young people were presented. Positive reports and descriptions of Scared Straight-type programs have also come from Germany (Hall 1999) and Florida (USA) (Rasmussen 1996). Sometimes the program is embedded as one component in a multicomponent juvenile intervention package (Trusty 1995; Rasmussen 1996).

How the intervention might work

The underlying theory of programs such as Scared Straight is deterrence. Program advocates and others believe that realistic depictions of life in prison and presentations by inmates will deter juvenile offenders or children at risk for becoming delinquent from further involvement with crime. Although the harsh and sometimes vulgar presentation in the earlier New Jersey version is the most well known, inmate presentations are now sometimes designed to be more educational than confrontational but with a similar crime prevention goal (Lundman 1993; Finckenauer 1999). Some of these programs feature discussions in which the adult inmates confront and challenge the juveniles about their behavior, also referred to as 'rap sessions'. Programs featuring inmates as speakers who describe their life experiences and the current reality of prison life have a rather long history, in the USA at least (Michigan D.O.C. 1967; Brodsky 1970).

Why it is important to do this review

In 1982, a randomized controlled trial testing the New Jersey program was published, reporting no effect on the criminal behavior of participants in comparison with a no-treatment control group (Finckenauer 1982). In fact, Finckenauer reported that participants in the experimental program were more likely to be arrested. Other randomized trials reported in the USA also questioned the effectiveness of Scared Straight-type programs in reducing subsequent criminality (GERP&DC 1979; Lewis 1983).

Despite the convergence of evidence from these studies, Scared Straight-type programs remained popular and continued to be used in the USA through the 1990s (Finckenauer 1999). For example, a program in Carson City, Nevada (USA) took juvenile delinquents on a tour of an adult Nevada State Prison (Scripps 1999). One youngster claimed that the part of the tour that made the most impact on him was "all the inmates calling us for sex and fighting for our belongings" (Scripps 1999). The United Community Action Network has its own program called 'Wisetalk', in which at-risk youth are locked in a jail cell for over one hour with four or five parolees. The network claims that only 10 of 300 youngsters exposed to this intervention were re-arrested (U-CAN 2001). In 2001, a group of guards, apparently without the knowledge of administrators, strip-searched Washington DC students during their tours of a local jail under the guise that they were using "a sound strategy to turn around the lives of wayward kids", citing the prior success of Scared Straight (Blum 2001). It is not surprising that such programs are popular: they fit with some commonly held notions about how to prevent or reduce crime (by 'getting tough'); they are very inexpensive (a Maryland program was estimated to cost less than USD 1 per participant); and they provide one way for incarcerated offenders to contribute productively to society by preventing youngsters from following the same path (Finckenauer 1982).

In 2000, Petrosino and his colleagues reported on a preliminary systematic review of nine randomized field trials, drawing on the raw percentage differences in each study (Petrosino 2000). They found that programs such as Scared Straight generally increased crime between 1% and 28% in the experimental group when compared to a no-treatment control group. In 2002, our formal Cochrane review was published (Petrosino 2002) (simultaneously as a pilot Campbell Collaboration review), which updated the 2000 work and used more sophisticated meta-analytic techniques. We reported similarly negative findings for Scared Straight and juvenile-awareness programs.
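
The difference between a raw percentage difference and an odds ratio can be shown with a short Python sketch. The counts below are hypothetical and chosen only for illustration; they are not taken from any of the nine trials, and the function names are invented for this example.

```python
def raw_difference(events_t, n_t, events_c, n_c):
    """Percentage-point difference in reoffending (program minus control)."""
    return 100 * (events_t / n_t - events_c / n_c)

def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds of reoffending in the program group relative to control."""
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return odds_t / odds_c

# HYPOTHETICAL trial: 41 of 100 program youths and 30 of 100 controls reoffend
diff = raw_difference(41, 100, 30, 100)
ratio = odds_ratio(41, 100, 30, 100)
print(f"raw difference: {diff:.1f} percentage points, odds ratio: {ratio:.2f}")
```

A raw difference ignores sample size and cannot be weighted across studies in a principled way, whereas odds ratios with their variances can be pooled, which is one reason a formal meta-analysis is more informative than a tally of percentage differences.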

Still, Scared Straight-type programs continue. In 2003, the then-Governor of Illinois, Rod Blagojevich, signed a bill into law that mandated that the Chicago Public School system set up a program called 'Choices' (Swanson 2003). The program would identify students at risk for committing future crime and give them 'tours of state prison' to discourage any future criminal conduct (Swanson 2003). More recently, the Arts and Entertainment (A&E) station has been running a weekly series entitled 'Beyond Scared Straight'. Created by the producer of the original Scared Straight program (Arnold Shapiro), the series is now the highest rated in A&E's history (Denhart 2011). The success of the television show has renewed interest in Scared Straight and similar programs as a crime prevention strategy (for example, Denhart 2011), but has also drawn criticism that it ignores a long history of scientific evidence (for example, Robinson 2011).

The question about whether Scared Straight and similar programs have a crime deterrent effect is best answered by continued examination of the existing scientific evidence. The current review updates the version published in 2002 and includes new and extended searches to December 2011, as well as additional analyses.

Objectives

To assess the effects of programs comprising organized visits to prisons by juvenile delinquents (officially adjudicated, that is, convicted by a juvenile court) or predelinquents (children in trouble but not officially adjudicated as delinquents), aimed at deterring them from criminal activity.

Methods

Criteria for considering studies for this review

Types of studies

Only studies that used randomization or quasi-random procedures (that is, alternate assignment such as all odd numbered cases to treatment and even numbered cases to control) to assign participants, with or without blinding, were included, provided they had a no-treatment control group.
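
The distinction between the two eligible assignment procedures can be sketched in Python. This is a hypothetical illustration of the odd/even rule described above, set beside simple randomization; it does not reproduce any included study's procedure, and the function names and seed are assumptions made for the example.

```python
import random

def alternate_assignment(case_numbers):
    """Quasi-random: odd-numbered cases to treatment, even to control."""
    return {n: ("treatment" if n % 2 == 1 else "control") for n in case_numbers}

def simple_random_assignment(case_numbers, seed=0):
    """Random: each case assigned by an independent fair coin flip."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return {n: rng.choice(["treatment", "control"]) for n in case_numbers}

cases = range(1, 9)
print(alternate_assignment(cases))
print(simple_random_assignment(cases))
```

Alternation is fully predictable from the case number, so anyone who knows the next number knows the next allocation. That predictability is why such designs are treated as quasi-random and receive closer scrutiny for sequence generation and allocation concealment.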

Types of participants

Only studies involving juveniles, that is, children 17 years of age or younger, were included. Participants were delinquents or predelinquents. Studies that contained overlapping samples of juveniles and young adults (for example, ages 13 to 21 years) were also included.

Types of interventions

Only studies that featured as their main component a visit by program participants to a prison facility were included. Programs may include a presentation by the inmates, ranging from graphic (Finckenauer 1982) to educational (Cook 1992). Additionally, programs may feature an orientation session (for example, living as a prisoner for eight hours) or a tour of the facility.

Types of outcome measures

Primary outcomes

The interest of citizens, policy and practice decision-makers, media, and the research community is in whether Scared Straight and its variations have any crime deterrent effect; therefore, crime measures are our primary outcomes. Studies had to report at least one outcome of subsequent offending behavior, as measured by such indices as arrests, convictions, contacts with police or self-reported offenses.

Secondary outcomes

We had no secondary outcomes in our analysis, although 'non-crime' measures (for example, attitudinal, educational) reported by the primary investigators are included in Table 1 to enable review authors in the Cochrane and Campbell Collaborations to identify potentially eligible studies for their systematic reviews.

Table 1. Crime outcome data reported in original studies

| Study reference | At 3 months | At 6 months | At 9 months | At 12 months | Beyond 12 months |
| --- | --- | --- | --- | --- | --- |
| Michigan D.O.C. 1967 | - | Percentage with new offense or new violation of probation | - | - | - |
| GERP&DC 1979 | - | Percentage subsequently contacted by police | - | - | - |
| Yarborough 1979 | Percentage with new offenses, type of offenses, percentage with new petitions, average offense rate and standard deviations, average weeks to new offense and standard deviations, number of days in detention and standard deviations | Percentage with new offenses, type of offenses, percentage with new petitions, average offense rate and standard deviations, average weeks to new offense and standard deviations, average days in detention and standard deviations | - | - | - |
| Orchowsky 1981 | - | Percentage with new intakes, average intakes (no standard deviations but test statistic), average severity score (no standard deviations but test statistic) | Percentage with new intakes, average intakes (no standard deviations but test statistic), average severity score (no standard deviations but test statistic) | Percentage with new intakes, average intakes (no standard deviations but test statistic), average severity score (no standard deviations but test statistic) | - |
| Vreeland 1981 | - | Percentage with new offenses (official measures), percentage with new offenses (self-reported data) | - | - | - |
| Finckenauer 1982 | - | Percentage new complaints, contacts or court appearances, average severity score (no standard deviation, but test statistic) | - | - | - |
| Lewis 1983 | - | - | - | Percentage arrested, percentage charged, average arrests (no standard deviation), average charges (no standard deviation), average time to first arrest (no standard deviation) | - |
| Locke 1986 | - | Only test statistic reported | - | - | - |
| Cook 1992 | - | - | - | Average offenses (no standard deviations), average severity score (no standard deviations) | Average offenses (no standard deviations), average severity score (no standard deviations) |

Search methods for identification of studies

To minimize publication bias, we used a search strategy designed to identify both published and unpublished studies. We also made the search comprehensive to minimize discipline bias, that is, the possibility that evaluations reported in criminological journals or indexed in field-specific abstracting databases might differ from those reported in psychological, sociological, social service, public health or educational sources. The search methods for the original review are described in detail in Appendix 1.

In December 2011 we searched 11 of the 16 previously searched databases, and expanded our searches to include an additional nine bibliographic sources. We searched all available years of the additional sources, and limited the search of the databases used previously to 2001 onwards. The five databases not searched for this update included one that was no longer accessible (C2-Spectr), and four that produced zero yield in the previous searches (Current Contents, GPO Monthly, National Clearinghouse of Child Abuse and Neglect (NCCAN) abstracts, and Political Science Abstracts). In November 2012 we also searched two trials registers. The 22 databases searched during the update were:

  • Cochrane Central Register of Controlled Trials (CENTRAL), 2011(4), searched December 2011

  • Academic Search Premier, all available dates to December 2011

  • Ovid MEDLINE, 2001 to December 2011

  • ClinicalTrials.gov, all available dates, searched November 2012

  • Criminal Justice Abstracts, 2001 to December 2011

  • Directory of Open Access Journals, all available dates to December 2011

  • Dissertations and Theses (ProQuest), which covers Dissertation Abstracts, 2001 to December 2011

  • Education FullText, 2001 to December 2011

  • ERIC (ProQuest), 2001 to December 2011

  • Google Scholar, all available dates, searched December 2011

  • HeinOnline, all dates to December 2011

  • Illinois Researcher Information Service (IRIS), all dates to December 2011

  • International Bibliography of the Social Sciences, 2001 to December 2011

  • National Criminal Justice Reference Service Abstracts Database (NCJRS), 2001 to December 2011

  • Public Affairs Information Service (PAIS), 2001 to December 2011

  • PsycArticles, all dates to December 2011

  • PsycINFO, 2001 to December 2011

  • SCOPUS Science Direct, all dates to December 2011

  • Scandinavian Research Council for Criminology, all dates to December 2011

  • Sociofile, including Sociological Abstracts and Social Planning and Development Abstracts, 2001 to December 2011

  • SSCI (Web of Science), which includes the Social Science Citation Index (SSCI), 2001 to December 2011

  • World Health Organization International Clinical Trials Registry Platform (ICTRP), searched November 2012

Our keywords were similar to those used in the previous two searches. A list of search terms is provided in Appendix 1.

We also contacted an informal list of researchers in the field, and examined citations in relevant literature, including previous systematic and narrative reviews. We did not limit our results to English language journals and retrieved some abstracts in Spanish, although none described empirical studies. One limitation is that our search terms were entered in English. Our next update will include a wider range of terms, translated into Spanish and French.

Data collection and analysis

Selection of studies

AP screened citations generated for the original review. AP and CTP independently examined these citations. Full reports were obtained for 11 potential randomized trials. Both review authors agreed that two of these should be excluded. Arbitration was not required as the two review authors agreed. For this update, two review authors (MHP and JL) scanned each citation and determined that there were no trials suitable for inclusion in this review. Details of six new 'excluded studies' with reasons for exclusion are provided in Excluded studies.

Data extraction and management

AP extracted data from each of the nine main study reports using a specially designed instrument adapted from his earlier study (Petrosino 1997); the items included are listed in the 'Characteristics of included studies'. Where outcome information was missing from the original reports, we made attempts via email and regular mail correspondence to retrieve the data for the analysis from the original investigators. Investigators were helpful but unable to locate additional data. In two cases we retrieved unpublished Master's theses from university libraries to see if they contained this information (Locke 1984; Cook 1990). They did not. Another review author (CTP) double-checked all extracted data on outcomes to ensure they were correct.

Assessment of risk of bias in included studies

For each study, we assessed methodological quality using the Cochrane 'Risk of bias' tool. The study reports generally lacked explicit details about randomization and concealment, and the 'Risk of bias' ratings reflect the uncertainty stemming from this lack of description. The Cochrane 'Risk of bias' tool asks review authors to rate each of the following areas of risk:

  1. random sequence generation;

  2. allocation concealment;

  3. blinding of participants and personnel;

  4. blinding of outcome assessment;

  5. incomplete outcome data (attrition);

  6. selective reporting;

  7. other sources of bias. Here we rated whether the implementation of the program rendered a fair test. This is a very low cost and easy to implement program, and no reports included details of program implementation problems.

Measures of treatment effect

Studies had to include at least one outcome of subsequent offending behavior, as measured by such indices as arrests, convictions, contacts with police or self-reported offences. The interest of citizens, policy and practice decision-makers, media and the research community is in whether Scared Straight and other 'kids visit prison' programs have any effect on these measures. Although we do not analyze them, we list other 'noncrime measures' and their effects (for example, attitudinal, educational) reported by evaluators in case subsequent review authors in the Cochrane or Campbell Collaborations require them.

Unit of analysis issues

All of the included studies involved randomization of individuals to conditions. No cluster-randomized trials were located. Most studies involved a single treatment and a single control group; in one instance in which multiple groups were involved (Vreeland 1981), we only included data from the strongest contrast (the most intensive treatment versus control).

Dealing with missing data

As mentioned earlier, we made unsuccessful attempts to acquire missing outcome data for two studies (Locke 1986; Cook 1992). Due to the lack of subsequent follow-up intervals for outcome measurement in the included studies, we focused exclusively on first treatment effects. This likely limited missing outcome data problems as only one study experienced postrandomization attrition (Yarborough 1979). We examined the impact of excluding this study in a sensitivity analysis, discussed below.

Assessment of heterogeneity

The included studies represent some variation in geographic locations, specific types of interventions implemented, and juvenile treatment populations. Thus, heterogeneity should be examined, although the small number of included studies makes interpretation risky. The Chi² and I² statistics for heterogeneity are reported.
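As an illustration of the two heterogeneity statistics named above, the following is a minimal sketch of how Cochran's Q (the Chi² statistic for heterogeneity) and I² can be computed from study-level effects and their variances. The function name `heterogeneity` and the input values are hypothetical and are not taken from the included studies; the review itself obtained these statistics from Review Manager.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I^2 (as a percentage).

    effects   -- study-level effects on an additive scale (e.g. log odds ratios)
    variances -- the corresponding within-study variances
    """
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of total variation beyond chance, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log odds ratios and variances, for illustration only.
q, df, i2 = heterogeneity([0.10, 0.50, 0.90], [0.04, 0.09, 0.16])
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.1f}%")
```

With only a handful of studies, Q has low power and I² is imprecise, which is consistent with the caution about interpretation expressed above.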

Assessment of reporting biases

Seven studies were included in the meta-analyses, and just two were published in academic peer-reviewed publications (Finckenauer 1982; Lewis 1983). Therefore, we do not believe publication bias is a threat to the results. In the future, if additional studies are located, we will include Egger's regression test for funnel plot asymmetry (Egger 1997).

Data synthesis

Using Review Manager software (RevMan 2011), we expressed dichotomous outcome measures of crime as odds ratios (OR). We reported the 95% confidence intervals (CI). Both fixed-effect and random-effects models were assumed across the randomized trials and compared to assess the impact of statistical heterogeneity, and both were reported. We examined OR at first follow-up interval, that is, first post-treatment effect.
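The calculations described in this paragraph can be sketched as follows. This is a minimal illustration, assuming inverse-variance weighting for the fixed-effect model and the DerSimonian-Laird estimator for the random-effects model; Review Manager may apply a different pooling method (for example, Mantel-Haenszel weighting), so the sketch is not a reproduction of the review's analysis. The function names and the 2x2 counts are hypothetical.

```python
import math


def odds_ratio_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Odds ratio with a 95% CI (logit method) from a 2x2 table."""
    a, b = events_t, n_t - events_t  # treatment: events, non-events
    c, d = events_c, n_c - events_c  # control: events, non-events
    if 0 in (a, b, c, d):  # continuity correction when a cell is empty
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def pool_log_or(log_ors, variances, random_effects=False):
    """Inverse-variance pooled log OR and its standard error.

    With random_effects=True, the between-study variance (tau^2) is
    estimated by DerSimonian-Laird and added to each study's variance.
    """
    weights = [1.0 / v for v in variances]
    if random_effects and len(log_ors) > 1:
        fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
        q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
        c = sum(weights) - sum(w * w for w in weights) / sum(weights)
        tau2 = max(0.0, (q - (len(log_ors) - 1)) / c)
        weights = [1.0 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))


# Hypothetical counts (events_t, n_t, events_c, n_c), for illustration only.
tables = [(12, 40, 8, 40), (20, 60, 15, 60), (9, 30, 10, 30)]
log_ors, variances = [], []
for et, nt, ec, nc in tables:
    a, b, c, d = et, nt - et, ec, nc - ec
    log_ors.append(math.log((a * d) / (b * c)))
    variances.append(1 / a + 1 / b + 1 / c + 1 / d)

for label, flag in (("fixed-effect", False), ("random-effects", True)):
    est, se = pool_log_or(log_ors, variances, random_effects=flag)
    low, high = math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)
    print(f"{label}: OR = {math.exp(est):.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because the outcome is offending, an OR above 1 in this layout would indicate more offending in the intervention group than in the control group. When the two models give similar estimates, statistical heterogeneity has little influence on the pooled result.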

Subgroup analysis and investigation of heterogeneity

No subgroup analyses were determined a priori at the protocol stage. We did not change our plans, given that only seven studies included outcome data for analysis. Thus, we did not explore heterogeneity by conducting analyses of subgroups or moderators.

Sensitivity analysis

We conducted two sensitivity analyses that examined the impact on the results of excluding studies with significant methodological issues. The first analysis involved dropping a study that experienced randomization problems (Finckenauer 1982). The second sensitivity analysis involved dropping a study that involved substantial postrandomization attrition (Yarborough 1979).
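A sensitivity analysis of this kind amounts to re-running the pooled analysis with one study removed and comparing the estimates. The sketch below assumes inverse-variance fixed-effect pooling; the helper names `pooled_or` and `drop_study`, the study labels, and all values are hypothetical and do not represent the data analyzed in this review.

```python
import math


def pooled_or(studies):
    """Inverse-variance fixed-effect pooled OR.

    studies -- dict mapping a study label to (log odds ratio, variance)
    """
    weights = {name: 1.0 / var for name, (_, var) in studies.items()}
    total = sum(weights.values())
    log_or = sum(weights[name] * studies[name][0] for name in studies) / total
    return math.exp(log_or)


def drop_study(studies, name):
    """Return a copy of the study set without the named study."""
    return {key: value for key, value in studies.items() if key != name}


# Hypothetical study labels, log odds ratios and variances.
studies = {
    "Study A": (0.90, 0.20),
    "Study B": (0.30, 0.10),
    "Study C": (0.45, 0.15),
}

print(f"All studies: OR = {pooled_or(studies):.2f}")
for name in studies:
    reduced = drop_study(studies, name)
    print(f"Without {name}: OR = {pooled_or(reduced):.2f}")
```

If the pooled estimate keeps the same direction and a similar magnitude after a study is dropped, the overall result does not depend on that study.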

Results

Description of studies

Whether relying on the actual data reported or measures of statistical significance, the nine trials do not yield evidence for the effectiveness of 'Scared Straight' and other juvenile awareness programs on subsequent delinquency. 

Michigan Department of Corrections (1967)

In an internal, unpublished government document, the Michigan Department of Corrections reported a trial testing a program that involved taking adjudicated juvenile boys on a tour of a state reformatory (Michigan D.O.C. 1967). Unfortunately, the report is remarkably brief. Sixty juvenile delinquent boys were randomly assigned to attend two tours of a state reformatory or to a no-treatment control group. Tours included 15 juveniles at a time. No other part of the program is described. Recidivism was measured as a petition in juvenile court for either a new offense or a violation of existing probation order. The Michigan Department of Corrections found that 43% of the experimental group reoffended, compared to only 17% of the control group. This large negative result curiously receives little attention in the original document. 

The Greater Egypt Planning and Development Commission, Illinois, USA (1979)

This program at the Menard Correctional Facility started in 1978 and is described as a frank and realistic portrayal of adult prison life. The researchers randomly assigned 161 youths aged 13 to 18 years to attend the program or a no-treatment control. The participants were a mix of delinquents and children at risk of becoming delinquent. Participants were compared on their subsequent contact with police and on two personality inventories (Piers-Harris and Jesness); the researchers also used surveys of parents, teachers, inmates and young people. The outcomes are also negative in direction but not statistically significant, with 17% of the experimental participants being recontacted by police in contrast to 12% of the controls (GERP&DC 1979). The authors concluded that, "Based on all available findings one would be ill advised to recommend continuation or expansion of the juvenile prison tours. All empirical findings indicate little positive outcome, indeed, they may actually indicate negative effects" (p. 19). Researchers report no effect for the program on two attitude tests (Jesness Inventory, Piers Harris Self-Concept Scale). In contrast, interview and mail surveys of participants and their parents and teachers indicated unanimous support for the program (p. 12). Researchers also note how positive and enthusiastic inmates were about their efforts.

Michigan JOLT Study, USA (Yarborough 1979)

In the Juvenile Offenders Learn Truth (JOLT) program, juvenile delinquents in contact with one of four Michigan county courts participated. Each juvenile spent five total hours in the facility. Half of this time was spent in a confrontational 'rap' session. This followed a tour of the facility, during which participants were escorted to a cell and exposed to interaction with inmates (for example, taunting). In the evaluation, 227 youngsters were randomly assigned to JOLT or to a no-treatment control. Participants were compared on a variety of crime outcomes collected from participating courts at three and six months' follow-up. This second Michigan study reported very little difference between the intervention and control group (Yarborough 1979). The average offense rate for program participants, however, was 0.69 compared to 0.47 for the control group. Yarborough (p. 14) concluded that, "…the inescapable conclusion was that youngsters who participated in the program, undergoing the JOLT experience, did no better than their control counterparts."

Virginia Insiders Program, USA (Orchowsky and Taylor 1981)

The Insiders Program was described as an inmate-run, confrontational intervention with verbal intimidation and graphic descriptions of adult prison life. Juveniles were locked in a cell 15 at a time and told about the daily routine by a guard. They then participated in a two-hour confrontational rap session with inmates. Juvenile delinquents from three court service units in Virginia participated in the study. The investigators randomly assigned 80 juveniles ages 13 to 20 years with two or more prior adjudications for delinquency to the Insiders program or a no-treatment control group. Orchowsky and Taylor report on a variety of crime outcome measures at six-, nine-, and 12-month intervals. The only positive findings, though not statistically significant, were reported in Virginia (Orchowsky 1981). Although the difference at six months was not statistically significant (39% of controls had new court intakes versus 41% of experimental participants), they favor the experimental participants at nine and 12 months. The investigators noted, however, that the attrition rates in their experiment were dramatic. At nine months, 42% of the original sample dropped out, and at 12 months, 55% dropped out. The investigators conducted analyses that seemed to indicate that the constituted groups were still comparable on selected factors.

Texas Face-to-Face Program, USA (Vreeland 1981)

The Face-to-Face program included a 13-hour orientation session in which the juvenile lived as an inmate followed by counseling. Participants were 15 to 17 years of age and on probation from Dallas County Juvenile Court; most averaged two or three offenses before the study. A total of 160 boys were randomly assigned to four conditions: prison orientation and counseling, orientation only, counseling only or a no-treatment control group. Vreeland examined official court records and self-reported delinquency at six months. This evaluation also reported little effect for the intervention (Vreeland 1981). Vreeland reported that the control participants outperformed the three treatment groups on official delinquency (28% delinquent for control versus 39% for prison orientation plus counseling versus 36% for prison only versus 39% for counseling only). This more robust measure contradicts data from the self-report measures used, which suggest that all three treatment groups did better than the no-treatment controls. None of these findings reached a level of statistical significance. Viewing all the data, Vreeland concluded that there was no evidence that Face-to-Face was an effective delinquency prevention program. He finds no effect for Face-to-Face on several attitudinal measures, including the 'Attitudes Toward Obeying Law Scale.'

New Jersey 'Scared Straight' Program, USA (Finckenauer 1982)

The New Jersey Lifers' Program began in 1975 and stressed confrontation with groups of juveniles ages 11 to 18 years who participated in a rap session. Finckenauer randomly assigned 82 juveniles, some of whom were not delinquents, to the program or to a no-treatment control group. He then followed them for six months in the community, using official court records to assess their behavior. Finckenauer reported that 41% of the children and young people who attended the 'Scared Straight' program in New Jersey committed new offenses, while only 11% of the controls did, a difference that was statistically significant (Finckenauer 1982). He also reported that the program participants committed more serious offenses and that the program had no impact on nine attitude measures with the exception of a measure called 'attitudes toward crime.' On this measure experimental participants did much worse than controls. We deal with Finckenauer's own concerns about randomization integrity in a sensitivity analysis that is reported later. 

California SQUIRES Program, USA (Lewis 1983)

This is supposedly the oldest such program in the USA beginning in 1964 (Lewis 1983). The San Quentin Utilization of Inmate Resources, Experience and Studies (SQUIRES) program included male juvenile delinquents from two California counties between the ages of 14 and 18 years, most with multiple prior arrests. The intervention included confrontational rap sessions with rough language, guided tours of prison with personal interaction with prisoners, and a review of pictures depicting prison violence. The intervention took place one day per week over three weeks. The rap session was three hours long, and normally included 20 youngsters at a time. In the study, 108 participants were randomly assigned to treatment or to a no-treatment control group. Lewis compared participants on seven crime outcomes at 12 months. Lewis reported that 81% of the program participants were arrested compared to 67% of the controls. He also found that the program did worse with seriously delinquent youths, leading him to conclude that such children and young people could not be "turned around by short-term programs such as SQUIRES…a pattern for higher risk youth suggested that the SQUIRES program may have been detrimental" (p. 222). The only deterrent effect for the program was the average length of time it took to be rearrested: 4.1 months for experimental participants and 3.3 months for controls. Data were reported on eight attitudinal measures, and Lewis reported that the program favored the experimental group on all of them, again underscoring the difficulty of achieving behavioral change even when positively affecting the attitudes of juvenile delinquents.

Kansas Juvenile Education Program, USA (Locke et al. 1986)

Kansas Juvenile Education Program (KEP) was designed to educate children about the law and the consequences of violating it (Locke 1986). The program also tried to match juveniles with inmates based on personality types. Fifty-two juvenile delinquents aged 14 to 19 years from three Kansas counties were randomly assigned while on probation to KEP or a no-treatment control. The investigators examined official (from police and court sources) and self-report crime outcomes at six months. Locke and his colleagues reported little effect of the KEP program. Both groups improved from pretest to post-test but the investigators concluded that there were no differences between experimental and control groups on any of the crime outcomes measured. Investigators also reported no effect for the program on the Jesness and Cerkovich attitude tests.

Mississippi Project Aware, USA (Cook and Spirrison 1992)

Project Aware was a nonconfrontational, educational program comprising one five-hour session run by prisoners (Cook 1992). The intervention was delivered to juveniles in groups of six to 30. In the study, 176 juveniles (ages 12 to 16 years) under the jurisdiction of the county youth court were randomly assigned to the program or to a no-treatment control. The experimental and control groups were compared on a variety of crime outcomes retrieved from court records at 12 and 24 months. Little difference was found between experimental and control participants in the study. For example, the mean offending rate at 12 months was 1.25 for control cases versus 1.32 for Project Aware participants. Both groups improved from 12 to 24 months, but the control mean offending rate was still lower than that of the experimental group. The investigators concluded that, "attending the treatment program had no significant effect on the frequency or severity of subsequent offenses" (p. 97). The investigators also reported on two educational measures: school attendance and dropout. Curiously, they report an effect for the program on school dropout data, but note that "…it is not clear how the program succeeded in reducing dropout rates…" (p. 97).

Results of the search

The search methods for the original review generated 487 citations, most of which had abstracts. AP screened these citations, determining that 30 were evaluation reports. AP and CTP independently examined these citations and agreed that 11 were potential randomized trials. All reports were obtained. Upon inspection of the full-text reports, we excluded two studies. One study was excluded because it did not include any post program measure of offending. This was 'Project Aware', which had been conducted in a Wisconsin prison (Dean 1982). Attempts to contact the study author or retrieve these data from any other reports by the Wisconsin Department of Corrections have been unsuccessful. A second study of 'Stay Straight', conducted in Hawaii, was also excluded, due to the absence of random assignment (Chesney-Lind 1981). After the two exclusions, we were left with nine randomized trials.

Our updated searches yielded no new eligible studies or reports of any ongoing trials. Two review authors (MHP and JL) scanned each citation and identified five potentially relevant reports. One, an evaluation of a Scared Straight program for truants, was excluded because it did not involve randomization (Bazemore 2004). Another study was excluded because it did not include eligible outcome measures; it measured change in attitudes toward jail or prison (Feinstein 2005). Two articles discussed a related 'experiment' (Blunkett 2008; Wilson 2010), but upon further examination we discovered these studies did not use experimental methods or eligible outcomes. Another positive descriptive report was identified of a juvenile awareness program involving 'fear appeal messages' (Windell 2005), but no evaluative data were provided. A systematic review (Klenowski 2010) was identified that included narrative descriptions of 10 studies, but it contained no new studies eligible for inclusion in our review. Thus, information contained in this update is based on studies located for the previous review.

Included studies

Collectively, the nine studies were conducted in eight different states of the USA, with Michigan the site for two studies (Michigan D.O.C. 1967; Yarborough 1979). No set of researchers conducted more than one experiment. The studies span the years 1967 to 1992. The first five studies located were unpublished and were disseminated in government documents or dissertations; the remaining four were found in academic journal or book publications. The average age of the juvenile participants in each study ranged from 15 to 17 years. Only the New Jersey study included girls (Finckenauer 1982). Racial composition across the nine studies was diverse, ranging from 36% to 84% white people. Most of the studies dealt with delinquent youths already in contact with the juvenile justice system. All of the experiments were simple two-group experiments except Vreeland's evaluation of the Texas Face-to-Face program (Vreeland 1981). Only one study used quasi-random alternation techniques to assign participants (Cook 1992); the remaining studies claimed to use randomization although not all were explicit about how such assignment was conducted. Only the Texas study (Vreeland 1981) included data from self-report measures. In two studies (Locke 1986; Cook 1992), no postintervention offending rates were reported. Some of the studies that included average or mean rates did not include standard deviations to make it possible to compute the weighted mean effect sizes. Also, the follow-up periods were diverse and included measurements at three, six, nine, 12 and 24 months.

Excluded studies

There were six studies that were excluded during this update. These are often included in other review authors' samples. We describe these in more detail below, along with their reason for exclusion.

Bazemore 2004 evaluated a 'Scared Straight' program for truants; however, the study did not involve randomization. This program involved a collaborative intervention administered by a local sheriff's department. It followed 550 youth (350 'treatment' and 200 'control'). Three outcome measures were used: (1) whether or not youths returned to school the next day or were stopped by an officer (different measures for treatment and control youths), (2) comparison of the number of unexcused absences 30 days pre-intervention and postintervention and (3) total number of days of school missed following the intervention. Delinquent involvement was also measured. This study provided mixed results regarding program effectiveness.

Berry 1985 evaluated the 'Shape Up' program carried out in Colorado. The experimental group consisted of 30 males ages 14 to 18 years, and the control group consisted of 27 males of the same age. The study used a matched comparison group design and did not use randomization. The study assessed perception of certainty, severity and seriousness of punishment; delinquency proneness; intelligence quotient (IQ); family dynamics as measured by Family Adaptability and Cohesion Evaluation Scale (FACES) II, and recidivism rates. No difference was found between the two groups on attitude change, re-arrest, conviction, and weighted seriousness of crime after program involvement.

Buckner 1983 evaluated a program, 'Stay Straight', which was carried out in Hawaii. This study did not randomize participants and instead used a matched comparison group design. They assessed rearrest rates, finding that there was no effect on female participants. Male participants had higher rearrest rates than nonparticipants following the intervention. 

Chesney-Lind 1981 evaluated a program, 'Stay Straight', which was carried out in Hawaii. This study was excluded due to a lack of random assignment of participants. An after-the-fact matched group design was used in this study. The frequency and severity of police arrests in the year following program exposure was used as an outcome measure. 

Dean 1982 evaluated a two-session juvenile awareness program in Wisconsin. This study used a small sample of boys who were involved in a residential treatment program for delinquents. The study assessed 13 traits thought to be associated with a delinquent personality, finding that internal locus of control had increased significantly, while chance expectation and social self concept had decreased significantly. A pretest-post-test design with randomization was used, but no data on delinquency outcomes were collected.

Langer 1980 evaluated the Juvenile Awareness Program of the Lifers' Group at the Rahway State Prison in New Jersey. This study used a matched comparison group design. The study assessed delinquent involvement, finding that at the 10-month follow-up there was no significant difference between treatment and control groups. At long-term (average of 22 months) follow-up, the control group had significantly higher delinquency rates than the treatment group. 

Risk of bias in included studies

Review authors AP and MHP rated the quality of included studies using The Cochrane Collaboration's 'Risk of bias' tool (Higgins 2011). Unfortunately, clear data on all seven items in the 'Risk of bias' tool were not included in study reports. Figure 1 provides summary results, and we discuss each of the rating areas below.

Figure 1.

Risk of bias summary: review authors' judgments about each risk of bias item for each included study

Green circle: low risk of bias
Question mark: unclear risk of bias
Red circle: high risk of bias

Allocation

Random sequence generation

Finckenauer reported violations of randomization (Finckenauer 1982). Only eight of the 11 participating agencies that referred troubled or delinquent boys to the program correctly assigned their cases. Finckenauer did conduct additional analyses in an attempt to compensate for violation of randomization. We agreed that a sensitivity analysis should be done to determine the influence of this evaluation on the pooled analysis (Analysis 1.3). Another study was rated as at high risk of bias because alternation was used (Cook 1992). This latter study was not included in the meta-analysis because it did not include data on postintervention offending. Two other studies did not provide any further information on randomization and their risk of bias was rated as 'unclear' (GERP&DC 1979; Michigan D.O.C. 1967).

Allocation concealment

All of the studies are rated as presenting 'unclear' risk as there is no information on how randomization was performed.

Blinding

Blinding of participants and personnel (performance bias)

Blinding was not possible in these studies, and all are rated as presenting 'high risk'.

Blinding of outcome assessment (detection bias)

Only one study reported that steps were taken to 'blind' those responsible for collecting the outcome data to treatment assignment (Michigan D.O.C. 1967); it is rated as presenting 'low risk'. All others are rated as presenting 'unclear risk'.

Incomplete outcome data

Six studies experienced little or no attrition and are rated as presenting 'low risk' of bias. Two studies appeared to report significant attrition (defined as 10% or more from the originally randomized sample). The Virginia Insiders study reported a major loss of participants from the initial randomization sample (Orchowsky 1981). They reported this, however, at the second and third follow-up intervals (not the first, at six months). Because there was a paucity of data beyond the immediate follow-up interval across studies, we only conducted a pooled analysis using data at that time interval. Therefore a sensitivity analysis of the impact of this later attrition was not performed. The Cook study is also rated as presenting a 'high risk' due to attrition, but the study did not include data for the first follow-up and was not included in any meta-analyses (Cook 1992).

The Michigan JOLT study reported a large number of no-shows but they were deleted from the analysis (Yarborough 1979). The problem is that we do not know how many participants were initially assigned and no data were reported that the remaining sample was similar to the initial sample. We also conducted a sensitivity analysis to determine the influence of this study on the pooled analysis.

Selective reporting

We rated this as presenting a 'low risk' of bias across the studies. In several cases, the program was a government intervention and the researchers were employed by the same agency; nonetheless, the negative or null findings were clearly presented (Michigan D.O.C. 1967; GERP&DC 1979; Yarborough 1979; Orchowsky 1981; Lewis 1983). In three instances, the authors were students and a number of outcomes were presented (Vreeland 1981; Locke 1986; Cook 1992). In another instance, the author was an academic researcher who presented a number of findings in an academic book (Finckenauer 1982).

Other potential sources of bias

In terms of 'other bias' as rated on the tool, a major threat to study results is if the program is so poorly implemented that it does not represent a true test of the treatment. Scared Straight programs appear to be relatively simple and short-term and pose few problems for implementation. No investigator reported implementation problems, and we rated these as 'low risk' of bias. We should note that not one of the nine included studies provided data on monitoring of the control group to determine if compensation was an issue. It is probably very unlikely that control group participants received anything like Scared Straight but it was not specifically addressed by authors of the reports.

Effects of interventions

Findings from the individual studies

Whether relying on the actual data reported or measures of statistical significance, the nine trials do not yield evidence for the effectiveness of Scared Straight and other juvenile awareness programs on subsequent delinquency. In the first such study, the Michigan Department of Corrections found that 43% of the experimental group reoffended, compared to only 17% of the control group (Michigan D.O.C. 1967). No test of statistical significance was reported by the trialists. We performed a Chi² test, which indicated no statistical significance for this outcome, likely due to the low statistical power of the sample. The original document does not comment on this large percentage difference.

In Illinois, the outcomes were also negative in direction but not statistically significant, with 17% of the experimental participants being recontacted by police in contrast to 12% of the controls (GERP&DC 1979). The authors concluded that "based on all available findings one would be ill-advised to recommend continuation or expansion of the juvenile prison tours. All empirical findings indicate little positive outcome, indeed, they may actually indicate negative effects" (p. 19). Researchers reported no effect for the program on two attitude tests (Jesness Inventory, Piers Harris Self-Concept Scale). In contrast, interview and mail surveys of participants and their parents and teachers indicated unanimous support for the program (p. 12). Researchers also note how positive and enthusiastic inmates were about their efforts.

The second Michigan study also reported very little difference between the intervention and control group (Yarborough 1979). The average offense rate for program participants, however, was 0.69 compared to 0.47 for the control group. As Yarborough (p. 14) pointed out, "…the inescapable conclusion was that youngsters who participated in the program, undergoing the JOLT experience, did no better than their control counterparts."

The only positive findings, though not statistically significant, were reported in Virginia (Orchowsky 1981). Although the difference at six months was not statistically significant (39% of controls had new court intakes versus 41% of experimental participants), they favor the experimental participants at nine and 12 months. The investigators noted, however, that the attrition rates in their experiment were dramatic. At nine months, 42% of the original sample dropped out, and at 12 months, 55% dropped out. The investigators conducted analyses that seemed to indicate that the constituted groups were still comparable on selected factors such as race and age.

A study of the Face-to-Face program in Texas also reported little effect for these interventions (Vreeland 1981). Vreeland 1981 reported that the control participants outperformed the three treatment groups on official delinquency (28% delinquent for control versus 39% for prison orientation plus counseling versus 36% for prison only versus 39% for counseling only). This more robust measure contradicts data from the self-report measures used, which suggest that all three treatment groups did better than the no-treatment controls. None of these findings reached a level of statistical significance. Viewing all the data, Vreeland 1981 concluded that there was no evidence that Face-to-Face was an effective delinquency prevention program. He finds no effect for Face-to-Face on several attitudinal measures, including the Attitudes Toward Obeying Law Scale.

Finckenauer 1982 reported that 41% of the children and young people who attended the Scared Straight program in New Jersey committed new offenses, while only 11% of controls did, a difference that was statistically significant. He also reported that the program participants committed more serious offenses and that the program had no impact on nine attitude measures with the exception of a measure called 'attitudes toward crime.' On this measure experimental participants did much worse than control participants. We deal with Finckenauer's own concerns about randomization integrity in this study in a sensitivity analysis.

Additional evidence of a possible harmful effect can be found in the evaluation of the California SQUIRES program (Lewis 1983). Lewis 1983 reported that 81% of the program participants were arrested compared to 67% of the controls. He also found that the program did worse with seriously delinquent youths, leading him to conclude that such children and young people could not be "turned around by short-term programs such as SQUIRES…a pattern for higher risk youth suggested that the SQUIRES program may have been detrimental" (p. 222). The only deterrent effect for the program was the average length of time it took to be rearrested: 4.1 months for experimental participants and 3.3 months for control participants. Data were reported on eight attitudinal measures, and Lewis reported that the program favored the experimental group on all of them, again underscoring the difficulty of achieving behavioral change even when positively affecting the attitudes of juvenile delinquents.

Locke and his colleagues reported little effect of the Juvenile Education Program in the Kansas State Prison (Locke 1986). Both groups improved from pretest to post-test but the investigators concluded that there were no differences between experimental and control groups on any of the crime outcomes measured. Investigators also reported no effect for the program on the Jesness and Cerkovich attitude tests.

Finally, little difference was found between experimental and control participants in the Mississippi Project Aware study (Cook 1992). For example, the mean offending rate for control participants at 12 months was 1.25 versus 1.32 for Project Aware participants. Both groups improved from 12 to 24 months, but the control mean offending rate was still lower than that of the experimental group. The investigators concluded that, "attending the treatment program had no significant effect on the frequency or severity of subsequent offenses" (p. 97). The investigators also reported on two educational measures: school attendance and dropout. Curiously, they reported an effect for the program on school dropout data, but noted that "...it is not clear how the program succeeded in reducing dropout rates..." (p. 97).

Meta-analysis

For each study, we extracted all of the relevant crime outcome data. Our protocol included an organization of analyses by examining official reports (from government administrative records) distinct from self-reported criminality (obtained from investigator-administered survey questionnaires). Given that we expected a diverse number of measures of crime to be reported, the protocol called for us to organize it into four indexes that would be most relevant to policy and practice. These included prevalence rates (what percentage of each group reoffended or did not?), average incidence rates (what was the average number of offenses or other incidents per individual in each group?), offense severity rates (what was the average severity of offenses per individual in each group?) and latency (how long was the average return to crime or failure delayed per individual in each group?). As Table 1 shows, however, few measures except for prevalence were reported.

Given the limitations of the data, we conducted one meta-analysis. We report the crime outcomes for official measures at the first (and usually the only) follow-up interval reported. The analysis focused on proportion data (that is, the proportion of each group reoffending), as outcomes reported as means or averages were sparse and often did not include the standard deviations. Because the data were dichotomous, the analyses report ORs and 95% CIs for each study. As a sensitivity analysis, we assumed both random-effects and fixed-effect models for treatment effects across the studies.

Immediate post-treatment effects for reoffending rates: official measures

The analysis of the data in comparison Table 1 from the seven studies reporting reoffending rates shows that the intervention increases the crime or delinquency outcomes at the first follow-up period. Assuming either a fixed-effect or random-effects model does not change its overall negative impact. Using a fixed-effect model, the OR was 1.68 (95% CI 1.20 to 2.36). Heterogeneity statistics should be interpreted with caution given that only seven studies were included in the meta-analysis (Chi2 = 8.49, P value = 0.20, I2 = 29%) (Analysis 1.1). The mean OR assuming a random-effects model was similar at 1.72 (95% CI 1.13 to 2.62); heterogeneity statistics were nearly identical (Chi2 = 8.50, P value = 0.20, I2 = 29%) (Analysis 1.2). Both the fixed-effect OR and the random-effects OR are statistically significant; under either model, the odds of offending are roughly 1.7 times higher for program participants than for controls.
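As a rough check on the significance statement, the z statistic and two-sided P value can be approximated from each pooled OR and its 95% CI, because the interval is symmetric on the log scale. The sketch below is illustrative only: the function name is hypothetical, and it assumes a normal approximation with a critical value of 1.96, which may differ slightly from the software used for the review.

```python
import math

def z_and_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate z and two-sided p from an OR and its 95% CI (normal approximation)."""
    # On the log scale the CI is ln(OR) +/- z_crit * SE, so SE follows from its width.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Pooled estimates as reported in Analysis 1.1 and Analysis 1.2.
pooled = {
    "fixed-effect": (1.68, 1.20, 2.36),
    "random-effects": (1.72, 1.13, 2.62),
}

for label, (or_value, low, high) in pooled.items():
    z, p = z_and_p_from_ci(or_value, low, high)
    print(f"{label}: z = {z:.2f}, two-sided p = {p:.4f}")
```

Under these assumptions both P values fall below 0.05, in line with the CIs excluding 1.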

Sensitivity analysis 1. Excluding Finckenauer study

We excluded the Finckenauer study from the analysis because of its randomization problems. Finckenauer reported that only eight of the 11 referring agencies correctly followed the randomization procedures. His reanalyses taking these randomization problems into account still indicated a negative impact. Nonetheless, we decided to examine the impact of this study on the meta-analytic findings. Because the OR differed little between the fixed-effect and random-effects models, we conducted this meta-analysis assuming a random-effects model. Given that the Finckenauer study reported the largest negative effects for the program, it is not surprising that the OR decreased. However, it is still negative in direction at 1.47, and statistically significant (95% CI 1.03 to 2.11). Heterogeneity statistics should be interpreted with caution given the small number of studies (Tau2 = 0.00; Chi2 = 4.25, degrees of freedom (df) = 5, P value = 0.51; I2 = 0%) (Analysis 1.3).

Sensitivity analysis 2. Excluding Yarborough study

We excluded the Yarborough study because of its deletion of no-shows postrandomization from analysis of the results, indicating a potential for high attrition bias. Yarborough did not report any analyses to indicate how this affected the remaining sample. We again assumed a random-effects model. The deletion of this study did not alter the overall negative impact of these programs, as the OR was 1.96. This is statistically significant (95% CI 1.25 to 3.08). Heterogeneity statistics should be interpreted with caution given the small number of studies (Tau2 = 0.06; Chi2 = 6.25, df = 5, P value = 0.28; I2 = 20%) (Analysis 1.4).

Although the methodological limitations of the studies warrant our sensitivity analyses, their exclusion did not alter the main conclusion of the meta-analyses: a significant negative impact of the program.

Sensitivity analysis 3. Excluding both Finckenauer and Yarborough studies

We excluded both the Finckenauer and Yarborough studies to see how this affected the overall meta-analysis. As Analysis 1.5 shows, even with two studies removed for sensitivity analysis, the overall effect of the intervention in the five remaining studies shows a 'criminogenic' effect that is statistically significant, that is, favors the control group not Scared Straight (OR 1.69, 95% CI 1.10 to 2.58). Heterogeneity statistics should be interpreted with caution given only five studies are in the analysis (Tau2 = 0.00; Chi2 = 2.94, df = 4, P value = 0.57, I2 = 0%).
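The I2 values reported for the five analyses can be reproduced from the Chi2 statistic and its degrees of freedom using the usual definition, I2 = max(0, (Chi2 - df) / Chi2). The short sketch below does this; the function name is hypothetical, and for Analyses 1.1 and 1.2, where df is not stated in the text, it assumes df = 6 (seven studies minus one).

```python
def i_squared(chi2, df):
    """I2 as a percentage: share of variability beyond chance, floored at zero."""
    if chi2 <= 0:
        return 0.0
    return max(0.0, (chi2 - df) / chi2) * 100

# (analysis label, Chi2, df, I2 reported in the text)
analyses = [
    ("1.1 fixed-effect, seven studies", 8.49, 6, 29),   # df = 6 is assumed (k - 1)
    ("1.2 random-effects, seven studies", 8.50, 6, 29), # df = 6 is assumed (k - 1)
    ("1.3 excluding Finckenauer", 4.25, 5, 0),
    ("1.4 excluding Yarborough", 6.25, 5, 20),
    ("1.5 excluding both", 2.94, 4, 0),
]

for label, chi2, df, reported in analyses:
    computed = i_squared(chi2, df)
    print(f"Analysis {label}: computed I2 = {computed:.0f}%, reported I2 = {reported}%")
```

Whenever Chi2 is smaller than df, as in Analyses 1.3 and 1.5, the formula is truncated at 0%, which is why those analyses report no measurable heterogeneity.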

Discussion

Summary of main results

These randomized trials, conducted over a 25-year period in eight different US states, provide evidence that Scared Straight and other 'juvenile awareness' programs are not effective as a stand-alone crime prevention strategy. More importantly, they provide empirical evidence - under experimental conditions - that these programs likely increase the odds that children exposed to them will commit offenses in future. Despite the variability in the type of intervention used, ranging from harsh, confrontational interactions to tours of the facility, they converge on the same result: an increase in criminality in the experimental group when compared to a no-treatment control. Doing nothing would have been better than exposing juveniles to the program.

We noted that the other two trials that did not report prevalence data for the meta-analysis also reported no effect for the intervention (Locke 1986; Cook 1992). Indeed, the mean data from the Mississippi study was also negative in direction, and the Kansas investigators reported that the self-reported data showed a negative impact.

Overall completeness and applicability of evidence

That the seven trials used in the meta-analysis were conducted in six states, using different conceptions of the intervention, underscores the high external validity of these findings. However, note that all trials were of US programs, and no trial was reported after 1992. Indeed, no trial included in the meta-analysis has been reported since 1983.

Quality of the evidence

Nine randomized trials were included in the review; only randomized trials, if implemented with good fidelity, produce statistically unbiased effects. However, the nine studies were not exemplars of trial quality. These were small studies, with very few providing convincing evidence that they reduced bias threats as measured by the Cochrane 'Risk of bias' tool (Figure 1). In fact, for some of the bias threats, the trials were rated with a great deal of uncertainty due to the lack of descriptive data in the report. However, three sensitivity analyses were conducted: the first dropping the study that experienced the greatest threat of bias due to randomization compromise (Finckenauer 1982), the second dropping the study that lost a considerable number of participants postrandomization (Yarborough 1979), and the third dropping them both. The effect sizes remained stable in all three analyses, indicating that the negative effect found for Scared Straight and other juvenile awareness programs is robust.

Potential biases in the review process

Although we believe we have identified all relevant RCTs, it is possible that studies in languages other than English and not indexed in English language databases could have been missed. In addition, it is possible that the sensitivity of our search could have been increased; for example, by using additional indexing terms specific to the databases we searched and using truncation to ensure we searched for word variations. A revised search strategy will be developed for the next update.

Agreements and disagreements with other studies or reviews

The results of this review converge with the findings from many other narrative or quantitative reviews. This is expected as the reviews generally consider the same studies. For example, reviewers of research on the effects of crime prevention programs have not found deterrence-oriented programs, such as Scared Straight, effective (Lipsey 1992; Lundman 1993; Sherman 1997). In fact, the University of Maryland's well-publicised review of over 500 crime prevention evaluations listed Scared Straight as one program that 'doesn't work' (Sherman 1997). These findings also mirror a meta-analysis of juvenile prevention and treatment programs by Lipsey 1992, who indicated that the effect size for 11 "shock incarceration and 'Scared Straight' programs" was -0.14 (or produced about 7% higher recidivism rates in experimental participants than control participants assuming a 50% baseline).
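One common way to translate a standardized effect size into a difference in recidivism rates is the binomial effect size display (BESD), which converts the effect size d to a correlation r and places the two group rates r/2 either side of an assumed baseline. The sketch below applies that conversion to the -0.14 figure; it is an illustration under these assumptions, with hypothetical function names, and is not necessarily the exact procedure used in Lipsey 1992.

```python
import math

def d_to_r(d):
    """Convert a standardized mean difference d to a correlation r (equal group sizes assumed)."""
    return d / math.sqrt(d * d + 4)

def besd_recidivism(d, baseline=0.50):
    """Return (experimental, control) recidivism rates implied by d around a baseline.

    A negative d means the experimental group did worse, so its recidivism is higher.
    """
    r = d_to_r(d)
    experimental = baseline - r / 2
    control = baseline + r / 2
    return experimental, control

experimental, control = besd_recidivism(-0.14)
difference = experimental - control
print(f"experimental = {experimental:.3f}, control = {control:.3f}, "
      f"difference = {difference:.3f}")
```

With a 50% baseline, the implied gap is about seven percentage points, which matches the figure quoted in the text.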

The one disagreement, in terms of syntheses of evidence, is with the US Department of Justice's CrimeSolutions.Gov registry of effects on crime policies and programs (US Department of Justice 2012). The Crime Solutions project has rated the evidence as inconclusive. There are two reasons for the discrepancy. The first is that the Crime Solutions rating scheme relies on statistical significance to determine whether there is evidence of effect; indeed, some of the program evaluations included here were underpowered due to small sample size and did not report a statistically significant finding. Second, the Crime Solutions project is defining Scared Straight narrowly, as its initial iteration in New Jersey defined it, in contrast with the broader definition of Scared Straight and similar "kids visit prison" programs used here. Thus, while Crime Solutions is only considering a small set of studies that examined a narrowly defined intervention (known as Scared Straight), this review includes nine program evaluations that would fall under a broader heading of juvenile awareness programs.

Authors' conclusions

Implications for practice

The strong indication here is that these programs have a harmful effect. This raises a dilemma for policymakers. Criminological interventions, when they cause harm, are not just toxic to the participants. They cause more harm to citizens who were not part of the experiment because of the increase in criminal victimization. Policymakers should take steps to build the kind of research infrastructure within their jurisdiction that could rigorously evaluate criminological interventions to ensure they are not harmful to the very citizens they aim to help. We believe that our updated review places the onus on every jurisdiction to show how their current or proposed program is different than the ones studied here. Given that, they should then put in place rigorous evaluation to ensure that no harm is caused by the intervention.

Some literature indicates the program can have a positive effect on the inmates involved in the prison visits and that argument is sometimes used to legitimize use of the program. These arguments are undoubtedly used under the assumption that the program does no harm. In light of the findings of this review, assertions that Scared Straight and similar programs ought to be used because they have other positive effects raise ethical questions about potentially harming children (and others in the community who may be victimized) in order to accomplish other important, but latent, goals.

The authors have received communications from different prison facilities that are using a juvenile awareness program. One argument used to sustain such programs is that the research reported here does not apply to their particular program. Our recommendation is that correctional research units, either at the facility or at a regional or national government level, collaborate with program staff to conduct a rigorous evaluation. If such units do not exist or cannot conduct their own study, we suggest they collaborate with a local university, college or research firm that could undertake this work to ensure that the program is working as planned and not unintentionally causing more harm than good. 

Correctional administrators sometimes ask whether our results are relevant to their particular program. For example, inmates running the program may go outside the prison to speak at schools about their life experiences. Our review only looked at programs involving visits of young people to prisons, and, as far as we know, no review has examined juvenile awareness interventions that involve offenders leaving prison grounds to speak to children at school. We are not aware of any controlled studies testing it.

We receive periodic correspondence from concerned citizens about how to get a juvenile who is in trouble with the law into a Scared Straight program. We cannot, in good conscience, recommend this program. Our response to these well-meaning citizens is to refer them to national, regional or local centers that specialize in youth crime prevention services.

Implications for research

One question that continues to arise about these findings is why Scared Straight and similar programs seem to lead to more crime rather than less in participants. What is the critical mechanism? Although there were many good post-hoc theories about this, none of the evaluations were structured to provide the kind of mediating variables necessary to respond to this in the context of a systematic review (Petrosino 2000). One explanation may be 'peer contagion' (Dishion 1999). According to this theory, any positive impact by an intervention for youth might be offset by processes of peer influence that occur when deviant youths are allowed to interact with each other in groups, such as what occurs in Scared Straight and similar programs. This would need to be explicitly tested in careful evaluation studies to confirm as a potential mechanism for harmful effects.

We plan to update this review again within 36 months to incorporate any new studies or respond to cogent criticisms. Given that we found only nine studies (and only seven were used in the meta-analysis), we were cautious not to propose the use of moderating variables in subsequent analyses. Initially we wondered if one program factor might have particular salience, which was the degree of harshness in the inmate presentations. It may be that the more brutal and vulgar the presentation, the more that it causes a type of 'backfire' effect, producing in the juveniles the very behavior it seeks to deter. However, when looking at this more closely, we discovered that one trial involving a tour of a reformatory with no presentation reported one of the largest negative effects (Michigan D.O.C. 1967).

This review has led us to consider two others, contingent on future funding. 'Shock value'-type interventions are tried across many fields. For example, high school students are sometimes shown horrific footage of car accidents in order to deter them from drinking and driving. In industrial arts classes, students are shown films of what occurs when safety glasses are not worn; this is often graphic and is designed to increase compliance with such regulations. There are many other examples across fields. But is there any evidence that any of these 'shock value' interventions work? Or do they produce disappointing, or even toxic, results as we have reported here? The early evidence is not promising, as the results of fear appeals aimed at reducing drug and alcohol use among young people have been described in at least one review as 'disappointing' (Prevention First 2008).

It may be true that Scared Straight and similar programs do not work because they only convey a threat that juveniles do not think will be carried out. What about the evidence for deterrence if it is not a third-party threat but actual involvement in the juvenile justice system? There has been a wide range of randomized trials that test the effects of official processing in juvenile courts compared with some other intervention (such as diverting the child from such processing). Is there evidence that the delivery of a threat - official system processing - deters future criminal behavior? Petrosino 2010 examined 29 randomized trials that evaluated the effects of some diversionary alternative (services or outright release) and compared it to official processing or progression deeper into the juvenile justice system. That review, published by the Campbell Collaboration, also indicated that formal system processing or progression had no crime deterrent effect, and, in some instances, increased crime in contrast to diversionary alternatives.

Acknowledgements

The original review in 2002 was principally supported by USD5000 from a grant from the Smith-Richardson Foundation to the University of Pennsylvania Graduate School of Education (Robert Boruch, Principal Investigator). It received partial support from a Mellon Foundation grant to the Center for Evaluation, Initiatives for Children Program at the American Academy of Arts & Sciences (Frederick Mosteller, Principal Investigator) and a grant from the UK Home Office to Cambridge University Institute of Criminology (David Farrington, Principal Investigator). The latter two sources supported Anthony Petrosino's time during work and use of his office and computer at the American Academy of Arts & Sciences.

The Criminal Justice Collection at Rutgers University's Center for Law and Justice and the Gutman Library at the Harvard Graduate School of Education facilitated interlibrary loan requests. We thank Phyllis Schultze and Carla Lillvik for their expertise, patience and assistance.

We appreciated the guidance and editorial comments of Dr. Jane Dennis, Professor Geraldine Macdonald (Co-ordinating Editor), Dr. Julian Higgins, Dr. Stuart Logan and other members of the Cochrane Developmental, Psychosocial and Learning Problems Group. Criticisms of the original review by Professor Robert Boruch, Sir Iain Chalmers, Dr. Phoebe Cottingham, Professor Lyn Feder, Professor Hiroshi Tsutomi and Professor Joan McCord also helped.

We acknowledge the contributions of John Buehler to the original review. John passed away in 2003.

The original review and the review update were produced within the Cochrane Developmental, Psychosocial and Learning Problems Group and the Campbell Crime and Justice Group. The 2012 update was financially supported by the Campbell Collaboration, Oslo, Norway. We appreciate the assistance of Eammon Noonan, Chief Executive Officer, in facilitating this funding.

Data and analyses

Comparison 1. Intervention versus control, crime outcome
Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size
1 Postintervention - group recidivism rates - official measures only (fixed-effect) | 7 | 794 | Odds Ratio (M-H, Fixed, 95% CI) | 1.68 [1.20, 2.36]
2 Postintervention - group recidivism rates - official measures only (random-effects) | 7 | 794 | Odds Ratio (M-H, Random, 95% CI) | 1.72 [1.13, 2.62]
3 Sensitivity analysis - excluding Finckenauer study | 6 | 713 | Odds Ratio (M-H, Random, 95% CI) | 1.47 [1.03, 2.11]
4 Sensitivity analysis - excluding Yarborough study | 6 | 567 | Odds Ratio (M-H, Random, 95% CI) | 1.96 [1.25, 3.08]
5 Sensitivity analysis - excluding both Finckenauer and Yarborough studies | 5 | 486 | Odds Ratio (M-H, Random, 95% CI) | 1.68 [1.10, 2.58]
Analysis 1.1.

Comparison 1 Intervention versus control, crime outcome, Outcome 1 Postintervention - group recidivism rates - official measures only (fixed-effect).

Analysis 1.2.

Comparison 1 Intervention versus control, crime outcome, Outcome 2 Postintervention - group recidivism rates - official measures only (random-effects).

Analysis 1.3.

Comparison 1 Intervention versus control, crime outcome, Outcome 3 Sensitivity analysis - excluding Finckenauer study.

Analysis 1.4.

Comparison 1 Intervention versus control, crime outcome, Outcome 4 Sensitivity analysis - excluding Yarborough study.

Analysis 1.5.

Comparison 1 Intervention versus control, crime outcome, Outcome 5 Sensitivity analysis - excluding both Finckenauer and Yarborough studies.

Appendices

Appendix 1. Search terms used for all databases

'scared straight' 
Prison orientation OR prison tour OR prison visit
Jail orientation OR jail tour OR jail visit
Reformatory orientation OR reformatory tour OR reformatory visit
Reformator* orientation OR reformator* tour OR reformator* visit
'prisoner run' OR 'offender run' OR 'inmate run' 
'prison awareness' OR 'prison aversion' OR 'juvenile awareness'
'rap session' AND prisoner
'rap session' AND lifer
'rap session' AND inmate
'rap session' AND offender
speak out AND prisoner
speak out AND lifer
speak out AND inmate
speak out AND offender
confrontation AND prisoner
confrontation AND lifer
confrontation AND inmate
confrontation AND offender

Appendix 2. Search methods used for original review

For the original review we firstly identified randomized experiments from a larger review of field trials in crime reduction by the first author (Petrosino 1997). Petrosino used the following methods to find more than 300 randomized experiments: handsearch (that is, visually inspecting the entire contents) of 29 leading criminology or social science journals; checking the citations reported in the 'Registry of Randomised Experiments in Criminal Sanctions' (Weisburd 1990); detailed electronic searches of Criminal Justice Abstracts, Sociological Abstracts and Social Development and Planning Abstracts (Sociofile), Education Resource Information Clearinghouse (ERIC), and Psychological Abstracts (PsycINFO); searches by information specialists of 18 bibliographic databases, including the National Criminal Justice Reference Service (NCJRS); an extensive mail campaign with over 200 researchers and 100 research centers; published solicitations in association newsletters; tracking of references in over 50 relevant systematic reviews and literature syntheses; and tracking of references in relevant bibliographies, books, articles and other documents. More detail about these search methods can be found in Petrosino 1995 and Petrosino 1997. The citations found in Petrosino 1997 covered literature with a publication date between January 1, 1945 and December 31, 1993. Seven randomized trials meeting the eligibility criteria were identified from this sample.

Second, we augmented this work with searches designed to uncover experiments missed by Petrosino 1997 and to cover more recent literature (1994 to 2001). These methods included: broad searches of the Campbell Collaboration Social, Psychological, Educational & Criminological Trials Register (C2-SPECTR) developed by the UK Cochrane Centre and then supervised by the University of Pennsylvania Graduate School of Education (Petrosino 2000a); check of citations from more recent systematic or traditional reviews to provide coverage of more recent studies (for example, Sherman 1997; Lipsey 1998); citation checking of documents relevant to Scared Straight and similar programs (for example, Finckenauer 1999); email correspondence with investigators; and broad searches of the Cochrane Controlled Trials Register (CENTRAL) in The Cochrane Library (Issue 1, 2002). By broad searches, we mean that we tried to first identify studies relevant to crime or delinquency and then visually scanned the citations or abstracts to see if any were relevant to this intervention.

Third, we decided to conduct a more specific search of the 14 additional electronic databases accessible to the authors and relevant to the topic area. Many of these include published and unpublished literature (for example, dissertations or government reports). Searches were done online using available Harvard University resources or other databases freely searchable via the Internet. Several trips were made to the University of Massachusetts, Lowell to use Criminal Justice Abstracts and other Silver Platter databases not accessible at Harvard University or via the Internet. The bibliographic databases and the years searched were:

  • Criminal Justice Abstracts, 1968 to September 2001;

  • Current Contents, 1993 to 2001;

  • Dissertation Abstracts, 1981 to August 2001;

  • Education Full Text, June 1983 to October 2001;

  • ERIC (Education Resource Information Clearinghouse) 1966 to 2001;

  • GPO Monthly (Government Printing Office Monthly), 1976 to 2001;

  • MEDLINE 1966 to 2001;

  • National Clearinghouse on Child Abuse and Neglect, to 2001;

  • NCJRS (National Criminal Justice Reference Service), to 2001;

  • Political Sciences Abstracts, 1975 to March 2001;

  • PAIS International (Public Affairs Information Service), 1972 to October 2001;

  • PsycINFO (Psychological Abstracts) 1987 to November 2001;

  • Social Sciences Citation Index, February 1983 to October 2001;

  • Sociofile (Sociological Abstracts and Social Planning And Development Abstracts) January 1963 to September 2001.

We anticipated that the amount of literature on Scared Straight would be of moderate size, and that our best course of action would be to identify all citations relevant to the program and screen them for potential leads to eligible studies. This removed the need to include keywords for identifying randomized trials (for example, 'random assignment') in our searches. After several trial runs, we found that nearly all documents used phrases like Scared Straight or 'juvenile awareness' in the title or abstract of the citation. Therefore, the following searches were run, without variation, in each relevant database to identify relevant citations:

  • 'scared straight';

  • ('prison or jail or reformatory or institution') and ('orientation or visit or tour');

  • 'prisoner run' or 'offender run' or 'inmate run';

  • 'prison awareness' or 'prison aversion' or 'juvenile awareness';

  • ('rap session' or 'speak out' or 'confrontation') and ('prisoner' or 'lifer' or 'inmate' or 'offender').
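The last grouped search above is equivalent to running every pairing of an activity term with a person term, which is how the terms are listed line by line in Appendix 1. As an illustration only (the variable names are hypothetical and the query syntax is simplified, not that of any particular database), the expansion can be sketched as:

```python
from itertools import product

# Terms taken from the grouped search string; quoting follows Appendix 1.
activities = ["'rap session'", "speak out", "confrontation"]
people = ["prisoner", "lifer", "inmate", "offender"]

# Pair every activity term with every person term.
queries = [f"{activity} AND {person}" for activity, person in product(activities, people)]

for query in queries:
    print(query)
```

Three activity terms crossed with four person terms give the 12 individual searches listed in Appendix 1.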

Feedback

Feedback given on original review in 2003 - Meaning of equivalence at baseline

Summary

My question relates to information in the table describing the methodological quality of the included studies, where reference is made to 'tests for equivalence' at baseline. What does this mean? My concern is that it may refer to the use of tests of statistical significance to compare baseline characteristics following randomisation, a process which Altman (1985) has pointed out is absurd. Either chance (random allocation) was used to generate the comparison groups (in which case it makes no sense to use statistical tests to assess the probability that any differences reflect chance), or chance (random allocation) was not used.

Please clarify this, and provide more information, for each trial, about how the allocation schedule was generated, and what measures were taken to conceal the schedule from those recruiting participants into the trial. If this information has not been supplied by the authors of the reports, please make this explicit.

Ref. 1 - Altman DG. Comparability of randomised groups. Statistician 1985;34:125-136.

I certify that I have no affiliations with or involvement in any organisation or entity with a direct financial interest in the subject matter of my criticisms.

Reply

Reply to Sir Iain Chalmers' comment on our review, by Anthony Petrosino and Carolyn Turpin-Petrosino

We apologize for the unsatisfactory delay in responding to Sir Iain Chalmers' comment on our review. His question is most appreciated, and inspired us to query some of our more methodologically- and statistically-minded colleagues for advice. We have now had ample opportunity to mull over these responses.

He asked that we clarify what is meant by 'tests for equivalence at baseline'. Indeed, our reference is to statistical tests that are conducted by the experimental investigators to determine if randomization produced equivalent groups before the intervention or treatment is introduced. In his comment, Dr. Chalmers is correct when he states (referencing Altman 1985) that such 'pretests of group equivalence' are illogical because of randomization. But this only applies when we have confidence that randomization was carried out with full integrity.

Unfortunately, thorough description of how randomization was done and what efforts were taken to conceal such allocation are often missing in reports of experimental studies. This is particularly true of trials reported several decades ago; in our review, all of the studies were reported before 1993 and at least one was briskly reported in a short government document circa 1967. Sure enough, concealment of allocation was rated as 'unknown' in eight of the nine trials we included in our systematic review. Pretests of group equivalence increase our confidence (but do not guarantee) that randomization was successfully implemented.

Missing information is not the only problem. It is also the case that allocation in many criminological experiments is often left out of the hands of the investigators and is actually conducted by practitioners or treatment providers. Such individuals often have a good reason to corrupt the allocation schedule to ensure that particular cases end up in a certain group. Pretests of group equivalence are one way to determine if an intentional subversion of the allocation scheme has resulted in unhappy configurations of the groups.

Besides missing information and covert manipulation of allocation, there is another problem with criminological experiments that pretests of group equivalence can assist. Many justice experiments have very small samples. For example, the Locke et al study in our review (though it was not included in the meta-analysis) had 16 participants in each group. The laws of randomization naturally follow the laws of sampling probability. If you flip a valid coin 32 times, you may end up with 22 heads and 10 tails. Randomizing 32 participants to study groups may result in the experimental group receiving far more boys than girls when compared to the control group. To the extent that males are more likely to commit another crime than females, the experimental group is at a distinct disadvantage. Flipping a valid coin several hundred times is more likely to produce a near 50/50 split of heads and tails than 32 flips; random allocation of several hundred participants is more likely to produce balanced groups than assignment of 32 participants. Pretests of group equivalence, in this case, can identify situations where unintentional bias has produced unhappy configurations of groups.
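The coin-flip argument can be made concrete with exact binomial probabilities. The sketch below is illustrative only: the function names and the 60% cut-off used to define a 'lopsided' split are assumptions chosen for the example, not part of the original analysis.

```python
from math import comb

def prob_heads_at_least(n, k):
    """Exact probability of k or more heads in n flips of a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

def prob_lopsided(n, min_pct=60):
    """Probability that either side (heads or tails) reaches at least min_pct% of n flips."""
    # Smallest count that reaches the cut-off, computed with integer arithmetic.
    k = (n * min_pct + 99) // 100
    # By symmetry, the tails-heavy case has the same probability as the heads-heavy case.
    return 2 * prob_heads_at_least(n, k)

print(f"P(22 or more heads in 32 flips) = {prob_heads_at_least(32, 22):.4f}")
for n in (32, 400):
    print(f"n = {n}: P(split of 60/40 or worse) = {prob_lopsided(n):.6f}")
```

The same logic applies to a baseline characteristic such as sex: with 32 participants a noticeably unbalanced allocation is not unusual, whereas with several hundred it becomes very unlikely.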

The methodological quality table contains our own subjective language of whether we thought the pretest results were 'satisfactory.' This should be changed. In our update of the Cochrane review, we will simply list if the pretests were done and whether the experimental investigators reported that pretest equivalence was confirmed.

Notes

i. We especially thank Dr. Mark Lipsey and Dr. David Weisburd, among others, for their valuable input.
ii. Of course, experimental investigators who have good a priori knowledge of a particular variable especially relevant to the outcome can block on that variable to ensure equal distribution across study groups irrespective of randomization (in essence, they can randomize boys and girls separately into the study groups).
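Blocking as described in note ii can be sketched as follows. This is a minimal illustration under stated assumptions, not a procedure used by any of the included trials: the function `blocked_assignment`, the participant records, and the sample of 20 boys and 12 girls are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def blocked_assignment(participants, block_key, seed=None):
    """Randomize separately within each block (stratum).

    participants: list of dicts, each with a unique "id" (hypothetical format).
    block_key: function returning the blocking variable for a participant.
    Returns a dict mapping participant id to "experimental" or "control".
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in participants:
        strata[block_key(person)].append(person)

    assignment = {}
    for members in strata.values():
        rng.shuffle(members)       # random order within the block
        offset = rng.randrange(2)  # which arm gets any odd participant out
        for index, person in enumerate(members):
            arm = "experimental" if (index + offset) % 2 == 0 else "control"
            assignment[person["id"]] = arm
    return assignment


# Hypothetical sample: 20 boys and 12 girls, blocked on sex.
participants = (
    [{"id": i, "sex": "male"} for i in range(20)]
    + [{"id": i, "sex": "female"} for i in range(20, 32)]
)
assignment = blocked_assignment(participants, block_key=lambda p: p["sex"], seed=1)
counts = Counter((p["sex"], assignment[p["id"]]) for p in participants)
print(counts)
```

Because assignment alternates within each shuffled block, each arm receives 10 boys and 6 girls in this example regardless of the seed; only which individuals land in which arm is random.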

Contributors

Iain Chalmers, Director, UK Cochrane Centre, ichalmers@cochrane.co.uk

What's new

Last assessed as up-to-date: 30 June 2012.

Date | Event | Description
1 March 2013 | New citation required but conclusions have not changed | Updated searches found no new studies suitable for inclusion. Listed six new studies in 'excluded studies' section. Added 'Risk of bias' tables. Conducted sensitivity analysis (excluding both Finckenauer and Yarborough studies)
1 January 2012 | New search has been performed | Updated all searches

History

Protocol first published: Issue 4, 2000
Review first published: Issue 2, 2002

Date | Event | Description
22 September 2008 | Amended | Converted to new review format.
25 May 2004 | Amended | Response to feedback added: 25/05/04
26 February 2003 | Feedback has been incorporated | Feedback added: 26/02/03
1 March 2002 | New search has been performed | Minor update: 01/03/02
27 February 2002 | New citation required and conclusions have changed | Substantive amendment

Contributions of authors

Anthony Petrosino: searching for studies, screening studies, extracting data, conducting analyses, drafting review.
Carolyn Turpin-Petrosino: screening studies, extracting data, drafting review.
Meghan Hollis-Peel: updating searches, screening studies, drafting review.
Julia Lavenberg: updating searches, screening studies, drafting review.

Declarations of interest

Anthony Petrosino - the original review was supported in part by a consultancy to me from the University of Pennsylvania. This update is being supported in part by funding from the Campbell Collaboration, based in Oslo, Norway. I also received an honorarium in 2004 to 2005 for contributing an article summarizing Scared Straight to a special issue on randomized experiments by the Annals of the American Academy of Political and Social Science.
Carolyn Turpin-Petrosino - as spouse to lead author, I am also a beneficiary of funding Anthony received as a consultant or as an article contributor to the Annals of the American Academy of Political and Social Science.
Julia Lavenberg - I was supported as a consultant for some work on the update.
Meghan E. Hollis-Peel - I was supported as a consultant for some work on the update.

Sources of support

Internal sources

  • American Academy of Arts and Sciences, USA.

  • Harvard Graduate School of Education, USA.

External sources

  • Smith-Richardson Foundation grant (to University of Pennsylvania), USA.

  • Mellon Foundation grant (to AAA&S, Center for Evaluation), USA.

  • Home Office, Research & Statistics Directorate (to Cambridge University), UK.

Differences between protocol and review

The main difference between the protocol and the review is that the protocol anticipated a range of outcomes (prevalence, incidence, severity and latency) at different time intervals, and the review only focused on prevalence outcomes (for example, percentage of youth in each group getting re-arrested) reported at first post-treatment follow-up.

Notes

A publication based on the preliminary results of the original review was published in A. Petrosino, C. Petrosino and J. Finckenauer, 2000, Crime & Delinquency, 46, 1, 354-79.

The review is published in both the Cochrane and the Campbell Libraries.

Characteristics of studies

Characteristics of included studies [ordered by study ID]

Cook 1992

Methods: Quasi-random assignment - researchers numbered court files and assigned all odd-numbered ones to the intervention group
Participants: 176 juvenile delinquents ages 12-16 years under jurisdiction of 1 Mississippi county youth court, 36% white, 100% male
Interventions: Educational, prisoner-run 5-hour session, designed to be nonconfrontational
Outcomes

12 and 24 months' follow-up of official court record data, average offending rates and severity of offense

School attendance and school dropout

Notes: The attrition gives us cause for concern, particularly with no tests for equivalence. But the major problem with the study is the failure of the investigators to report the standard deviations necessary for the meta-analysis. No standard deviations were reported with any mean data and no group percentages were given; attempts to retrieve these data from the author and other primary documents failed. All available data seem to indicate a slightly negative impact of the program on crime measures
Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | High risk | Quasi-random allocation using odd-even assignment of case files (with initial numbering quasi-random - all cases numbered consecutively). Some breakdown is reported but actual percentage is unknown; cases were dropped. No test for equivalence reported before or after attrition
Allocation concealment (selection bias) | Unclear risk | Not reported
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not done
Blinding of outcome assessment (detection bias), all outcomes | Unclear risk | Data retrieved from court system. No other information provided
Incomplete outcome data (attrition bias), all outcomes | High risk | 24% lost in follow-up, no analysis to ensure groups still equivalent
Selective reporting (reporting bias) | Low risk | Master's thesis with many findings in it, including the negative result for the intervention. The type of data reported could not be included, however, in the analyses
Other bias | Low risk | No problems reported with implementation

Finckenauer 1982

Methods: Random assignment
Participants: 81 delinquents or children at risk for delinquency, ages 11-18 years; 50% had a prior record of offending, 40% were white, 80% male
Interventions: 1 visit, a confrontational rap session lasting approximately 3 hours with inmates serving life sentences
Outcomes

6-month follow-up of official complaints, arrests or adjudications. Severity of offense

Attitudes:

  • toward criminals

  • toward crime

  • toward law

  • toward justice

  • toward police

  • toward prison

  • toward punishment

  • self-image

Notes: Randomization breakdown is cause for concern. The principal investigator does report additional analyses for agencies that followed protocol: 31% of the experimental group recidivated compared to 17% of the control group
Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | High risk | Randomization broke down: 6 of the 11 referral agencies violated the assignment protocol. Test for equivalence showed that 59% of the experimental group had a prior record, compared with only 40% of the control group
Allocation concealment (selection bias) | Unclear risk | Not described
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not done
Blinding of outcome assessment (detection bias), all outcomes | Low risk | Researchers collected the data from court files, not program staff
Incomplete outcome data (attrition bias), all outcomes | Low risk | None reported
Selective reporting (reporting bias) | Low risk | It seems unlikely given the full-length book treatment and the number of findings reported
Other bias | Low risk | No problems with implementation reported

GERP&DC 1979

Methods: Random assignment
Participants: 161 delinquents or children at risk for delinquency, 100% male, 84% white, ages 13-18 years
Interventions: Confrontational rap session with inmates
Outcomes

5-15 months' follow-up of contacts with police

Piers Harris Children's Self-Concept Scale

Jesness Inventory

Notes: Nothing in the report seems to indicate that the findings should be questioned
Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | Unclear risk | Random assignment, no further information
Allocation concealment (selection bias) | Unclear risk | No other description of randomization
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not done
Blinding of outcome assessment (detection bias), all outcomes | Unclear risk | Study relied on subsequent police reports, but no information provided on blinding of outcome assessors
Incomplete outcome data (attrition bias), all outcomes | Low risk | No attrition reported
Selective reporting (reporting bias) | Low risk | Unknown
Other bias | Low risk | No implementation problems reported

Lewis 1983

Methods: Random assignment
Participants: 108 juvenile delinquents from 2 California counties, most with extensive prior record, ages 14-18 years, 100% male, mostly non-white
Interventions: Total 3 visits (1 per week) including confrontational rap sessions, guided tours of prison and interaction with prisoners, review of pictures of prison violence
Outcomes

12-month follow-up of percentage arrested, average number of arrests, percentage charged, average number of charges, charges by type of offense, offense severity, time to first arrest

Attitudes:

  • toward police

  • toward school

  • toward crime

  • toward prison

  • toward work camp

Semantic Differential Test

Notes

Over 100 moderating analyses performed on the data

There is nothing in the study report to support any lack of confidence in the observed findings

Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | Low risk | Test for equivalence is satisfactory but age slightly favors the experimental group
Allocation concealment (selection bias) | Unclear risk | Not stated
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not used
Blinding of outcome assessment (detection bias), all outcomes | Unclear risk | 2 researchers collected court data. Unknown if they were blind to youth conditions
Incomplete outcome data (attrition bias), all outcomes | Low risk | 40% of an already small sample lost in follow-up, leaving 32 in the study
Selective reporting (reporting bias) | Low risk | This was based on the authors' Master's thesis. Many results reported, including null findings for intervention. But the outcome data could not be used in the subsequent analysis
Other bias | Low risk | No implementation problems reported

Locke 1986

Methods: Random assignment
Participants: 53 juvenile delinquents ages 14-19 years on probation from 3 Kansas counties, 65% white, 100% male
Interventions: Non-confrontational, educational interaction, tried to match juvenile with inmate
Outcomes: Minimum 6-month follow-up of self-reported crime and juvenile court and police records of official offending
Notes

No standard deviations reported with any mean data, no group percentages, attempts to retrieve these data from author and other primary documents failed.

The study appears to have severe attrition, limiting our confidence. The principal investigator reported no effect for treatment but did not provide enough data for computation of odds ratios or weighted mean differences

Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | Low risk | Randomization used, test for equivalence satisfactory (though not stated if done after attrition)
Allocation concealment (selection bias) | Unclear risk | Not stated
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not used
Blinding of outcome assessment (detection bias), all outcomes | Unclear risk | 2 researchers collected court data. Unknown if they were blind to youth conditions
Incomplete outcome data (attrition bias), all outcomes | High risk | 40% of an already small sample lost in follow-up, leaving 32 in the study
Selective reporting (reporting bias) | Low risk | This was based on the authors' Master's thesis. Many results reported, including null findings for intervention. But the outcome data could not be used in the subsequent analysis
Other bias | Low risk | No implementation problems reported

Michigan D.O.C. 1967

Methods: Assignment using random numbers table, data collectors were blind to assignment
Participants: 60 juvenile delinquents from 1 Michigan county
Interventions: 2 tours of a Michigan reformatory
Outcomes: 6-month follow-up of official petition for delinquency or probation violation
Notes

Brief internal report that does not fully describe nature of intervention

Juvenile home records used in follow-up; data investigators were blind to group allocation

The troubling aspect is the failure to conduct a test for equivalence, particularly with only 60 total persons assigned. Nonetheless, there is nothing else to question the observed findings

Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | Unclear risk | Random numbers tables used to allocate, no test for equivalence reported
Allocation concealment (selection bias) | Unclear risk | No description of concealment of allocation
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not possible to blind participants or personnel
Blinding of outcome assessment (detection bias), all outcomes | Low risk | Juvenile home records used in follow-up; data investigators were blind to group allocation
Incomplete outcome data (attrition bias), all outcomes | Low risk | Only 2 participants lost
Selective reporting (reporting bias) | Low risk | Given the report was done by the Michigan Department of Corrections, and this was their program, it is highly unlikely they would choose to only report one negative finding
Other bias | Low risk | No implementation problems reported

Orchowsky 1981

Methods: Random assignment
Participants: 80 juvenile delinquents (with minimum 2 offenses), ages 13-20 years, 100% male
Interventions: Confrontational, inmate-run program, locked in cell, introduction by guard, 2-hour session with inmates
Outcomes: 6-, 9- and 12-month follow-ups of official measures of offending including new court intakes, average number of court intakes, severity of offense
Notes: The massive attrition at 9 and 12 months also corresponds with positive results reported for the program after negative impact at 6 months. However, the tests for equivalence seem to indicate the groups were still comparable
Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | Low risk | Random assignment used, test for equivalence satisfactory
Allocation concealment (selection bias) | Unclear risk | No description of concealment
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not done
Blinding of outcome assessment (detection bias), all outcomes | Unclear risk | Juvenile court intake data were the primary source, but there is no description of how they were collected
Incomplete outcome data (attrition bias), all outcomes | Low risk | The study drops 41% at 9 months and 55% at 12 months; the principal investigator reports that tests for equivalence at 9 and 12 months are satisfactory. We rate this as low risk because at first follow-up, there was little attrition
Selective reporting (reporting bias) | Low risk | Not likely given this is a government evaluation of its own program, and the results at the first follow-up are not positive
Other bias | Low risk | No implementation problems reported

Vreeland 1981

Methods: Randomly assigned to 1 of 4 groups
Participants: 160 juvenile delinquents given probation by Dallas County Court, 100% male, 40% white, ages 15-17 years, averaged 2 or 3 prior offenses
Interventions: 1-day orientation lasting 13 hours, including haircut and physical labor
Outcomes

6-month follow-up of official (using court records) and self-reported data to establish percentage offending

Attitude toward Law
Friend Survey
Deterrence questionnaire
Self-image
Jesness Checklist

Notes

To remain consistent with other interventions in this review, we took the orientation group comparison with the no-treatment control group. However, the orientation plus counseling group was almost identical to the orientation only group in final results

There is nothing in the report to lead us to question the findings

Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | Low risk | Random assignment used, test for equivalence satisfactory
Allocation concealment (selection bias) | Unclear risk | No description provided
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not done
Blinding of outcome assessment (detection bias), all outcomes | Unclear risk | Used court data and self-report, no other information provided
Incomplete outcome data (attrition bias), all outcomes | Low risk | No attrition for the 2 groups (of the 4 in the experiment) reported
Selective reporting (reporting bias) | Low risk | This was a doctoral dissertation, and the study includes an array of data and analyses
Other bias | Low risk | No implementation problems reported

Yarborough 1979

Methods: Researchers randomly assigned participants according to random numbers table
Participants: 227 juvenile delinquents under jurisdiction of courts in 4 Michigan counties
Interventions: Tour of facility, separated and taken to cell for interaction with inmates, confrontational session with inmates, 1 visit of 5 hours' duration
Outcomes: 3- and 6-month follow-ups of official juvenile crime as measured by subsequent court petitions, new offenses, average offense rate, weeks to new offense, type of offense charged, average days in detention
Notes

Extensive moderating analyses done

The no-shows and the lack of attention to them in the report are concerning. Again, nothing in the report suggests anything other than a null or slightly negative effect for JOLT

Risk of bias
Bias | Authors' judgement | Support for judgement
Random sequence generation (selection bias) | Low risk | Research unit handled random assignment, good protocol in place, test for equivalence satisfactory
Allocation concealment (selection bias) | Unclear risk | Not described
Blinding (performance bias and detection bias), all outcomes | Unclear risk | No description of monitoring of control group to determine if compensation was an issue. Probably unlikely that the control group received anything else but not specifically addressed
Blinding of participants and personnel (performance bias), all outcomes | High risk | Not done
Blinding of outcome assessment (detection bias), all outcomes | Unclear risk | Researchers collected data from court files but unknown if blind to conditions
Incomplete outcome data (attrition bias), all outcomes | High risk | The study has many no-shows who are dropped from analysis
Selective reporting (reporting bias) | Low risk | Government agency reported a negative result for its own program
Other bias | Low risk | No implementation problems reported

Characteristics of excluded studies [ordered by study ID]

Study | Reason for exclusion
Ashcraft 1970 | Used a pre-post test without a control group
Bazemore 2004 | Used a matched comparison group without randomization
Berry 1985 | Used a matched comparison group without randomization
Blunkett 2008 | No randomization, pre-post measures, or appropriate outcomes
Brodsky 1970 | Used a pre-post design without a control group
Buckner 1983 | Used a matched comparison group without randomization
Chesney-Lind 1981 | Used a nonequivalent comparison group design without randomization
Dean 1982 | Used randomization but did not include any measures of criminal behavior
Feinstein 2005 | Did not include outcome measures relevant to this review
Gilman 1977 | Used archival data from 3 sources for post-test only follow-ups without a control group
Langer 1980 | Used a matched comparison group without randomization
Lloyd 1995 | Case studies of 3-day visit programs in the UK. No control group is included
Mitchell 1986 | Used pre-post data without a control group
Muhammed 1999 | Used post-test data only with no control group
Nelson 1991 | Used post-test only data without a control group
NSW BoS 1980 | Used post-test only data without a control group
Nygard 1980 | Report on process and implementation data only. No follow-up or control group reported
O'Malley 1993 | Process and implementation data on Australia's Victoria prison program. No control group
Portnoy 1986 | This study randomly assigned juveniles from high school to watch the Scared Straight video or a more neutral film. It did not involve the actual program. No follow-up data on criminal offenses were reported
Rasmussen 1996 | Used multivariate regression on county crime rates to estimate prevention impact of program, no control group or randomization employed
Shapiro 1978 | Used post-test only data without a control group
Storvoll 1998 | Process and implementation data are reported on Norway's Scared Straight program. No follow-up or control group included
Trotti 1980 | Used post-test data of reactions of participants, without a control group
Wilson 2010 | Inappropriate follow-up data, no randomization, no pre-post measures
Windell 2005 | Descriptive report without adequate evaluative data or methods

Ancillary