Improving Scientific Judgments in Law and Government: A Field Experiment of Patent Peer Review

Many have advocated for the expansion of peer review to improve scientific judgments in law and public policy. One such test case is the patent examination process, with numerous commentators arguing that scientific peer review can solve informational deficits in patent determinations. We present results from a novel randomized field experiment, carried out over the course of three years, in which 336 prominent scientific experts agreed to provide input on U.S. patent applications. Their input was edited for compliance with submission requirements and submitted to the U.S. Patent and Trademark Office (USPTO) by our research team. We show that the intervention caused examiners to (i) increase search efforts and citations to the non-patent (scientific) literature and (ii) grant the application at lower rates in the first instance. However, results were substantially weaker and resource costs substantially higher than anticipated in the literature, highlighting significant challenges and questions of institutional design in bringing scientific expertise into law and government.


I. Introduction
One of the principal rationales for government agencies is expertise. Much of that expertise is scientific. Agencies such as the National Institutes of Health (NIH) and the National Science Foundation (NSF) rely critically on peer review to allocate scientific grants. Scholars, commentators, and policymakers have also advocated for greater reliance on peer review in other regulatory domains (Noah, 2000; Ruhl and Salzman, 2006; Shapiro and Guston, 2006), such as food safety (Kessler, 1984), environmental protection (National Research Council, 2000), education (National Research Council, 2004b), and performance measurement (Kostoff, 1997). In turn, government agencies have increasingly relied on scientific peer review (Guston, 2003) and are required to subject "influential scientific information" to peer review prior to publication (Office of Management and Budget, 2005). Others have challenged the desirability of such expansion, citing potential costs in delay, weakened public participation, cherry-picking of reviewers, and crowding out of normative (as opposed to scientific) judgments (Doremus, 2007; Grimmer, 2005; Fein, 2011; Virelli III, 2009; Wymyslo, 2009). Whether peer review functions as intended even within scientific domains remains empirically unclear (Bornmann, 2011; Cole et al., 1981; Jefferson et al., 2002; Li and Agha, 2015; Smith, 2006; Rennie, 2016; Bohannon, 2013).
One area of significant contestation lies in the patent examination system. This system determines which innovations receive the legal benefits of a patent and is an archetype for government scientific gatekeeping. The five largest patent offices worldwide (in the United States, Europe, China, Japan, and Korea) collectively employ over 27,000 patent examiners, who are tasked with evaluating over 2.7 million patent applications filed each year (European Patent Office et al., 2018).
These decisions can have massive implications for science, innovation, and the economy. The U.S. Patent and Trademark Office (USPTO), for instance, estimates that IP-intensive industries added $6.6 trillion to U.S. GDP in 2014 (U.S. Patent & Trademark Office, 2016).
The patent system is also seen to suffer from significant informational challenges. As we document below, examiners have limited experience, time, and search capacity. Many commentators have hence advocated for peer review to improve patent examination (Noveck, 2006; Graf, 2007; Kao, 2007; Biagioli, 2007; Fromer, 2009; Ouellette, 2012; Atal and Bar, 2014; Ouellette, 2016). Yet to date, no rigorous test of peer review has been conducted for this or any other system of informal adjudication.[1] Our study fills this gap by providing rigorous causal evidence on the effect of patent peer review by external scientific experts. We designed an unexpectedly resource-intensive, three-year-long field experiment in which top scientific experts provided input on randomly selected pending U.S. patent applications. Our results show that peer review increased examiner search efforts and citations to non-patent literature and reduced the propensity to initially grant the application. That said, the results were surprisingly weaker than the literature has suggested and highlight profound challenges in bringing scientific expertise into legal institutions. In particular, a significant time investment was required from our research team to translate the experts' input into a form that was compliant with USPTO requirements and could be used by patent examiners. As we spell out, our results have considerable implications for innovation policy specifically and the expanded use of peer review in government more generally, where the evidence base is exceptionally thin (Ho, 2017; Ho and Sherman, 2017).

II. Institutional Setting
We first provide details on the institutional setting of the USPTO. These institutional constraints explain why so many commentators have argued that external peer review can address core problems of patent quality.
The USPTO employs over 8,000 patent examiners. Their principal responsibility is to determine whether each legal "claim" in an application is novel and nonobvious in light of earlier publications ("prior art"), and whether the application discloses sufficient information about how to make and use the claimed invention. The burden is on the patent examiner to identify a proper legal basis for rejecting a patent claim; otherwise, it must be allowed. If the examiner does reject a claim, the applicant can respond (over an indefinite number of rounds) with either legal arguments or amendments to the claim.
There are several reasons to believe that scientific input may benefit patent determinations.
First, many examiners have little experience in the technical fields they examine (National Research Council, 2004a). Only a bachelor's degree in science or engineering is required, even though applications present innovations at the forefront of scientific fields.[2] Due to high attrition, most examiners at the USPTO have been there for less than four years (Lemley and Sampat, 2012).
Second, it is well known that patent examiners are less adept at drawing on non-patent scientific literature (Lemley and Sampat, 2012), despite this literature constituting the primary basis for reporting scientific findings. Third, examiners have limited time to review applications. On average, an examiner has nineteen hours to review an application, research prior art, and write rejections and responses to the applicant's arguments (Frakes and Wasserman, 2017). Applications must be granted if examiners cannot identify a proper basis for rejection within this time window.
Due to these constraints, patent examination faces significant quality-control problems, particularly with improperly granted patents (National Research Council, 2004a; Frakes and Wasserman, 2017). As an indicator of this quality problem, the likelihood that a patent will be granted depends heavily on the (quasi-randomly assigned) examiner (Sampat and Williams, 2019).
These institutional constraints explain why many scholars have argued for peer review in patent examination (Noveck, 2006; Graf, 2007; Kao, 2007; Biagioli, 2007; Fromer, 2009; Ouellette, 2012; Atal and Bar, 2014; Ouellette, 2016). Just as reviewers for scientific journals can help editors by identifying prior publications that undermine the asserted novelty of a manuscript, external scientific experts may be able to help patent examiners by identifying the most relevant prior art, leading to greater accuracy and consistency in examination outcomes. External reviewers may be particularly well suited for identifying non-patent prior art.[3] On this theory, the USPTO piloted a "Peer to Patent" program in 2007-09, which allowed applicants to opt in to a system for crowdsourcing prior art (Noveck, 2006; Allen et al., 2012). The pilot generated an average of 2.7 prior art submissions for each of the 226 eligible applications, and 38 of these applications were rejected based on submitted references (Allen et al., 2012). While supplying an important proof of concept, the pilot provides only limited evidence of the effectiveness of peer review. First, submissions were allowed for all participating applicants, with no comparison group. Absent the crowdsourced submissions, for instance, examiners might still have found a basis to reject the same number of applications. Second, the pilot focused on crowdsourcing rather than on expert assessment through more conventional scientific peer review.

Footnote 2: Examiners' education levels vary across technologies; for example, examiners for biotechnology and organic chemistry-related inventions are more likely to hold master's and doctoral degrees (Vishnubhakat and Rai, 2015). Our experiment is not sufficiently powered to examine the effects of peer review in different technology classes, but this is an important question for future work.
Third, because patent applicants and reviewers elected whether or not to participate, it is unclear how the intervention would scale to a more representative set of applications and reviewers (Doremus, 2007;Wymyslo, 2009).
What is hence missing in the literature is a rigorous evaluation of the feasibility, benefits, and costs of peer review.

III. A Novel Field Experiment
After early explorations of a joint pilot with the USPTO stalled, we designed a novel field experiment that allowed us to test the effects of external peer review without requiring direct USPTO participation. The key insight is that the America Invents Act of 2011 (AIA) liberalized a statutory provision for third-party submissions to pending patent applications. Most patent applications are published eighteen months after filing. Within six months of an application's publication, any third party may submit relevant prior art, along with a concise description of relevance, for consideration by patent examiners (35 U.S.C. § 122(e)). The submission may not include legal conclusions, such as whether a claim is obvious or not novel. Based on a retrospective analysis of applications from 2012-16, third-party submissions were made for fewer than 0.1% of eligible applications and appear usually to be made by parties with competing interests. Survey evidence from patent examiners suggests that these submissions can be useful and time-saving (Kapelner et al., 2013). We hence leveraged this statutory provision to design a field experiment in peer review. Our intervention amounted to running a pro bono scientific journal customized to the USPTO docket, along with a randomized control group of (paired) applications not subjected to peer review.

Footnote 3: Note that external peer review, analogous to review at a scientific journal, should be distinguished from internal peer review through teamed examination (Ho, 2017), supervisorial review (Ho and Sherman, 2017), or informal peer effects (Frakes and Wasserman, 2015). The European Patent Office already involves significant collaboration and peer review in the examiner training process (Lahorte, 2018).

A. Experimental Design
The five stages of our randomized controlled trial (RCT) are summarized in Figure 1. To pilot the experiment, we recruited 25 experts known to the authors, of whom 13 agreed to participate and were matched with patent applications, and 10 successfully completed their review. Interactions with the pilot experts were used to develop the frequently asked questions and the review submission form described in the full protocol below.
Stage 1: Expert Identification. We sought to identify experts who would have substantial knowledge of the patenting process and also be acculturated to academic peer review. To do so, we used USPTO data to identify patent inventors with academic affiliations. We started with data on assignees of patents from 1976 to 2016, as academics are typically required to assign patents to universities. From the set of the top 5,000 assignees, we manually tagged all universities. We then identified all inventors with a patent indicating a university affiliation. To balance representation across fields, we then identified the experts with the most patents in six technical fields, as categorized by the National Bureau of Economic Research (NBER): drugs and medical, chemical, electrical and electronics, computers and communications, mechanical, and other types of patents. Our initial list of experts hence represents the 300 inventors with the highest number of university-assigned patents within each NBER category in the past ten years. The thresholds for inclusion were: 13 patents in drugs and medical, 9 in chemical, 9 in electrical and electronics, 7 in computers and communications, 3 in mechanical, and 2 in the residual "other" category.

[Figure 1. Flowchart of the experiment. Identification: experts in the top 300 university patenters in the six NBER patent categories for the last 10 years identified (n=1,800); experts for whom no email was found excluded (n=110). Recruitment: experts contacted by personal email explaining the purpose of the research and asking whether they would be willing to participate (n=1,476); experts did not respond to the solicitation or declined to participate (n=1,140). Matching: experts who agreed to participate matched to two patent applications related to their area of expertise, with one application randomized for review (n=336); some experts rematched upon request or when an application had already been matched to another expert (n=54). Gathering responses: Qualtrics form sent to experts with the assigned patent application to submit prior art for third-party submission (n=336); some experts did not respond to the form despite follow-up emails (n=190); experts submitted the form with between 1 and 3 pieces of prior art, as well as a description of its relevance (n=146). Submission: submissions rewritten into claim charts that would be accepted by the USPTO (n=146); some submissions not sent to the USPTO because they were received too late, were not of high enough quality, or the expert found no relevant prior art (n=14); submission sent to the USPTO but not accepted (n=1); third-party submissions sent to the USPTO and accepted (n=131).]
After we identified these experts, we conducted online searches to find contact information for each expert, including their current organization, email address, website, and title (Prof., Dr., etc.). We excluded experts for whom we were unable to locate contact information or who were determined to be emeritus faculty, deceased, or to no longer have a research affiliation.
Stage 2: Recruitment. From June 2016 through July 2017, we personally solicited experts by email, inquiring about their willingness and ability to participate and providing answers to frequently asked questions about our study (Figure 4). For the full trial, we contacted 1,451 experts from the population identified in stage one, and 323 experts agreed to participate. When added to the 13 pilot experts, this resulted in 336 experts who were matched with patent applications in the next stage. The overall participation rate (matched experts out of recruited experts) was thus 23%.
Stage 3: Matching. Once experts opted to participate, we constructed a search process to match two patent applications to each expert, enabling us to randomly assign one patent for peer review.
We assembled a research team that included the principal investigators, patent agents, a former patent examiner, and patent law students, with technical backgrounds covering each of the NBER categories. Such substantive coverage enabled us to engage in a fine-grained search to match applications to specific interest by experts.
We provided our research team with each expert's website, publications, patents, and any keywords supplied to us by the expert. Using this information, team members engaged in extensive searching of patent applications to match each expert with two recently published patent applications in their specific field. We used a Qualtrics survey form to guide the matching process within our research team (Figure 5). Given the large number of pending patent applications, it was possible to find applications very closely related to the areas of expertise of the vast majority of experts. To provide an example, one expert had a prior patent for a piezoelectric energy harvester, and the two pending applications matched to that expert were for (1) an "energy harvester" that "may be a piezoelectric energy harvester" and (2) a "piezoelectric energy harvesting array." When it was not possible to find proximate matches, we conducted the search again after a week to allow new patent applications to be published.
The main constraints on the matching process were timing, proximity of the match, and expert overlap. First, third-party submissions cannot be made more than six months after an application is published or after it receives its first substantive office action either rejecting or allowing any claims.
Research team members were thus instructed to look for applications that had been published within the prior three months (to allow for a sufficiently large pool of applications) and were ready for examination but had not yet had a substantive office action. A primary claim was identified for the expert to address, which was by default claim 1 or the first non-cancelled claim. In a few cases a different claim was identified as the primary claim for the expert to address because it seemed more closely related to their area of expertise. Second, the principal investigators reviewed each set of matches. In a number of instances, team members were asked to refine their search process to provide a more proximate match for the expert. A small number of experts requested a different application that was more closely related to their expertise. Third, patent applications needed to be unique among all experts so that one application could not be randomized to receive treatment from two different experts. Some patent applications were matched to two different experts, which required rematching.
Due to these constraints, the matching process turned out to be time consuming, taking an average of about two hours per expert for team members.
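The eligibility constraints above can be expressed as a simple filter. The sketch below uses hypothetical application records; the field names and the 90-day cutoff operationalizing "published within the prior three months" are our own illustrative choices, not the PEDS schema or the team's actual tooling.

```python
from datetime import date, timedelta

# Hypothetical application records; field names are illustrative only.
applications = [
    {"id": "16/000001", "published": date(2017, 5, 1),
     "first_office_action": None},
    {"id": "16/000002", "published": date(2016, 11, 1),
     "first_office_action": None},
    {"id": "16/000003", "published": date(2017, 6, 1),
     "first_office_action": date(2017, 6, 20)},
]

def eligible(app, today, already_matched):
    """Apply the three matching constraints described in the text."""
    recent = today - app["published"] <= timedelta(days=90)  # published in prior ~3 months
    no_action = app["first_office_action"] is None           # no substantive office action yet
    unique = app["id"] not in already_matched                # not matched to another expert
    return recent and no_action and unique

today = date(2017, 7, 1)
pool = [a for a in applications if eligible(a, today, already_matched=set())]
print([a["id"] for a in pool])  # only the first application qualifies
```

In practice, of course, the hard part was not this mechanical filter but the substantive judgment of proximity between an application and an expert's specialty, which is why the process averaged about two hours per expert.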
Stage 4: Gathering Responses. After matching each expert to two unique patent applications, we randomly selected one application to be sent to the expert for review. We emailed experts with the assigned patent application, instructions on searching for prior art (including answers to additional frequently asked questions), and an online web form for submitting the peer review.
Sample instructions and web form are displayed in Figures 6-8. We instructed experts to identify up to three pieces of relevant prior art (the limit for a free submission). The review form included prompts for experts to explain why their prior art was relevant to each claim element as well as questions on whether they felt the patent application should be granted and how many hours they spent on their submission. After multiple personal reminders, 146 experts completed these reviews. Due to nonrandom noncompliance, we focus on "intention-to-treat" effects among the 336 pairs of applications for patent dispositions. Due to the cost of manually reading through patent examination records, our results on non-patent literature search and citation rates below compare pairs of applications for which a successful submission was made.
Stage 5: Submission. As displayed in Figure 8, our web form was designed to guide experts to assess prior art in a fashion that would be compliant with USPTO rules. Despite our efforts and correspondence with experts, many submissions needed to be rewritten and reviewed. Our research team reviewed responses to ensure (a) accuracy of citations, (b) date compliance (i.e., that the date of publication for the prior art was before the priority date of the application), (c) compliance with USPTO rules against legal conclusions of patentability, and (d) an analysis of how the claims corresponded to the prior art. This revision took substantial time. First, the concise description of relevance submitted by experts often comprised no more than a general assertion of relevance. Second, experts often provided legal conclusions, stating plainly that a claim was obvious or not novel. Third, many experts simply located scientific references without analyzing their relevance to the claim. In short, most experts either did not understand the USPTO requirements or did not have the time to tailor their submission, in spite of their familiarity with the patenting system.
Even after revisions, some submissions did not have enough substance to submit to the USPTO,[4] and some experts failed to submit reviews in time for USPTO consideration. As a result, only 131 reviews were actually submitted to and accepted by the USPTO, a compliance rate of 39%.

Footnote 4: In some cases, this was because the expert found no relevant prior art. Although this null result seems like a useful signal for the USPTO (and for public confidence in the examination result), there is no legal mechanism for informing the examiner that an expert in the field believes the claims to be patentable. If the USPTO decides to experiment with expert peer review, the agency should consider how to use null search results, along with appropriate safeguards against shirking or gaming.

B. Data Sources
We relied on the USPTO's Patent Examination Data System (PEDS) API to collect information about applications after the third-party submission. PEDS includes information on application data, transaction history, patent term adjustments, and published documents, corresponding to the available tabs on the Public Patent Application Information Retrieval (PAIR) system. The data used for the analyses in this paper were last updated on June 17, 2019.[5] We manually collected information from each of the 336 treatment and control applications (672 applications total), including the number of claims, the number of figures, the number of tables, and the page length of the application. We also hand-collected the publication dates of the prior art submitted by experts.
To determine whether examiners cited or conducted searches based on our submitted prior art, or any non-patent literature (NPL), we read the documents associated with the first office action on the Public PAIR website. In particular, we coded information from the search information form (Form SRFW) and search notes (Form SRNT) that patent examiners complete to document their search for prior art. We determined whether the examiner searched an NPL database (Google Scholar, PubMed, Medline, SciSearch, Embase, Biosis, IEEE Xplore Search, Inspec, Ei Compendex) and whether any documents listed in the search notes came from NPL databases. For treatment applications, we also examined Form SRNT for whether the examiner conducted searches specifically based on the patents or non-patent literature included in the third-party submission. In a few instances, an examiner cited third-party NPL but did not list any NPL databases in the search strategy. We did not count these in the search rates, although it is possible that search histories are incomplete. In other instances, no Form SRNT and/or SRFW was posted, in which case we treated search outcomes as missing. A form included with the office action (Form 892) lists references cited by the examiner, but because that form is not always complete, we calculated citation rates by directly content coding the text of the examiner's first office action decision.
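The NPL-database check can be sketched as a simple string match against the list above. The search-note text below is made up for illustration; it is not actual Form SRNT content, and our actual coding was done by hand.

```python
# NPL databases we looked for in examiners' search notes (from the text above).
NPL_DATABASES = [
    "Google Scholar", "PubMed", "Medline", "SciSearch", "Embase",
    "Biosis", "IEEE Xplore", "Inspec", "Ei Compendex",
]

def searched_npl(search_notes: str) -> bool:
    """Return True if any known NPL database appears in the search notes."""
    lowered = search_notes.lower()
    return any(db.lower() in lowered for db in NPL_DATABASES)

# Made-up examples of search-note text.
print(searched_npl("EAST search; keywords in Google Scholar and PubMed"))  # True
print(searched_npl("EAST text search only"))                               # False
```

A hand-coding pass remains necessary for the harder judgments, such as whether a search was based on the specific prior art in a third-party submission.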
C. Balance Checks

Table 1 provides balance statistics of covariates for all applications matched to experts and for the subset of application pairs where the review was submitted to and accepted by the USPTO.
For all applications matched to experts, there are no significant differences between the treatment and control groups, as expected under randomization. Examining the subset of pairs where a third-party submission was successfully made, the sample remains largely balanced, but we do observe some imbalance.
The mean number of pages in the patent application is lower for the treatment group, likely because experts assigned a longer application were less likely to submit prior art, given the time required to read a lengthy application. Consistent with this explanation, the difference becomes statistically insignificant when the data are trimmed to exclude pairs in which the treatment and control page counts differ by more than 100. In addition, the proportion of applications examined under the first-to-file requirements established by the AIA differs significantly. In investigating this result, we found that the mean number of pages for applications not examined under the AIA was 21.2, compared with 15.0 for applications examined under the AIA. The proportion of applications that are continuations also differs significantly between the treatment and control groups when looking only at the applications for which experts submitted prior art. Continuation applications take the priority date of their parent application, so the priority date precedes the actual filing date. AIA status is determined by the priority date: applications with a priority date before March 16, 2013 are not evaluated under the AIA requirements. This difference is hence related to AIA status, as continuation applications are less likely to fall under the AIA requirements.

Notes to Table 1: Comparing the treatment and control groups across the pretreatment variables for all matched patent applications shows no significant differences between the two groups. When comparing the treatment and control groups using only pairs where the third-party submission was returned by the expert and accepted by the USPTO for the treatment application, some differences are introduced in the proportion of applications examined under AIA rules, the proportion of applications that are continuations, and the mean number of pages.
For brevity, Table 1 lists only the most prevalent technology centers.

IV. Results

Because of noncompliance with treatment assignment, we focus on the intention-to-treat (ITT) effect when outcomes are observable for all units. The ITT test statistic is the mean difference based on intended assignment to treatment or control, regardless of actual submission by experts.
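As a minimal sketch of the ITT comparison (on simulated outcomes, not the study's data; the 10% and 14% rates used to simulate are taken from the results reported below):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated illustration: one row per matched pair; within each pair, one
# application was assigned to treatment (expert review) and one to control.
n_pairs = 336
granted_treat = rng.binomial(1, 0.10, n_pairs)  # first-action allowance, treatment
granted_ctrl = rng.binomial(1, 0.14, n_pairs)   # first-action allowance, control

# ITT test statistic: mean difference based on intended assignment,
# regardless of whether the expert actually completed a submission.
itt = granted_treat.mean() - granted_ctrl.mean()
print(f"ITT estimate of first-action allowance difference: {itt:.3f}")
```

The point of the ITT framing is that experts' noncompliance (failing to complete a review) is itself an outcome of assignment, so conditioning on it would break the randomization.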

A. Benefits of Patent Peer Review
Where outcomes are only observable for submissions, we focus on the effect within the sample where submissions were accepted. While this means that there are slight imbalances in this sample (see Table 1), those imbalances should, if anything, cut against finding increased non-patent citation and search rates, as the longer page counts in the control group provide more potential references to investigate.
These results demonstrate that expert submissions, as translated by our research team, caused examiners to increase search efforts and citations to non-patent literature (NPL), which has been seen as a blind spot in the examination process (Lemley and Sampat, 2012). For the applications that had a successful third-party submission and their corresponding matched applications, we collected data on whether NPL was included in the submission, whether the examiner searched for something related to the submitted prior art, whether the examiner cited any of the submitted prior art in a rejection decision, and whether the examiner searched for or cited any NPL at all, even if not from the third-party submission. While examiners cited non-patent literature in 23% of control applications, the citation rate rose to 37% in the treatment applications (p = 0.01). The search records reveal that examiners conducted searches based on the specific prior art our experts submitted in one-third of cases. The right two panels of Figure 2 plot results from randomization inference, a nonparametric test that accounts for the pairwise treatment randomization mechanism (Imbens and Rubin, 2015). The observed citation and search differences (vertical lines) fall in the right tails of the randomization distributions: the search and citation rates of NPL are higher than would be expected under the sharp null hypothesis of no treatment effects.
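The pairwise randomization inference can be sketched as follows, on simulated data (the 37% and 23% rates echo the citation results above; the outcomes are otherwise made up). Under the sharp null of no treatment effect, the treatment label within each matched pair can be flipped freely, which generates the reference distribution for the difference in means.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated outcomes: column 0 is the treated unit of each pair, column 1 the
# control unit (1 = examiner cited non-patent literature).
n_pairs = 131
cited_npl = np.column_stack([
    rng.binomial(1, 0.37, n_pairs),  # treatment
    rng.binomial(1, 0.23, n_pairs),  # control
])

observed = cited_npl[:, 0].mean() - cited_npl[:, 1].mean()

# Under the sharp null, swapping labels within a pair just flips the sign of
# that pair's difference, so we draw random sign flips for each pair.
n_draws = 10_000
signs = 1 - 2 * rng.integers(0, 2, size=(n_draws, n_pairs))  # +1 keep, -1 swap
pair_diffs = cited_npl[:, 0] - cited_npl[:, 1]
null_dist = (signs * pair_diffs).mean(axis=1)

# One-sided p-value: how often null draws match or exceed the observed difference.
p_value = (null_dist >= observed).mean()
print(f"observed diff = {observed:.3f}, randomization p = {p_value:.4f}")
```

This respects the design exactly: only assignments that the pairwise randomization could actually have produced enter the null distribution.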
Our evidence is weaker on the effects on the "first office action" (the first of potentially several rounds of decisions after an application is submitted), which can take years to observe. Prior evidence establishes that examiners are more likely to erroneously grant invalid patents than reject valid ones, with an overall grant rate around 68% (Frakes and Wasserman, 2017). We would hence expect that greater scientific expertise should reduce the grant rate. Of our experts submitting reviews, 80% did not think a patent should issue. We find that examiners granted applications on the first office action for 10% of the treatment group, compared to 14% for the control group.
Although the difference is only borderline statistically significant (p = 0.06), this effect is of substantial magnitude, given that very few patent applications are granted in the first office action.
The left panel of Figure 2 plots the randomization distribution of the difference in the proportion of applications allowed at the first office action; the observed difference falls in the left tail of this distribution. As an additional benchmark, we can compare the observed allowance rate in the treatment group to the rate across all utility patent applications from January 2014 to June 2017. Of 755,666 applications with recorded first office actions, 13.57% had a notice of allowance as the first office action, compared to 10% in the treatment group.

Notes to Figure 2: The first substantive office action taken by the examiner could be a notice of allowance (granting the patent), a non-final rejection (allowing the applicant to respond with arguments or claim amendments), or a final rejection (which still allows the applicant to request continued examination). Difference-in-means test results, where applicable, are shown for the first-action allowance rate, the rates at which the examiner listed a search of a non-patent literature (NPL) database or cited any NPL (including without listing a search), and the rates at which the examiner conducted a search based on or cited at least one piece of submitted prior art.
Last, to check whether our inferences are sensitive to covariate imbalance, the bottom row of Figure 2 presents covariate-adjusted randomization inference, where we adjust the test statistic using logistic regression controlling for the number of pages, AIA status, and continuation status.
We find comparable results for the allowance rate (p-value = 0.05), the NPL search rate (p-value = 0.04), and the NPL citation rate (p-value = 0.01).
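The covariate-adjusted randomization inference can be sketched as follows, on simulated data. The paper adjusts the test statistic using logistic regression; for a dependency-free illustration we instead residualize the outcome on the covariates with a one-shot linear fit, a simplification that conveys the same idea of re-randomizing within pairs after covariate adjustment. All numbers below are simulated, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pairs = 131

# Simulated covariates for treated (column 0) and control (column 1) units:
# page count, AIA status, continuation status.
pages = rng.poisson(18, (n_pairs, 2)).astype(float)
aia = rng.binomial(1, 0.6, (n_pairs, 2)).astype(float)
cont = rng.binomial(1, 0.3, (n_pairs, 2)).astype(float)
outcome = rng.binomial(1, 0.12, (n_pairs, 2)).astype(float)  # first-action allowance

# Residualize the outcome on the covariates (linear adjustment for illustration;
# ravel/reshape preserve the pair structure because the order is row-major).
X = np.column_stack([np.ones(2 * n_pairs), pages.ravel(), aia.ravel(), cont.ravel()])
y = outcome.ravel()
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = (y - X @ beta).reshape(n_pairs, 2)

# Adjusted test statistic: mean within-pair difference of residuals.
pair_diffs = resid[:, 0] - resid[:, 1]
observed = pair_diffs.mean()

# Re-randomize within pairs under the sharp null, as before.
signs = 1 - 2 * rng.integers(0, 2, size=(10_000, n_pairs))
null_dist = (signs * pair_diffs).mean(axis=1)
p_value = (np.abs(null_dist) >= abs(observed)).mean()
print(f"adjusted diff = {observed:.4f}, randomization p = {p_value:.3f}")
```

The adjustment absorbs outcome variation explained by the imbalanced covariates, so the permutation test gains precision without sacrificing its design-based validity.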

B. Challenges in Scientific Translation
While our results suggest that external peer review can improve the patent system, the design and administration of our peer review system also illustrate the institutional challenges in bridging science and law. Although reviewers (a) were inventors on multiple patents, (b) largely had academic affiliations and hence familiarity with academic peer review, and (c) affirmatively opted in to patent peer review, many still found it challenging to interface with the patent system in practice. Our surveys asked all experts to provide feedback on their engagement, which highlighted three common challenges.
First, many experts were unable to complete their review. Over 60% of experts who said they were willing to review an application failed to complete that review, reducing statistical power. Of those who provided an explanation for the failure, the vast majority reported a lack of time given the perceived difficulty of the task. For example, one had not anticipated that patent peer review would turn out to be "even more complex than peer review of manuscripts." This evidence is consistent with reports from other contexts about the complexity of peer review in regulatory settings (Doremus, 2007).
Second, many experts who did complete reviews reported struggling to translate their expertise into a form useful for the patent examination process. They described the process as "laborious," "annoying," and "quite a lot less satisfying than reviewing a paper." The most challenging aspect was translation between science and the legal jargon of patent claims. Experts complained of the "legalese" that was "difficult to understand" and reported "immense trouble reading and understanding claims." One expert noted having "no idea what [a particular claim term] meant." Despite our guidance, experts often failed to focus on the legally cognizable claim language, frequently assessed only part of a claim, and rarely grappled with more than one claim. Many experts, like the one who "did not consider [dependent claims] since [the] independent claim is so weak," did not understand that having even one element that is distinct from the prior art may be sufficient for patentability: dependent claims may be valid even if the broader independent claim on which they depend is rejected. Others described a patent as obvious with descriptions that seemed drawn more from the title and abstract than from the detailed claim language.
Third, contrary to descriptions in the literature, designing and operating the peer review process required a surprisingly large time investment. Matching experts with suitable applications took an average of about two hours per expert. Experts reported spending an average of just under three hours on their submissions (mean = 2.93, SD = 1.39) and commented that the process was "a lot more work than reviewing a peer reviewed article" and would "need to be reimbursed at a very high hourly rate." Most time-consuming was the process for our research team to edit and translate the reviews into a form cognizable under patent law. Under a conservative estimate, our intervention involved over 1,500 hours, excluding time spent designing the system, training research assistants, and analyzing the results. The relevant policy question is not whether this intervention had a measurable benefit (which it did) but whether that benefit is outweighed by the considerable cost.
To formally illustrate the cost of translating expert reviews, we employed a patent agent and a former patent examiner to score the quality of both the expert's original review and our revised USPTO submission for each of the 131 successful submissions. We used a five-point scale, with five indicating the highest quality submission and one indicating the lowest. Any review coded as a one was not submitted to the USPTO. Appendix B presents our more detailed coding criteria for quality scoring, and Appendix C provides examples of submissions that received each quality score from five down to two. Figure 3 provides the distribution of scores for initial reviews by experts in dark grey and for the revised submissions after review and editing by our research team in light grey (conditional on submission). Roughly 24% of expert reviews merited the highest two quality ratings because they either "very clearly addressed all elements in at least one claim and at least some elements of other claims" or "drew many connections between claims and reference(s), but it is less obvious how or there are claim elements not present in the reference(s)." In contrast, 76% of expert reviews were considered only "modestly" or "slightly" relevant. After revision by our research team for submission to the USPTO, the percentage of reviews meriting the highest two quality ratings rose from 24% to 60%. This ex post quality assessment illustrates (a) the large gap between expert reviews and what the patent process demands, and (b) the substantial value added by our research team's resource-intensive revision process. Yet even with these revisions, examiners ultimately both searched and cited the prior art we offered for only 11% of treatment applications for which we submitted prior art.
In short, while our intervention provided benefits to the patent process, our results also suggest important considerations in the sustainability of peer review in law and government.

V. Discussion
These results underscore the longstanding problem of translating scientific expertise into law and policy (Snow, 1959;Jasanoff, 1998). Based on this experience with patent peer review, we offer several implications for the patent system specifically and peer review in government more generally.
First, greater infusion of scientific expertise may indeed improve patent examination, enabling examiners to draw more reliably on non-patent literature and to identify grounds for rejection.
Our intervention increased the rate of citation to non-patent literature by over sixty percent and decreased the rate of first-action allowance by nearly thirty percent. In that sense, our findings corroborate the benefits many have speculated about for the role of peer review in government (Noah, 2000; Ruhl and Salzman, 2006; Shapiro and Guston, 2006).[7] But if the USPTO or other agencies are interested in piloting this intervention on a larger scale, they will need to find ways to reduce the costs involved and allow for more effective engagement by non-lawyers. For instance, the statement of relevance may be too difficult for non-patent experts to complete. And the third-party submission system does not allow for input unrelated to prior art, including the adequacy of the disclosure, where expert input may be particularly valuable (Ouellette, 2012, 2016). Additionally, incentivizing external reviewers would be a key challenge, as illustrated by our high drop-out rate and by comments from some reviewers indicating that they would want a high hourly rate.[8] Other patent-system actors may ultimately prove more effective at bridging the gap between science and law.[9]

Second, external scientific input may shed new light on the patent quality debate. The existing literature often looks to grant rates of parallel patent applications in foreign patent offices as a quality metric (de Rassenfosse et al., 2019; Frakes and Wasserman, 2017), although variation in claim scope makes interpretation difficult. The fact that eighty percent of our experts submitting reviews did not think a patent should issue may suggest that the USPTO's current patentability standards are laxer than the scientific community's. That said, given experts' limited understanding of claims, we caution against placing too much weight on this evidence.

[7] Our intervention may have had additional benefits beyond those we measured. For example, if examiners rely on our submitted prior art when evaluating other patent applications related to similar technologies, this spillover would not be captured by our outcome variables.

[8] We are unsure whether external reviewers would be more or less motivated to assist with an official USPTO patent peer review program than with our academic experiment. On the one hand, a government-sanctioned program might seem more important and prestigious; on the other hand, reviewers may have been more likely to agree to a peer review request from fellow academics.

[9] For example, future work might explore reviews by patent agents with advanced science degrees, such as those on our research team, or university technology transfer administrators with science backgrounds. An alternative institutional design would be to create an expert scientific review panel employed full-time within the USPTO. The greater legal knowledge of these reviewers would come at the cost of less specialized and cutting-edge scientific expertise, but such a panel may be more effective overall as panelists could learn over time.

Third, our findings corroborate recommendations that the USPTO build out better search systems for non-patent literature (Government Accountability Office, 2016). Innovation often occurs outside the patent system, so the ability to locate novel scientific findings that are disclosed in scientific journal articles and other non-patent publications is critical for making an accurate determination of novelty. This challenge of unrepresentative inputs into agency processes extends beyond the USPTO's informal adjudication to the rulemaking process generally (Farina and Newhart, 2013; Yackee and Yackee, 2006).
Fourth, critics of expanded reliance on peer review in government may be well founded in focusing on cost-effectiveness. External peer review may improve patent examination, but it may not be more cost-effective than increased examination time (Frakes and Wasserman, 2017) or alternative peer review designs such as internal peer review (Ho, 2017). Artificial intelligence may also decrease prior art search costs (Engstrom et al., 2020; Helmers et al., 2019).[10] The resources required to match experts with applications and to adapt referee reports (Figure 3) demonstrate the cost and the need for internal institutions (akin to a strong editorial board) to make peer review work. Even though our intervention had a substantial effect on rates of searching and citing non-patent literature and on the likelihood of initially granting the application, policymakers considering peer review need to weigh these benefits against the substantial time expended by experts and the resources required to manage such a system.
Finally, because of the dearth of robust empirical insights into peer review (Ho and Sherman, 2017), more interventions need to be constructed to facilitate rigorous evaluation (Abramowicz et al., 2010;Chien, 2018;Greenstone, 2009;Ouellette, 2015). Our study illustrates how to conduct such a rigorous pilot, but much more work is required to provide a solid evidence base for whether and how to expand peer review in the public sector-including to test the ideas generated by our results. We hope that leading patent offices and government agencies will increasingly partner with academic teams to pilot, design, and evaluate these promising interventions.
In sum, our results demonstrate that peer review does indeed appear to help the informal adjudication of the patent system, providing a critical real-world confirmation of what has so far been a plausible but unsupported claim. Our study thus critically reframes the debate on patent examination and informs the broader debate about institutions to improve scientific judgment in government.

Appendix: Invitation Email to Expert Reviewers

Thanks again for being willing to review a patent application! We've spent time with a team of graduate and law students to identify a suitable application based on your research background, which I'm linking to in two formats so you can choose the one you prefer. The part that matters legally is the numbered "claims," so you should focus on them in deciding whether you can provide feedback on this application.

Downloadable PDF from USPTO (claims at end): [URL]
Google Patents version (claims in righthand column): [URL]

This application has a "priority date" of [DATE], which means that in general, only documents from before this date are relevant for evaluating which claims are patentable. Such documents, known as "prior art," include anything that was available to the relevant public, even if only briefly (like a conference poster) and even if quite obscure (like a dissertation in one library).
Can you identify up to three pieces of prior art that you think are most closely related to the invention? It is most useful to examiners if you can explain how particular elements in one of the numbered claims are described on specific pages of your reference, and we've created a review form to help you do this:

Review submission form: [URL for Qualtrics Review Submission Form]
At core, this is similar to peer review for scientific journals, where you might identify prior publications that undermine the novelty of a submission.

Expert Reviewer FAQs

Q. What will be submitted to the patent examiner?
A. To help patent examiners understand whether a patent application is actually an improvement over what has already been done, we can submit up to three "prior art" references from before the "priority date" (first filing date) of the application, along with a concise description of the relevance of each reference. We cannot submit opinions on whether the claims are patentable. At core, this is similar to a peer review for a scientific journal in which you identify prior publications that undermine the novelty of a submission, while leaving the ultimate decision of whether the submission merits publication up to the editor.

Q. What counts as "prior art"?
A. The definition of prior art is extremely broad, including both documents and actual uses of an invention from anywhere in the world, and including references that are rather private, temporary, or obscure. (Publications by the inventors themselves do not count if within one year of the priority date.) Your input is particularly valuable for identifying non-patent prior art, such as scientific publications, conference proceedings and posters, websites, and live demonstrations. It will help the examiner if you can point out the prior art references that are most closely related to the claimed invention. If you are unsure whether something counts, please ask.
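The date rules above can be summarized in a short sketch. This is a deliberate simplification of the legal standard (real prior-art analysis involves many more conditions), and the helper function name and one-year grace-period encoding are our own illustrative assumptions, not legal advice.

```python
# Toy illustration of the prior-art date rules described in this FAQ:
# a reference qualifies only if it predates the priority date, and the
# inventors' own disclosures within the one-year grace period do not count.
# Simplified for illustration; not a complete statement of the law.
from datetime import date, timedelta

def qualifies_as_prior_art(pub_date, priority_date, by_inventors=False):
    if pub_date >= priority_date:
        return False  # only disclosures before the priority date are prior art
    if by_inventors and priority_date - pub_date <= timedelta(days=365):
        return False  # inventors' own grace-period publications excluded
    return True

priority = date(2016, 3, 1)
print(qualifies_as_prior_art(date(2015, 1, 15), priority))                     # earlier third-party paper
print(qualifies_as_prior_art(date(2015, 9, 1), priority, by_inventors=True))   # inventors' own, in grace period
print(qualifies_as_prior_art(date(2016, 6, 1), priority))                      # published after priority date
```

Only the first case qualifies: the reference both predates the priority date and is not an inventor disclosure within the grace period.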

Q. What should be included in the "concise description of relevance"?
A. You should explain how each piece of prior art is related to the invention. It is most useful to patent examiners if you can show how particular elements of a numbered claim (from the end of the patent) are described on specific pages or lines of the prior art reference. To make this easier, we have created a web form that breaks claim 1 of your patent into different elements so that you can specify which elements are present in the prior art you have identified. You are welcome to add additional paragraphs on the other claims.
If you find a prior art reference with all the elements of a claim, the claim will be rejected as not "novel." If your prior art references are similar but not identical to the claim, the claim might also be unpatentable if the invention seems "obvious" to someone in that research field. But we cannot submit your view on whether the claims are novel or nonobvious-only a factual description of each document's relevance.

Q. I don't understand the numbered claims at the end of the patent. Can't I just read the abstract to get a sense of what the invention is?
A. We agree that claims are often confusing legalese, but they are the part of the patent that matters from a legal perspective. The idea is to make it relatively easy (at least in theory) for a patent examiner or someone accused of infringing the patent to determine what the actual elements of the invention are. For example, a claim to a pencil might read as follows:

1. A writing device comprising: (a) a wooden cylinder with a hollow core; (b) said hollow core containing graphite; (c) eraser material attached to one end of said wooden cylinder.
The claim has a preamble ("A writing device"), a transition ("comprising"), and a list of three elements. If a single prior art reference has all three of these elements, the claim is not novel. If one prior art reference has two elements (e.g., a wooden cylinder with a core containing graphite) and another has the third (the eraser material), the examiner may still reject the claim as obvious.
Patents can also claim methods, such as "2. A method of writing comprising the steps of…," which are evaluated the same way. Claims can also be "dependent" on earlier claims, such as "3. The writing device of claim 1, in which the wooden cylinder is at least 30 cm long." Patentees seek claims of varying breadth, so you may think some claims are patentable while others are attempting to claim too much. If you have trouble understanding the claims even after reading the body of the patent, let us know.
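The element-matching logic in the pencil example can be sketched by treating a claim as a set of elements. This toy model (our own illustration, with made-up element names) captures only the mechanical part of the analysis: real anticipation and obviousness determinations require legal judgment, not just set containment.

```python
# Toy model of the pencil example: a claim is a set of elements. A single
# reference containing every element defeats novelty (anticipation); a small
# combination of references covering all elements is a rough proxy for the
# kind of obviousness rejection described above. Illustration only.
from itertools import combinations

claim_1 = {"wooden cylinder with hollow core", "graphite core", "eraser"}

ref_a = {"wooden cylinder with hollow core", "graphite core"}  # eraserless pencil
ref_b = {"eraser"}                                             # separate eraser reference

def anticipated(claim, references):
    """A claim lacks novelty if one reference discloses every element."""
    return any(claim <= ref for ref in references)

def combination_covers(claim, references, max_refs=2):
    """Rough obviousness proxy: a few references together supply all elements."""
    for k in range(2, max_refs + 1):
        for combo in combinations(references, k):
            if claim <= set().union(*combo):
                return True
    return False

refs = [ref_a, ref_b]
print(anticipated(claim_1, refs))          # False: no single reference has all elements
print(combination_covers(claim_1, refs))   # True: ref_a and ref_b together cover the claim
```

A dependent claim would simply add elements to the set, which is why it can survive even when the broader independent claim does not: the larger set is harder for any reference (or combination) to cover.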

Q. What if I think there is not enough information in the patent to make the claimed invention, or that the invention is impossible?
A. If there are significant technical hurdles to making the claimed invention that the application does not explain how to overcome, we may be able to alert the patent examiner to these concerns. For example, if the application claims an algorithm with a general optimized solution to the traveling salesman problem without providing details of the algorithm, we can submit a prior art reference explaining that this is an NP-hard problem that remains one of the most intensively studied problems in optimization. If you are unsure whether your application raises this kind of concern, please ask.

Q. What if I think all the claims in this application should be granted?
A. That's great! Unfortunately, we cannot give the patent examiner your conclusion about the merits of the application; we can only send the prior art references you think are most relevant. It may still be helpful to send the most closely related references so that the examiner can see what a leap this is from what has been done before.

Q. This document didn't answer my questions. What should I do?
A. Email Lisa Ouellette at ouellette@law.stanford.edu. We are trying to make it as easy as possible for experts like you to share your expertise with patent examiners, but this FAQ list is a work in progress; we welcome suggestions for improvement.

Relevance of Submitted References

Claim element: [preamble]
Relevance: To the extent the preamble constitutes a claim limitation, Orghidan generally discloses a "catadioptric stereo based on structured light projection."

Claim element: a catadioptric projector configured to produce patterned illumination, wherein the catadioptric projector includes:
Relevance: Orghidan generally discloses catadioptrics used to create patterned illumination, such as the disclosed checkered and dotted patterns. See, e.g., Orghidan, Part 3.4.2.

Claim element: a radiation source;
Relevance: Orghidan discloses patterned illumination catadioptric systems that use light sources, and even specifically devotes a section to light sources and patterned light. See Orghidan, Part 1.3.

Claim element: a static pattern generating element configured to condition radiation from the radiation source to produce the patterned illumination; and
Relevance: Orghidan discusses using static pattern generating elements to produce patterned illumination, such as the disclosed dotted and checkered patterns. See, e.g., Orghidan, Part 3.4.2 and FIGS. 3.11-3.17.

Claim element: a convex reflector positioned to project the patterned illumination; and
Relevance: Orghidan discusses the use of convex reflectors (mirrors) for use in catadioptrics. See, e.g., Orghidan, Parts 2.4-2.5.

Claim element: a camera configured to acquire one or more images based on the patterned illumination.
Relevance: Orghidan discusses cameras configured to acquire images based on the patterned illumination. See, e.g., Orghidan, Chapter 2 and Chapter 3, especially Part 3.3.1; see also Chapter 4.1 (finding "images acquired with a catadioptric camera appear distorted").

Relevance of Submitted References
Claim element: A method comprising: accessing anatomical data corresponding to a three-dimensional image of a physical structure;
Relevance: The general idea of using virtual or augmented reality for enhanced exploration of MR or CT images (such as for medical training) has been around for years. For example, Gibson et al. from 1997 describes an approach for simulating knee surgery that combines imaging, augmented reality, and haptic feedback, though the haptic feedback uses a commercial desktop force/motion display instead of a glove, and a computer monitor instead of goggles. The 2004 '903 patent mentions haptic glove interaction for medical diagnostic display interaction (labeled as 58 in Fig. 2 and described in the specification). A more recent review with similar content is Coles et al. from 2011.

Claim element: causing display of the three-dimensional image of the physical structure using a wearable visualization device based on the anatomical data;
Relevance: Wearable visualization devices have been used in various efforts to combine imaging and vision; see, e.g., the "head tracked stereoscopic viewing system" on p. 57, col. 2 of Coles et al., or the "3D goggles" in claim 22 of the '903 patent.

Claim element: monitoring a position of a body member of a user of the wearable visualization device relative to a corresponding location on the physical structure displayed to the user in the three-dimensional image, the position of the body member being associated with a haptic device;
Relevance: All three submitted references describe tracking the position and orientation of the hand of the user, which is necessary to allow the user to use their hand to interact with the augmented reality data (e.g., from MR or CT scans).

Claim element: accessing density data corresponding to the location on the physical structure; identifying haptic feedback data corresponding to the density data; and
Relevance: Any effort that involves haptic feedback will produce stimuli based on density, stiffness, etc. E.g., Coles et al. p. 58 ("This simulator calculates needle tip resistance using CT density data. . . ").

Claim element: causing the haptic device to provide haptic feedback corresponding to the location on the physical structure.
Relevance: The haptic feedback must necessarily correspond with where the user's hand is located virtually with respect to the acquired imaging data, as described in all of these references.