Validity of claims‐based algorithms for selected cancers in Japan: Results from the VALIDATE‐J study

Abstract

Purpose: Real-world data from large administrative claims databases in Japan have recently become available, but limited evidence exists to support their validity. VALIDATE-J validated claims-based algorithms for selected cancers in Japan.

Methods: VALIDATE-J was a multicenter, cross-sectional, retrospective study. Disease-identifying algorithms were used to identify cancers diagnosed between January or March 2012 and December 2016 using claims data from two hospitals in Japan. Positive predictive values (PPVs), specificity, and sensitivity were calculated for prevalent (regardless of baseline cancer-free period) and incident (12-month cancer-free period; with claims and registry periods in the same month) cases, using hospital cancer registry data as gold standard.

Results: 22 108 cancers were identified in the hospital claims databases. PPVs (number of registry cases) for prevalent/incident cases were: any malignancy 79.0% (25 934)/73.1% (18 119); colorectal 84.4% (3519)/65.6% (2340); gastric 87.4% (3534)/76.8% (2279); lung 88.1% (2066)/79.9% (1636); breast 86.4% (4959)/59.9% (3185); pancreatic 87.1% (582)/80.4% (508); melanoma 48.7% (46)/42.9% (36); and lymphoma 83.6% (1457)/77.8% (1035). Specificity ranged from 98.3% to 100% (prevalent)/99.5% to 100% (incident); sensitivity ranged from 39.1% to 67.6% (prevalent)/12.5% to 31.4% (incident). PPVs of claims-based algorithms for several cancers in patients ≥66 years of age were slightly higher than those in a US Medicare population.

Conclusions: VALIDATE-J demonstrated high specificity and modest-to-moderate sensitivity for claims-based algorithms of most malignancies using Japanese claims data. Use of claims-based algorithms will enable identification of patient populations from claims databases, while avoiding direct patient identification. Further research is needed to confirm the generalizability of our results and applicability to specific subgroups of patient populations.


KEYWORDS
cancer, claims-based algorithms, Japan, positive predictive value, sensitivity, specificity, validation

| INTRODUCTION
The incidence of many primary cancers is increasing in Japan, with colorectal, lung, and gastric cancers as the most frequently reported cancer types in 2018. 1 Cancer remains a leading cause of mortality in Japan. 2 Real-world data are essential for understanding cancer epidemiology, both in the general population and as part of drug safety surveillance. Administrative healthcare claims databases provide a valuable source of longitudinal, real-world data for pharmacoepidemiology, comparative effectiveness research, and health services/outcome research. Claims-based definitions for cancers have been developed and validated in the USA and EU for the identification of incident cases of breast, lung, gastric, colorectal, and hematologic cancers in hospital or commercial administrative databases 3-6 and, in the USA, for determining the incidence of cancer among patients with inflammatory diseases receiving tumor necrosis factor inhibitors. 7,8 Multiple claims databases, such as the National Database of Health Insurance Claims and Specific Health Checkups of Japan, the Japan Medical Data Center, and the Medical Data Vision, are now available to academic and industry researchers in Japan. The Pharmaceuticals and Medical Devices Agency encourages the use of claims-based pharmacoepidemiology research for drug safety surveillance in Japan, and requires validation studies to support the credibility of claims database research for post-marketing surveillance. 9 However, validation studies of Japanese claims data are still limited, as evidenced by a review of claims-based validation studies in the Asia-Pacific region 10,11 and other published studies. 12-15 Given the unique features of claims data and the clinical practice environment in Japan, validated claims-based algorithms developed in other regions (eg, USA or Europe) are unlikely to be relevant to claims database research in Japan.
The Validity of Algorithms in Large Databases: Infectious Diseases, Rheumatoid Arthritis, and Tumor Evaluation in Japan (VALIDATE-J) study investigated the validity of several predefined disease-identifying algorithms using hospital claims data in Japan.
Here, we report the positive predictive value (PPV), specificity, and sensitivity of claims-based algorithms for selected malignancies (any malignancy, colorectal, gastric, lung, breast, pancreatic, melanoma, and lymphoma) from VALIDATE-J.

Key Points
• Real-world data from healthcare claims databases are available in Japan, but few validation studies exist to support the validity of these data.
• In VALIDATE-J, a multicenter validation study in Japan, algorithms were developed from institutional claims data and validated against hospital-based cancer registry data.
• Positive predictive value (PPV), specificity, and sensitivity were computed for incident and prevalent cancers; PPVs for several cancers were higher than those reported in the USA.
• One of the first and largest validation studies in Japan, VALIDATE-J will inform future claims-based research and serve as a model for validation studies for claims-based post-marketing studies in Japan.

| METHODS

| Study design
VALIDATE-J was a cross-sectional, retrospective study of claims data, medical records, and cancer registry data from two general acute-care hospitals in Chiba, Japan, conducted between December 2017 and February 2019. Hospital A is a large 917-bed private teaching and cancer care hospital designated by the national government, located in a rural area; Hospital B is a large 716-bed community teaching and cancer care hospital designated by the local government, located in a city area. The overall study objectives were to validate claims-based algorithms for rheumatoid arthritis (RA), infectious diseases, and malignancies. Data for RA and infectious diseases will be reported elsewhere. An overview of the study is shown in Figure 1.
Prior to study initiation, claims-based algorithms for any malignancy, colorectal, gastric, lung, breast, and pancreatic cancers, melanoma, and lymphoma were developed and modified by a steering committee of experts in oncology, Japanese cancer registries, and epidemiology (Table 1). The algorithms were based, in part, on previously tested definitions, 6 Table S1.

An Independent Ethics Committee and the Institutional Review Board at each participating hospital approved the study protocol. The study was conducted in accordance with accepted practices for pharmacoepidemiology studies issued by the International Society for Pharmacoepidemiology and the Council for International Organizations of Medical Sciences. Patients identified in the claims databases were not required to provide consent and could opt out from participating in the study.

| Study cohort
The claims-based cohort included patients treated as outpatients or inpatients at either hospital between January 01, 2012 (Hospital A) or March 01, 2012 (Hospital B) and December 31, 2016, who met the prespecified claims-based criteria for each selected malignancy type (Table 1). Cases were defined as "prevalent", that is, regardless of baseline cancer-free period, or "incident", that is, cases with a 12-month cancer-free period prior to case ascertainment (primary algorithm of incident cases). The ICD-10 diagnosis codes (Table S1)  performed, the results of validity measures in a subset of patients ≥66 years of age were compared to those in a validation study of US patients ≥65 years of age identified from the Medicare/Pennsylvania Assistance Contract for the Elderly program data linked with the state cancer registry (1997-2000). 6 State cancer registry data were used as gold standard for validation of the US data.
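The distinction between prevalent and incident cases can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names, the month-level comparison, and the decision to ignore earlier claims in the same calendar month are assumptions made for illustration, and the sketch does not check that the patient was observable for the whole cancer-free window.

```python
from datetime import date


def month_index(d: date) -> int:
    """Map a date to a running month count so month gaps are plain integers."""
    return d.year * 12 + d.month


def is_incident(case_date, prior_cancer_claim_dates, washout_months=12):
    """Return True if no cancer claim falls in the washout window before the case month.

    Assumption (hypothetical rule): claims in the same calendar month as the case
    belong to the same episode and do not break the cancer-free period.
    """
    case_m = month_index(case_date)
    for d in prior_cancer_claim_dates:
        gap = case_m - month_index(d)
        if 1 <= gap <= washout_months:
            return False  # a cancer claim inside the window: prevalent only
    return True
```

Every case meeting the algorithm counts as prevalent; only those passing this check would also count as incident. Setting `washout_months=6` corresponds to the shorter cancer-free period used in the sensitivity analysis.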

| Validity measures
PPV, specificity, and sensitivity of claims-based algorithms for each malignancy type were calculated using the registry-based gold-standard diagnosis, based on prevalent and incident cases, with claim and registry periods within the same month.

| Statistical analysis
Demographics and disease characteristics were summarized using descriptive statistics (means and standard deviations for continuous variables, and percentages and counts for dichotomous variables).
PPV for each claims-based algorithm was calculated as the number of cases meeting the claims-based algorithm that were confirmed in the cancer registry (ie, true positives) divided by the total number of cases meeting the claims-based algorithm (ie, true and false positives) (Table S2).
Specificity was calculated as the number of cases that did not meet the claims-based algorithm and were not found in the registry (ie, true negatives) divided by the subset of cases from both hospitals that were not in the linked cancer registry, regardless of whether they met the claims-based algorithms (ie, true negative and false positive cases) (Table S2). Sensitivity was calculated as the number of true positive cases divided by the total number of confirmed cases in the linked hospital cancer registry (ie, true positive and false negative cases) (Table S2). As a sensitivity analysis, incident cases were also measured with a 6-month cancer-free period prior to case ascertainment (in addition to a 12-month cancer-free period).
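The three definitions above reduce to ratios of cells in a 2×2 table of algorithm result against registry status. A minimal Python sketch follows; the function name and the counts in the usage note are hypothetical and are not study data.

```python
def validity_measures(tp, fp, fn, tn):
    """Compute PPV, sensitivity, and specificity from 2x2 table counts.

    tp: met the claims-based algorithm and confirmed in the registry
    fp: met the claims-based algorithm but not in the registry
    fn: in the registry but did not meet the claims-based algorithm
    tn: neither met the algorithm nor appeared in the registry
    """
    return {
        "ppv": tp / (tp + fp),          # confirmed / all algorithm-positive
        "sensitivity": tp / (tp + fn),  # confirmed / all registry cases
        "specificity": tn / (tn + fp),  # true negatives / all non-registry
    }
```

For example, with illustrative counts `tp=80, fp=20, fn=120, tn=9780`, the function returns a PPV of 0.80, a sensitivity of 0.40, and a specificity of about 0.998.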
Calculations of specificity and sensitivity were based on the following assumptions: case identification in the cancer registry was close to 100%, and all the data from the hospital cancer registries could be linked to hospital claims data. 95% confidence intervals (CI) for PPV, specificity, and sensitivity were calculated using the normal approximation of the binomial distribution. Deidentified data were analyzed using Python version 3.6.0 (2016).
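The normal approximation of the binomial distribution gives an interval of the form p ± z·sqrt(p(1 − p)/n). A minimal Python sketch is shown below; the function name, the default z = 1.96 for a 95% interval, and the clipping of bounds to [0, 1] are assumptions for illustration and are not taken from the study code.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Bounds are clipped to [0, 1]; the clipping is an illustrative choice.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)
```

Applied to a PPV, `successes` is the number of true positives and `total` the number of algorithm-positive cases. This approximation is known to behave poorly for proportions very close to 0 or 1, where the interval collapses toward zero width.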

| RESULTS

| Patients
During the ascertainment period (2012-2016), a total of 25 934 cases of malignancies specified in this study were recorded in the hospital cancer registries.
Demographics and disease characteristics for cases identified in the hospital cancer registries are shown in Table 2 and Table S3 (data for individual hospitals are shown in Tables S4 and S5). Mean age was 64.8 years for patients with any malignancy and ≥65 years for colorectal, gastric, lung, and pancreatic cancers, melanoma, and lymphoma; for breast cancer, mean age was 55.6 years and 65% of patients were 40-64 years of age. Approximately half of the malignancies were in females, except for breast cancers, which were all in females for this analysis, and colorectal, gastric, and lung cancers, for which most cases were in males.
Histology from biopsy or resection specimen was the most common diagnostic method for all cancer types and was used for >92% of all malignancies. In situ cancer was the most common type of breast cancer (21%) identified, and localized cancer was most common among gastric cancers, accounting for 66% of cases. A numerically higher proportion of patients with lung cancer, pancreatic cancer, or lymphoma had Stage IV disease compared with other cancer types. Approximately one-third of all cases (any malignancy) had been treated with surgery or chemotherapy.

| Claims-based cases
A total of 22 108 prevalent cases of malignancies were identified in the hospital claims databases using the prespecified claims-based algorithms (Table 3). For both claims-based and registry cases, the most common malignancy was breast cancer (claims-based, n = 3880; registry-based, n = 4959) and the least common was melanoma (claims-based, n = 37; registry-based, n = 46).

A list of the ICD-10 diagnosis codes used is provided in Table S1. ICD-O-3 codes were converted to ICD-10 codes according to IACR CanReg Tools v2. 16

TABLE 2 Demographics and disease characteristics of cases from the cancer registries at two hospitals

| Prevalent cases
PPV for prevalent cases was nearly 80% for any malignancy, and was lowest for melanoma and highest for lung cancer (Table 3). Specificity was 98% for any malignancy and nearly 100% for all selected malignancies. Sensitivity was lower than specificity for any malignancy; it was lowest for melanoma and highest for breast cancer (Table 3).

| Incident cases
For incident cases with a 12-month cancer-free period and with claims and registry periods within the same month, PPV for any malignancy was nearly 75%, and was lowest for melanoma and highest for pancreatic cancer (Table 4). Specificity was nearly 100% for any malignancy and for all selected malignancies (Table 4). Sensitivity was substantially lower and was <25% for any malignancy (lowest for breast cancer and highest for lymphoma) (Table 4). Sensitivity analyses using a 6-month cancer-free period vs a 12-month cancer-free period produced similar validity measures (data not shown).
An alternative algorithm for "any malignancy" requiring two cancer diagnoses with the same first three digits of the ICD-10 codes, within the same claim-month or ±1 claim-month, was tested. This algorithm performed sub-optimally in terms of PPV, specificity, and sensitivity, compared with the primary algorithm for "any malignancy" (Table S6).
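The rule behind the alternative algorithm can be sketched in Python as follows. This is an illustration only: the function name, the input layout (one tuple per diagnosis record, with the claim-month encoded as year × 12 + month), and the assumption that the input has already been restricted to malignancy codes are hypothetical, as is treating two records in the same claim-month as two diagnoses.

```python
from collections import defaultdict


def meets_two_diagnosis_rule(diagnoses):
    """Check for two diagnoses sharing a 3-character ICD-10 prefix within ±1 claim-month.

    diagnoses: iterable of (claim_month, icd10_code) tuples, where claim_month is an
    integer month count (year * 12 + month) and codes are malignancy codes only.
    """
    months_by_prefix = defaultdict(list)
    for claim_month, code in diagnoses:
        months_by_prefix[code[:3].upper()].append(claim_month)

    for months in months_by_prefix.values():
        months.sort()
        # After sorting, the closest pair is always adjacent.
        for earlier, later in zip(months, months[1:]):
            if later - earlier <= 1:
                return True
    return False
```

For instance, records coded C18.7 and C18.9 in consecutive claim-months would satisfy the rule, whereas C18.7 and C34.1 in the same month would not, because their three-character prefixes differ.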
TABLE 3 PPV, specificity, and sensitivity of the claims-based algorithms for selected malignancies vs gold standard cancer diagnosis (both hospitals; prevalent cases)
In contrast, PPVs for claims-based algorithms of colorectal and breast cancers were higher for US cases compared with the claims-based cases in Japan (Table S11). The sensitivity measures of the US claims-based definitions were consistently higher than those of the Japanese claims-based algorithms (Table S11). Comparisons of data from the individual hospitals in Japan with data for US claims-based definitions are shown in Tables S12 and S13. PPV and specificity were generally high for prevalent cases, and the algorithms are acceptable for use in a claims-based study.

| DISCUSSION
PPV and sensitivity were somewhat lower for incident cases with a 12-month cancer-free period prior to case ascertainment than for prevalent cases. This may be because the claims-based algorithms required treatment, and patients were not followed up if they were referred to other centers; patients who were diagnosed, but not treated, at the study hospitals may have been referred to another hospital for more specialized treatment and therefore missed. PPV and sensitivity were lower for melanoma than for other tumor types, due to the very low prevalence of melanoma in Japan, and therefore the claims-based algorithms for melanoma used here are not suitable for use in further studies. PPV, specificity, and sensitivity measures for all incident cases of colorectal, gastric, lung, and breast cancers, and lymphoma, were similar to validity measures for these same cancers in older patients (≥66 years).
Compared with data from a US validation study using US Medicare claims data, 6 PPVs for gastric and lung cancers, and lymphoma, were higher using Japanese claims-based algorithms, while PPVs for colorectal and breast cancers were higher using US claims-based definitions. Variations in the clinical staging of these cancers between datasets may account for some of these differences. For example, the incidence of Stage IV breast cancer in the Japanese data was higher than expected. For gastric cancers, it is likely that the tumor type affected the validation measures. Differences between countries in the way in which population-based cancer screening is conducted should also be considered; in Japan, cancer screening is part of the annual health check (which may include chest X-ray and esophagogastroduodenoscopy) that is covered under universal health insurance. Such an approach may increase the number of false positive results, and this may decrease specificity while maintaining sensitivity. Sensitivity was consistently lower when using the Japanese claims-based algorithms, and this may relate, in part, to the cancer screening system adopted in Japan.
In one of the first validation studies of claims-based definitions in validation studies, such as this study, are steadily increasing and will expand our knowledge about the validity of Japanese claims data for research purposes.
In conclusion, VALIDATE-J demonstrates that validation of disease-identifying algorithms for malignancies created specifically for the Japanese clinical practice environment and unique to Japanese claims data is feasible and has high specificity when applied to data from Japanese administrative databases. Data from VALIDATE-J on disease-identifying algorithms for RA and infectious diseases will provide additional information on the utility of claims-based algorithms with Japanese databases. Studies such as VALIDATE-J will provide researchers with much-needed knowledge about the validity of Japanese claims data, and may serve as a model for future validation studies in situations where direct identification of patients from administrative healthcare databases is not possible. As with other geographic regions where claims database research is conducted, validation will continue to be a crucial activity to support the integrity of claims database research in Japan.