Data resource profile: JMDC claims databases sourced from Medical Institutions

Abstract JMDC, Inc. (JMDC) has created a database, using data collected from medical institutions in Japan, consisting of claims (for hospitalization and outpatient treatment), diagnosis procedure combination (DPC) assessment forms, and clinical laboratory test values. The oldest accessible data in this database relate to treatment in April 2014. As of the end of October 2019, the database covers 218 medical institutions, consisting of 131 DPC-eligible hospitals and 87 DPC-ineligible hospitals. Using this database, it is possible to carry out an analysis that makes up for certain limitations of JMDC's other database of data from health insurance societies (eg, the disease status and test results cannot be ascertained, and there is insufficient access to data for elderly people). In addition, it is noteworthy that this database includes not only data from DPC-eligible hospitals but also data from some DPC-ineligible hospitals.


| DATA RESOURCE BASICS
Japan has health insurance provided by the social insurance system [1] (referred to below simply as "health insurance"). When a citizen is examined and/or treated at a medical institution, he/she presents to that institution a health insurance certificate containing information identifying him/herself and his/her insurance provider, and is then responsible for only a fixed proportion of the medical expenses. The medical institution invoices the insurance provider for the remainder of the sum, via a claims processing and payment organization, on the basis of information in the health insurance certificate. Health insurance is provided by multiple bodies, on the basis of occupation, geography, and age (eg, elderly and geriatric), as follows [2]:
• People aged 75 or older can join the medical care system for the late elderly, administered by local governments.
• People less than 75 years old, primarily those who are employees of small, medium-sized, or long-established businesses, and/or are in temporary employment, and their dependents, are covered by health insurance administered by the Japan Health Insurance Association, except for employment-related injury.
• People less than 75 years old, primarily those who are employees of large businesses, and their dependents, are covered by health insurance administered by health insurance societies, except for employment-related injury.
• Public-sector employees and private school teachers/staff who are less than 75 years old are covered by health insurance administered by the National Public Service Mutual Aid Association and the Promotion and Mutual Aid Corporation for Private Schools of Japan.
• Seamen less than 75 years old, and their dependents, are covered by seamen's insurance administered by the Japan Health Insurance Association, except for employment-related injury.
• People, primarily nonmanual workers, who are less than 75 years old are covered by national health insurance administered by local governments.
JMDC, Inc. (JMDC) has created a database, detailed below, using data collected from medical institutions.
1. JMDC has established this database using data collected from multiple Japanese medical institutions.
2. JMDC provides services of aggregating and analyzing clinical quality indicators to medical institutions distributed throughout Japan. Some of these institutions agree to provide to third parties anonymously processed data that do not include area information. Data are collected from these medical institutions.
3. Data are collected monthly, and the collected data are added to the database approximately 1 month after the treatment date. Data are currently still being added to the database.
4. In order to protect personal information, data from medical institutions are collected as information that has been anonymized in accordance with Clause 2:9 of the Law for the Protection of Personal Information, on the basis of a personal ID for each medical institution with which individuals cannot be identified. Any information that can identify individuals is anonymized by the medical institution, in such a way that, when the same individual is examined or treated at the same medical institution, he/she can be distinguished on the basis of the anonymized patient ID.
Research carried out using this database includes the following:
1. Factor analysis of outcome [3][4]
2. Patient characteristics [5][6]
3. Validations of indices [7]
FIGURE 1 Numbers of patients whose data can be accessed each month
Nonstandardized data are standardized using a dictionary. 4 8 This is done because the terms used differ among medical institutions. For example, "type-2 diabetes" may be entered as "diabetes II," "non-insulin dependent diabetes," "NIDDM," or "adult-onset diabetes." To overcome this problem, JMDC uses a computer-based, retrospective standardization method: terms are entered from the claims without modification, after which they are standardized using the dictionary developed by JMDC. This method resolves problems such as the following, which occur when terms are standardized manually at the entry stage:
1. The rules and knowledge used for manual entry must be shared among personnel, and training those personnel therefore incurs financial costs.
2. If terms are not reassessed once they have been entered, the latest version, based on master data updates, will not be reflected, and errors will be introduced into the time-series analysis.
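The retrospective standardization described above can be illustrated with a minimal Python sketch. It is not JMDC's implementation: the dictionary contents, the choice of "type-2 diabetes" as the standard label, the `raw_term` field name, and the helper names are all assumptions made for illustration, using only the variant spellings given in the text.

```python
from __future__ import annotations

import re

# Hypothetical dictionary: normalized raw term -> standard term.
# The variants are the examples from the text; the real JMDC dictionary
# and its standard labels are not described in this article.
STANDARD_TERMS = {
    "type-2 diabetes": "type-2 diabetes",
    "diabetes ii": "type-2 diabetes",
    "non-insulin dependent diabetes": "type-2 diabetes",
    "niddm": "type-2 diabetes",
    "adult-onset diabetes": "type-2 diabetes",
}


def normalize(raw_term: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore trivial differences."""
    return re.sub(r"\s+", " ", raw_term.strip().lower())


def standardize(raw_term: str, dictionary: dict[str, str] = STANDARD_TERMS) -> str | None:
    """Return the standard term, or None when the dictionary has no entry."""
    return dictionary.get(normalize(raw_term))


def standardize_records(records: list[dict]) -> list[dict]:
    """Keep the raw term untouched and attach the standardized term beside it."""
    return [{**rec, "standard_term": standardize(rec["raw_term"])} for rec in records]
```

Because the raw term is stored unmodified, `standardize_records` can be rerun over old records whenever the dictionary is updated, which is the property that keeps a time series consistent in this approach.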
Claims data and DPC assessment form data entered in the database are assigned master data for classification in order to increase the effectiveness and precision of analysis. These master data for classification include the following:

| DATA RESOURCE USE
The following analyses can be carried out using this database:
1. Detailed information about cancer and other important diseases is recorded on the DPC assessment forms. Therefore, it is possible to analyze medical care activities and dosage according to disease state and severity, or, conversely, disease state and severity according to medical care activities and dosage. In addition, using clinical laboratory test value data, it is possible to evaluate treatment outcomes in relation to dosage and medical care activities; for example, the blood glucose control status of patients hospitalized for diabetes can be determined.
2. Unlike JMDC's other database of data from health insurance societies, this database includes elderly people, and it is therefore possible to calculate and analyze the medical expenses and numbers of days with hospital visits or admissions for diseases that frequently affect elderly people.

| STRENGTHS AND WEAKNESSES
This database consists of structured data in which DPC assessment forms are created from the electronic medical records that doctors write from a clinical point of view. It is useful for medical staff seeking to improve management, the quality of medical care, and clinical paths, and also for researchers seeking to derive new findings from case comparisons of diseases, because it is possible to
TABLE 1 Database of data from medical institutions

No information about the population is available.
In order to quantify the above characteristics, comparisons   Figure 2, and the comparison results for each chapter of ICD-10 are shown in Figure 3.
Although clinics have far fewer beds than hospitals, the number of outpatients at clinics is expected to be quite large. Scaling up by the relative number of beds is therefore not considered appropriate for the number of outpatients at clinics. Instead, the decomposition of the publicly available total outpatient claims into hospitals and clinics was compared between the publicly available numbers and JMDC numbers calculated as follows. The JMDC number of outpatients at hospitals is the number of outpatients at hospitals in this database scaled up to the total number of hospitals using the same relative number of hospital beds, and the JMDC number of outpatients at clinics is the remainder.
Consequently, the combined total for hospitals and clinics equals the publicly available figure. The comparison results of the decomposition are shown in Figure 4.
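The calculation above can be sketched in Python as follows. This is one reading of the description, assuming that "scaled up using the relative number of hospital beds" means multiplying by the ratio of all hospital beds to the hospital beds covered by the database. The function name, parameter names, and the numbers in the usage note are hypothetical and are not taken from the article or from any published statistics.

```python
def decompose_outpatients(
    db_hospital_outpatients: float,
    db_hospital_beds: float,
    total_hospital_beds: float,
    public_total_outpatients: float,
) -> tuple[float, float]:
    """Split a public outpatient total into hospital and clinic estimates.

    Hospitals: database count scaled by the bed ratio (assumed reading).
    Clinics: whatever remains of the publicly available total.
    """
    if db_hospital_beds <= 0:
        raise ValueError("db_hospital_beds must be positive")
    scale = total_hospital_beds / db_hospital_beds
    hospital_estimate = db_hospital_outpatients * scale
    clinic_estimate = public_total_outpatients - hospital_estimate
    return hospital_estimate, clinic_estimate
```

With made-up inputs, `decompose_outpatients(1000, 50, 500, 30000)` gives a hospital estimate of 10000.0 and a clinic estimate of 20000.0. By construction the two estimates always sum to the public total, so only the split between hospitals and clinics can differ from the published decomposition.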

| PROFILE IN A NUTSHELL
• JMDC has created a database from data on medical expenses, etc., collected from medical institutions.
• The earliest data that can be accessed are for April 2014. The number of persons whose data can be accessed each month has tended to increase since the database was started. As of the end of October 2019, the number of patients included in the database of data from medical institutions, including those withdrawn at some point, is approximately 9.4 million.
• In principle, all data collected from medical institutions are included in the database. Nonstandardized data are standardized using a dictionary, and the database permits data to be followed in a time series on the basis of anonymized personal IDs.
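Following data in a time series by anonymized ID can be sketched in Python as below. The article states that an anonymized patient ID distinguishes an individual within the same medical institution, so this sketch keys each timeline on the pair of institution and patient ID. The field names `institution_id`, `patient_id`, and `treatment_month` are hypothetical and do not reflect the actual database schema.

```python
from collections import defaultdict


def build_patient_timelines(records: list[dict]) -> dict[tuple, list[dict]]:
    """Group records per (institution, anonymized patient ID), oldest first.

    treatment_month is assumed to be a "YYYY-MM" string, which sorts
    chronologically when compared as text.
    """
    timelines: dict[tuple, list[dict]] = defaultdict(list)
    for rec in records:
        key = (rec["institution_id"], rec["patient_id"])
        timelines[key].append(rec)
    for recs in timelines.values():
        recs.sort(key=lambda r: r["treatment_month"])
    return dict(timelines)
```

Keying on the pair rather than on the patient ID alone avoids merging two different people who happen to receive the same ID at different institutions.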

| CONCLUSION
Using the database of data from medical institutions, it is possible to carry out an analysis that makes up for certain limitations of JMDC's other database of data from health insurance societies (eg, the disease status and test results cannot be ascertained, and there is insufficient access to data for elderly people). In addition, it is noteworthy that the database of data from medical institutions includes not only data from DPC-eligible hospitals but also data from some DPC-ineligible hospitals. Use of this database is on a fee-paying basis. Owing to these characteristics, it is provided for a wide range of purposes.

FIGURE 3 Comparison between scaled-up data from medical institutions and publicly available data (numbers of hospitalized patients for each chapter of ICD-10)

FIGURE 4 Comparison of the decomposition to hospitals and clinics between data from medical institutions scaled up to the total number of hospitals and publicly available data (numbers of outpatients)

| Medical institutions
This refers to all institutions that provide medical care. They are termed "hospitals" if they have 20 beds or more, and "clinics" if they have fewer than 20 beds.

| Claims
Certificates of payment of medical care fees, released by medical institutions, specifying the medical expenses for which health insurance providers (health insurance societies, etc.) are invoiced. They include information about injuries and diseases, medical care activities, drugs, etc.

| ICD-10
The 10th version of the International Statistical Classification of Diseases and Related Health Problems, defined by the WHO.

| ATC
Anatomical Therapeutic Chemical classification systems, defined independently by the WHO and the European Pharmaceutical Market Research Association. Both ATC systems are included in JMDC's master data for classification.

| DPC assessment forms
Forms that the Ministry of Health, Labour and Welfare (MHLW) requests medical institutions to provide for evaluation of the level of effects, in the context of DPC, which is a medical expense calculation system that determines the daily hospitalization expenses for each diagnostic category. Form

| JLAC10
Version 10 of the Japanese Laboratory Code defined by the Japanese Society of Laboratory Medicine.