Keywords:

  • transients and migrants;
  • minority groups;
  • epidemiologic methods;
  • sensitivity and specificity;
  • Germany;
  • Turkey

Abstract

Migrants often face particular social, economic and health disadvantages relative to the population of the host country. In order to adapt health services to the needs of migrants, health researchers need to identify differences in risk factor and disease profiles, as well as inequalities concerning treatment and prevention. Registries of health-related events could be employed for these purposes. In Germany, however, routine data bases often hold no, or inaccurate, information on the national origin of the cases registered. We developed an algorithm based on a large data set of Turkish family and first names (n = 15 000), with religion as an additional criterion, to identify cases of Turkish origin in registries in a largely automatic search. We tested the performance of the algorithm in a population registry and in a cancer registry. The algorithm discriminates well against Greek and Arab names, with 1% false positive matches in our study. It achieves a specificity of > 99.9% in delimiting Turkish from German cases in the cancer registry. The sensitivity can be increased to 85%, provided the small proportion of case records with uncertain origin can be assessed manually. The name algorithm can be useful for registry-based health research among Turkish migrants in Germany. Possible applications include cancer registries, to compare survival among German and Turkish cancer patients, and health insurance registries, to compare the relative importance of work-related degenerative diseases. In specific circumstances, the algorithm may also be useful in aetiological research.


Introduction

The health of migrants in Europe poses a growing challenge to health services and public health researchers alike. Migration to western Europe is steadily increasing, and many migrant workers, together with their families, have settled in the host countries. In spite of the now 40-year history of work-related migration, migrant populations still face particular social, economic and health disadvantages (Bollini & Siem 1995). They therefore require special public health attention. This applies, for example, to the 2.1 million people of Turkish nationality, many of them former ‘guest workers’ and their families, who are residing in Germany (Statistisches Bundesamt 1998).

To adapt services to the health needs of migrants, disadvantages in their health and health care provision need to be identified. These disadvantages may include differences in risk factor and disease profiles between migrant and host populations, or inequalities in the uptake of preventive interventions and in treatment outcomes. Public health researchers have different methodological options when investigating health disadvantages among migrants. Firstly, they could collect primary data in epidemiological studies specifically designed for this purpose. Yet, data collection may prove prohibitively expensive and so time-consuming that, given the dynamics of migration, study findings may be obsolete by the time they become available. Alternatively, researchers could exploit the large amount of data accumulated in registries of health-related events. Such routine sources of health information can provide valuable information on health status and on uptake and outcome of health care in defined populations (Williams & Wright 1998). In Germany, however, routine data bases often hold no, or inaccurate, information on the national origin of the cases registered (Razum & Zeeb 1998). Moreover, place of birth or nationality alone are no longer sufficient markers of Turkish descent as a second or third generation of Turks has been born in Germany, and a small but increasing proportion of (mainly young) Turks obtain German nationality (Statistisches Bundesamt 1998).

Researchers in Canada, the US and the UK were confronted with comparable problems when attempting to retrieve cases of South Asian, Chinese and Mexican American ethnicity from cancer registries. In response, they successfully screened family names or developed name-based algorithms (Hazuda et al. 1986; Nicoll et al. 1986; Huey-Huey Hage et al. 1990; Harland et al. 1997; Sheth et al. 1997; Martineau & White 1998; Stewart et al. 1999). We examine whether a similar approach can be pursued in Germany; in particular, (i) whether a data base of Turkish first and family names constitutes a sufficiently sensitive and specific tool to retrieve cases of Turkish descent from German registries and (ii) how this tool can contribute towards identifying and monitoring health inequalities faced by Turks in Germany.

Materials and methods

Definition and interpretation of ‘Turkish descent’

We define Turkish descent as family origin from the geo-politically defined nation of Turkey, irrespective of actual place of birth. We delimit Turkish descent from Turkish nationality according to passport, which can change when Turkish migrants take up German nationality; and from ethnicity, which would, in addition, imply a common genetic and cultural background. We interpret Turkish descent in Germany as a proxy for socio-economic disadvantage and minority status. Turkish family names should provide a specific marker for Turkish descent. They were introduced by law as recently as 1934 (Schimmel 1992) following a language reform to ‘purify’ the Turkish vocabulary of foreign elements (Anonymous 1994). They had to have a meaning in the Turkish language and be free of religious content, so they should be distinguishable from Arab–Muslim family names (see Appendix 1 for linguistic details).

Compilation of the name data sets

Visual inspection of names in large sets of records is time-consuming and prone to error. A better option is computerized matching of the records with a name list (Harding et al. 1999). An additional advantage is that names in the records and name lists can be encrypted to ensure data privacy. We compiled two independent sets of Turkish family names (data sets A and B) and one set of first names common in Turkey (data set C). Data set A was based on a complete list of family names of all 83 500 Turkish nationals residing in the federal state of Rhineland-Palatinate, Germany. Data set B was compiled from various sources in Ankara and comprised family names of 25 000 individuals. To ensure consistency of the data sets we converted the Turkish alphabet letters Ç, Ğ, İ and Ş to the corresponding Latin letters. We also converted the ‘umlaute’ Ö and Ü to OE and UE in the Turkish and the German data sets. A computerized matching of data sets A and B with the family names of all four million German residents of Rhineland-Palatinate provided a provisional list of ‘doublets’, i.e. family names that could be either German or Turkish. Non-Turkish names were removed from this list by a Turkish researcher. Data sets A and B were then each checked by two German researchers and one Turkish researcher; additional doublets were flagged and all remaining German family names were removed. Finally, we combined A and B to form a joint Turkish family name data set AB. A similar approach was used for the data set of first names, C. Re-coding and matching were performed using Excel 97 and Stata 6.0 (StataCorp 1999).
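The re-coding and matching steps described above can be illustrated in a few lines of code. The following Python sketch is not the original implementation (which used Excel 97 and Stata 6.0); the function names and the miniature name lists are hypothetical, and upper-casing names before comparison is an assumption made for illustration.

```python
# Letter conversions described in the text: Turkish Ç, Ğ, İ and Ş go to the
# corresponding Latin letters; the umlauts Ö and Ü become OE and UE.
CONVERSIONS = str.maketrans({
    "Ç": "C", "Ğ": "G", "İ": "I", "Ş": "S",
    "Ö": "OE", "Ü": "UE",
})


def normalize(name: str) -> str:
    """Return an upper-cased, re-coded name (upper-casing is an assumption)."""
    return name.strip().upper().translate(CONVERSIONS)


def find_doublets(turkish_names, german_names):
    """Names present in both lists after re-coding, i.e. candidate doublets."""
    turkish = {normalize(n) for n in turkish_names}
    german = {normalize(n) for n in german_names}
    return turkish & german


# Hypothetical miniature name lists, for illustration only.
turkish_sample = ["Kaplan", "Türk", "Öztürk", "Şahin"]
german_sample = ["Kaplan", "Müller", "Tuerk"]

print(sorted(find_doublets(turkish_sample, german_sample)))
```

In such a sketch the intersection only yields a provisional list; as in the procedure described above, the candidates would still need to be reviewed by researchers familiar with both languages.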

Completeness and discriminatory power of family name list

We performed a cumulative count of the Turkish population in Rhineland-Palatinate (data set A) by family name to estimate what proportion of Turkish individuals in a community can be identified by the more common Turkish family names. Also, we matched data set B to A to assess what proportion of the Turkish population in Rhineland-Palatinate can be identified by this partial name list (sensitivity). We then matched the joint data set AB separately with the family name data sets of 9466 Greek and 16 079 nationals of 16 Arab countries resident in Rhineland-Palatinate to estimate the proportion of false positive matches.
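The two completeness measures described above, the cumulative count by family name and the proportion of individuals retrieved by a partial name list, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical and the small population is invented, with one family name per individual.

```python
from collections import Counter


def cumulative_coverage(family_names, top_n):
    """Share of individuals whose family name is among the top_n most common."""
    counts = Counter(family_names)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(family_names)


def list_sensitivity(population_names, name_list):
    """Share of individuals whose family name appears in the given name list."""
    known = set(name_list)
    hits = sum(1 for name in population_names if name in known)
    return hits / len(population_names)


# Hypothetical population: one (already re-coded) family name per individual.
population = ["YILMAZ"] * 5 + ["KAYA"] * 3 + ["DEMIR"] + ["AKSOY"]
partial_list = ["YILMAZ", "DEMIR", "CELIK"]

print(cumulative_coverage(population, top_n=2))    # 8 of 10 individuals
print(list_sensitivity(population, partial_list))  # 6 of 10 individuals
```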

Performance of the name method in a German registry

We matched data sets AB and C with the family names/birth names and first names of all cases in the cancer registry of the Saarland, Germany. We have previously estimated the total number of Turkish cases in the registry to be 192, using reported nationality as a second, independent source in a capture-recapture approach (Razum et al. 2000). Given this information we here calculate sensitivity, specificity and positive/negative predictive value of the name method against the best available standard of Turkish descent. This standard comprises a visual inspection of all records that could possibly be Turkish, i.e. those with a first and/or family name that could possibly be Turkish, and/or with reported Turkish nationality.
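For readers less familiar with these measures, the sketch below shows how sensitivity, specificity and the predictive values follow from a two-by-two table of name-method result against the reference standard. It is written in Python with a hypothetical function name, and the counts are invented for illustration; they are not the figures underlying Table 2.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Standard measures from a 2x2 table (test result vs. reference standard)."""
    return {
        "sensitivity": tp / (tp + fn),  # Turkish cases correctly retrieved
        "specificity": tn / (tn + fp),  # non-Turkish cases correctly excluded
        "ppv": tp / (tp + fp),          # retrieved cases that are truly Turkish
        "npv": tn / (tn + fn),          # excluded cases that are truly non-Turkish
    }


# Hypothetical counts, NOT taken from the cancer registry data.
measures = diagnostic_measures(tp=80, fp=2, fn=120, tn=99_798)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

With a very large number of non-Turkish records, specificity and negative predictive value can both be close to 100% even when sensitivity is modest, which is the pattern reported in the Results.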

Results

Completeness of the family name data set

After manual validation and exclusion of 1118 persons with non-Turkish names (1.34%), data set A comprised 82 425 Turkish individuals living in Rhineland-Palatinate who had 6460 different Turkish family names; data set B comprised 25 700 Turkish individuals living in Ankara who had 8370 different Turkish family names. The joint data set AB contained 12 188 different family names, of which 2642 were present in both data sets. When data set B was matched with the list of Turkish family names in Rhineland-Palatinate (data set A), 40.9% of all Turkish family names were retrieved, corresponding to 76.9% of all Turkish individuals living there. Table 1 shows the cumulative Turkish population figures of Rhineland-Palatinate, ranked by family name. The figures indicate that in this community setting, more than half (56%) of the Turkish individuals can be identified by the 500 most common family names. At the same time, the observation that there are 5728 non-matching names in data set B indicates that the total number of existing Turkish family names must be far higher than 12 188. Names not contained in either data set will, however, be rare. This is evident from the observation that 5436 (64.1%) of the names in data set B occurred only once, yet these names corresponded to only 21.1% of the individuals in the data set.

Table 1.   Cumulative Turkish population count in Rhineland-Palatinate, ranked by family names

Among the 12 188 Turkish family names in the joint data set AB we identified 19 doublets (0.16%) which were present in both data sets A and B, e.g. Kaplan, Kurt and Türk. An additional 25 family names (0.21%) were identified as potentially Turkish or German but we found a small number of individuals by these names only in data set A. The total number of individuals with doublet family names was 1182 (1.1% of the 108 125 individuals in the Turkish sample).
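Several of the counts reported in this subsection can be cross-checked against one another with simple arithmetic. The Python check below uses only figures stated in the text; the variable names are introduced here for illustration.

```python
# Distinct Turkish family names, as reported in the text.
names_a = 6460      # data set A (Rhineland-Palatinate)
names_b = 8370      # data set B (Ankara)
names_both = 2642   # present in both data sets

# Individuals, as reported in the text.
individuals_a = 82_425
individuals_b = 25_700
doublet_individuals = 1182

names_ab = names_a + names_b - names_both  # size of the joint list AB
only_in_b = names_b - names_both           # names in B that did not match A
retrieved_share = names_both / names_a     # share of A's names found via B
doublet_share = doublet_individuals / (individuals_a + individuals_b)

print(names_ab)                         # 12188
print(only_in_b)                        # 5728
print(round(100 * retrieved_share, 1))  # 40.9
print(round(100 * doublet_share, 1))    # 1.1
print(round(100 * 19 / names_ab, 2))    # 0.16 (doublets present in A and B)
print(round(100 * 25 / names_ab, 2))    # 0.21 (additional potential doublets)
```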

Discriminatory power towards Greek and Arab names

When the family name data set AB was matched with the 4601 family names of all Greek residents of Rhineland-Palatinate, 59 matches occurred (equivalent to 0.48% of the Turkish family names in data set AB). More than 60% were unambiguously Turkish (=non-Greek) names, presumably belonging to persons of Turkish origin with a Greek passport. Cases with ambiguous family names can be categorized as Turkish/non-Turkish by first name and/or religion, both of which would be non-Muslim when the origin of the person is Greek.

We next matched data set AB with 5963 different family names of 16 079 residents from Arab countries. A total of 150 name matches occurred, affecting 920 (5.7%) of the Arab residents. More than half (53.5%) of the matches were because of a mix-up of first and family names among Turkish and/or Arab residents in the registry data sets; the remainder apparently were individuals with Turkish family names and Arabic passports, or with ambiguous or possibly misclassified family names. If the Turkish family name list were applied to the Turkish plus Arab population of Rhineland-Palatinate the proportion of false positive matches would be 1.1% [1 − 82 425/(82 425 + 920)].
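The bracketed expression can be evaluated directly; it is algebraically the same as 920/(82 425 + 920). The short Python check below uses only figures from the text. The function and variable names are introduced here for illustration, and the calculation assumes, as the expression does, that all 82 425 Turkish individuals in data set A are matched by the list.

```python
def false_positive_share(true_matches, false_matches):
    """Proportion of all name matches that are false positives."""
    return false_matches / (true_matches + false_matches)


turkish_individuals = 82_425  # all assumed to be matched by the name list
arab_matches = 920            # Arab residents matched by the Turkish name list
arab_residents = 16_079

share = false_positive_share(turkish_individuals, arab_matches)

print(round(100 * share, 1))                          # 1.1
print(round(100 * arab_matches / arab_residents, 1))  # 5.7
```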

Sensitivity, specificity and predictive value in a German registry

Table 2 shows separately the performance of each part of the name list when applied to a German cancer registry. When only cases with two Turkish name elements are sampled, a positive predictive value (PPV) above 98% can be achieved, with specificity and negative predictive value (NPV) both above 99.9%. However, the sensitivity is only about 40%, indicating that less than half of all cases in the registry will be retrieved. The gain in yield through including doublets (exclusively family names in our data set) is low, but Turkish origin can be ruled out without inspection of the case record when neither the first name nor the religion is Muslim. Sensitivity can be substantially increased by reviewing cases with one Turkish name element. This, however, requires additional information such as religion or a visual inspection of the names on the record by a Turkish-speaking person (the ‘best available’ standard in our study), thus increasing workload. When all strategies are applied, the case yield can be increased to 85% of the estimated total number of cases in the registry; the positive predictive value would then by definition be that of the best available standard, i.e. 100%. Figure 1 summarizes the sampling process using the name algorithm.

Table 2.   Performance of the three criteria of the name algorithm against visual inspection of possibly Turkish cases in a German cancer registry

Figure 1.  Use of the name algorithm to sample Turkish cases from a registry.
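The sampling process can be approximated in code. The Python sketch below is a reading of the description in the Results text, not a reproduction of Figure 1: the function and category names are hypothetical, membership of the first name in the first-name list stands in for a ‘Muslim first name’, and sending doublets that cannot be ruled out to manual review is an assumption.

```python
from enum import Enum


class Decision(Enum):
    TURKISH = "accept as Turkish"
    REVIEW = "manual review"
    NOT_TURKISH = "not Turkish"


def classify(family, first, muslim, family_list, first_list, doublets):
    """Assign a record to one of three categories (illustrative rules only)."""
    family_hit = family in family_list
    first_hit = first in first_list

    if family in doublets:
        # Doublet family name: rule out when neither the first name nor the
        # religion points to Turkish origin; otherwise review (assumption).
        if not first_hit and not muslim:
            return Decision.NOT_TURKISH
        return Decision.REVIEW
    if family_hit and first_hit:
        return Decision.TURKISH  # two Turkish name elements
    if family_hit or first_hit:
        return Decision.REVIEW   # one Turkish name element
    return Decision.NOT_TURKISH


# Hypothetical miniature name lists (already re-coded to Latin letters).
family_list = {"YILMAZ", "KAPLAN"}
first_list = {"AYSE", "MEHMET"}
doublets = {"KAPLAN"}

print(classify("YILMAZ", "MEHMET", True, family_list, first_list, doublets))
print(classify("KAPLAN", "HANS", False, family_list, first_list, doublets))
print(classify("YILMAZ", "ANNA", False, family_list, first_list, doublets))
```

Only the middle category requires manual work, which is why the high specificity reported above keeps the review workload small.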

Discussion

We have demonstrated that a list of Turkish family and first names constitutes a highly specific tool to assign Turkish descent and to retrieve Turkish cases from a German registry in a computerized matching process. The name list we compiled, while large, is far from comprehensive; the number of Turkish family names is substantially higher than, for example, that of Chinese family names (Huey-Huey Hage et al. 1990). As a consequence, the sensitivity of the list, if used alone and in a fully computerized matching process, is only around 40%. It must be assumed that even a more extensive list will not be sufficient to identify all individuals of Turkish descent as there appear to be many rare Turkish family names (see Appendix 1). The increase in cost of extending the list may not be matched by a similar increase in completeness, as can be seen from the relatively small increase in sensitivity of list AB over list B alone. However, by introducing religion as an additional criterion and accepting that selected records will have to be reviewed manually, the sensitivity, and thus the completeness of case retrieval, can be substantially increased (Hazuda et al. 1986; Sheth et al. 1997), in our study to 85%. This compares well with other studies (Martineau & White 1998; Stewart et al. 1999). The corresponding low positive predictive value in Table 2 should be interpreted as a loss in efficiency of the name algorithm; when the additional identification criteria are used, false-positive cases can be excluded. The specificity remains high, meaning that only a minute fraction of all cases in a German registry will have to be reassessed – in our example < 0.5%. Misclassification of descent will be rare and largely restricted to Arab cases considered as Turkish; given the current distribution of national origin among the migrant population in Germany, the error thus introduced remains small.

The study of health events among migrants is of interest in two fields. The first is social public health, which aims to identify and ameliorate inequities in health among socio-economically disadvantaged or minority groups. The second is aetiological research, which makes use of differences in genetic makeup and in environment or exposure status. For example, immigrants from less developed countries have been exposed to risk factors associated with ‘western’ lifestyle for a shorter time period than a host population in an industrialized country.

In research on equity the name algorithm can be employed in registries of health-related events to assess whether there are inequalities in health care need, access, uptake, provision and outcome between Turks and Germans. Examples of such comparative analyses include:

• Morbidity profiles, e.g. relative importance of work-related degenerative diseases among all health problems treated, relative importance of particular cancers (measured as proportional incidence ratio); sources: health insurance registries, cancer registries;

• Uptake of preventive interventions, e.g. immunization coverage among school entrants, participation in antenatal care (ANC); sources: local health care authority files, ANC cards;

• Risk profiles, e.g. pre-existing disease among women delivering, risk factor prevalence among hospitalized cardiac cases; source: patients’ hospital records;

• Diagnostic and treatment process, e.g. accomplishment of quality criteria; source: patients’ hospital records;

• Treatment outcome, e.g. proportion of caesarean sections, case fatality of tuberculosis cases, survival among hospitalized cardiac cases; source: patients’ hospital records;

• Stage at the time of diagnosis, waiting time and survival time, e.g. among cancer patients and organ transplant recipients; sources: cancer registries, organ transplant registries.

Following the model proposed by Elkeles and Mielck (1997), we interpret differences in health needs, as well as inequalities in access to, and outcome of, health care between Turks and Germans as intermediate factors in the pathway from socio-economic to health inequity. Such differences, once identified, can help to delimit areas within the health care system that are in need of improvement and serve as indicators to evaluate the effect of interventions. By relying on registry data, results will become available quickly and can be updated regularly.

It should be noted that the examples above all relate events to a total case load, not to a general population denominator (i.e. risks rather than rates are measured). As the sensitivity of the algorithm is modest, i.e. not all Turkish cases are identified, population rates can only be estimated when the same algorithm that has been used to retrieve cases is also used to define the denominator population; otherwise, rate estimates will be erroneously low (Harland et al. 1997). Despite this problem, the name algorithm can be used to draw quasi-random population samples of Turkish descent from population registries, e.g. for case–control studies.
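The effect of the denominator on rate estimates can be made concrete with a small numerical example. The Python sketch below uses invented figures and a hypothetical function name; it assumes that the algorithm has the same sensitivity among cases and in the general population, and that no non-Turkish individuals are retrieved.

```python
def rate_per_100k(cases, population):
    """Events per 100 000 population."""
    return 100_000 * cases / population


# Hypothetical true figures for a population of Turkish descent.
true_cases = 200
true_population = 50_000
sensitivity = 0.4  # share retrieved by the name algorithm (assumed)

retrieved_cases = round(sensitivity * true_cases)          # 80 cases
matched_population = round(sensitivity * true_population)  # 20 000 persons

true_rate = rate_per_100k(true_cases, true_population)
# Numerator from the algorithm, denominator from another source: too low.
biased_rate = rate_per_100k(retrieved_cases, true_population)
# Numerator and denominator both defined by the same algorithm.
consistent_rate = rate_per_100k(retrieved_cases, matched_population)

print(true_rate, biased_rate, consistent_rate)  # 400.0 160.0 400.0
```

Under these assumptions the incomplete retrieval cancels out when the algorithm defines both numerator and denominator, whereas mixing sources underestimates the rate by the factor of the sensitivity.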

For aetiological research, the name algorithm is of more limited use because a name is a poor proxy for genetic similarity; in our study, Kurdish people from Turkey may have a genetic background different from that of Turks but will nevertheless bear Turkish family names.

The name algorithm has additional limitations that apply both to research on equity and aetiological research. Firstly, a family name does not allow recent and past immigration to be distinguished; this information is also lacking in health-related registries in Germany (Razum & Zeeb 1998). This is of relevance because recent or first-generation immigrants may have poorer access to health care than well-established second generation immigrants; the latter, in turn, may be more exposed to risk factors associated with ‘Western’ lifestyle. Secondly, the use of a name algorithm is restricted to particular national origins and host countries (Harding et al. 1999). Many (non-Turkish) Muslim family names, for example, are not specific to a certain country (Martineau & White 1998). The use of religion as an additional criterion is not feasible in countries where religion is not routinely registered, or where a substantial fraction of Turkish migrants are Greek or Armenian Orthodox Christians (>99% of the population of Turkey are Muslim, however). Finally, there will always be a number of individuals in any given registry whose descent cannot be categorized unambiguously by their name or whatever additional information may be available. Feasible and useful alternatives to identify population groups by descent are few; questions on ‘migrant’ status and place of birth would yield ambiguous answers as many Turks were born in Germany and never migrated (Razum & Zeeb 1998); ‘race’ is a criterion prone to inconsistent use (Hahn 1999) and misuse (Bhopal 1997); existing census categories do not match with perceived identity (Rankin & Bhopal 1999) whereas self-classification of ethnic group (Aspinall 1997) may lead to inconsistencies and impracticably large numbers of categories (McKenney & Bennett 1994).

Could the name algorithm be further improved? A revision of the existing list of family and first names by a linguist might reveal that a small number of names classified as Turkish are in fact of Arab origin. The development of a phonetic algorithm could help reduce the effects of typographical errors or misspellings as well as transcription errors (Friedman & Sideli 1992). In the long run, however, it would be desirable to separate the attributes ‘migrant’ and ‘low socio-economic status’, i.e. cease using one as a proxy for the other. Not all immigrants have, or will retain, a low socio-economic status, and the association between ethnic origin, socio-economic status and health is more complex than implied by this simplistic equation (Davey Smith 2000). Instead, more detailed data on immigration status, e.g. place of birth of case and of both parents, ethnic origin of grandparents (Hazuda et al. 1986), and socio-economic status should be collected in health statistics, so disadvantaged groups can be identified irrespective of their descent. Care needs to be taken that a registration of health outcomes by migrant status or ethnic origin does not merely emphasize negative aspects of the health of minority groups, thus giving rise to stigmatization, but that findings are actually applied to improve health services and reduce inequalities (Bhopal 1997).

Acknowledgements

We are grateful to Ms Kerstin Beck for help with compiling the family name data set. Part of our work was supported by a grant from the German Federal Ministry of Health, Kap. 1502, Titel 652 31, 1999.

References

  • 1
    Anonymous (1994) Turkish Language. In: Encyclopaedia Britannica, Britannica CD 98, Multimedia Edition, International Version.
  • 2
    Aspinall PJ (1997) The conceptual basis of ethnic group terminology and classifications. Social Science and Medicine 45, 689–698. DOI: 10.1016/S0277-9536(96)00386-3
  • 3
    Bhopal R (1997) Is research into ethnicity and health racist, unsound, or important science? British Medical Journal 314, 1751–1756.
  • 4
    Bollini P & Siem H (1995) No real progress towards equity: Health of migrants and ethnic minorities on the eve of the year 2000. Social Science and Medicine 41, 819–828.
  • 5
    Davey Smith G (2000) Learning to live with complexity: ethnicity, socioeconomic position, and health in Britain and the United States. American Journal of Public Health 90, 1694–1698.
  • 6
    Elkeles T & Mielck A (1997) Entwicklung eines Modells zur Erklärung gesundheitlicher Ungleichheit. Gesundheitswesen 59, 137–143.
  • 7
    Friedman C & Sideli R (1992) Tolerating spelling errors during patient validation. Computers and Biomedical Research 25, 486–509.
  • 8
    Hahn RA (1999) Why race is differentially classified on US birth and infant death certificates: an examination of two hypotheses. Epidemiology 10, 108–111.
  • 9
    Harding S, Dews H, Simpson SL (1999) The potential to identify South Asians using a computerised algorithm to classify names. Population Trends 97, 46–49.
  • 10
    Harland JO, White M, Bhopal RS et al. (1997) Identifying Chinese populations in the UK for epidemiological research: experience of a name analysis of the FHSA register. Family Health Services Authority. Public Health 111, 331–337.
  • 11
    Hazuda HP, Comeaux PJ, Stern MP et al. (1986) A comparison of three indicators for identifying Mexican Americans in epidemiologic research. Methodological findings from the San Antonio Heart Study. American Journal of Epidemiology 123, 96–112.
  • 12
    Huey-Huey Hage B, Oliver RG, Powles JW, Wahlqvist ML (1990) Telephone directory listings of presumptive Chinese surnames: an appropriate sampling frame for a dispersed population with characteristic surnames. Epidemiology 1, 405–408.
  • 13
    Jastrow O (1985) Die Familiennamen der Türkischen Republik. Bildungsweise und Bedeutung. In: Erlanger Familiennamen-Kolloquium (eds R Schützeichel & M Zender). Aisch, Neustadt, pp. 101–109.
  • 14
    Martineau A & White M (1998) What’s not in a name. The accuracy of using names to ascribe religious and geographical origin in a British population. Journal of Epidemiology and Community Health 52, 336–337.
  • 15
    McKenney NR & Bennett CE (1994) Issues regarding data on race and ethnicity: the Census Bureau experience. Public Health Reports 109, 16–25.
  • 16
    Nicoll A, Bassett K, Ulijaszek SJ (1986) What’s in a name? Accuracy of using surnames and forenames in ascribing Asian ethnic identity in English populations. Journal of Epidemiology and Community Health 40, 364–368.
  • 17
    Rankin J & Bhopal R (1999) Current census categories are not a good match for identity. British Medical Journal 318, 1696.
  • 18
    Razum O & Zeeb H (1998) Epidemiologische Studien unter ausländischen Staatsbürgern in Deutschland: Notwendigkeit und Beschränkungen. Das Gesundheitswesen 60, 283–286.
  • 19
    Razum O, Zeeb H, Beck K et al. (2000) Combining a name algorithm with a capture–recapture method to retrieve cases of Turkish descent in a German population-based cancer registry. European Journal of Cancer 36, 2380–2384. DOI: 10.1016/S0959-8049(00)00333-6
  • 20
    Schimmel A (1992) Herr ‘Demirci’ heisst einfach ‘Schmidt’. Önel-Verlag, Köln.
  • 21
    Sheth T, Nargundkar M, Chagani K et al. (1997) Classifying ethnicity utilizing the Canadian Mortality Data Base. Ethnicity and Health 2, 287–295.
  • 22
    StataCorp (1999) Stata Statistical Software: Release 6.0. StataCorp, College Station, TX.
  • 23
    Statistisches Bundesamt (1998) Statistisches Jahrbuch für die Bundesrepublik Deutschland 1998. Statistisches Bundesamt, Wiesbaden.
  • 24
    Stewart SL, Swallen KC, Glaser SL, Horn-Ross PL, West DW (1999) Comparison of methods for classifying Hispanic ethnicity in a population-based cancer registry. American Journal of Epidemiology 149, 1063–1071.
  • 25
    Williams R & Wright J (1998) Epidemiological issues in health needs assessment. British Medical Journal 316, 1379–1382.
  • 26
    Zygusta L (1994) Family names. In: Encyclopaedia Britannica, Britannica CD 98, Multimedia Edition, International Version.

Appendix

Appendix 1: Linguistic background of Turkish family names

In 1928, Turkey replaced the Arabic script by the Latin alphabet (Anonymous 1994). Six years later, a law introduced Turkish family names for all Turkish nationals, including those of Kurdish descent. Old Persian or Arabic language elements in already existing family names were removed or literally translated into Turkish (Schimmel 1992). The new Turkish family names were not allowed to have a religious content and hence differ from family names in neighbouring Islamic countries. Some of the new family names have a similar derivation as those in western European countries (Zygusta 1994), describing occupation or place names, being patronymic or derived from first names. Many others portray or symbolize positive attributes such as beauty, strength or courage (Jastrow 1985; Schimmel 1992). Accordingly, Turkish family names have a meaning in the Turkish language. In western European countries, family names came into use between the 11th and 16th century (Zygusta 1994), and many have since undergone substantial changes in spelling. Modifications of family names over time are not yet a commonly observed phenomenon in Turkey, and only minor variations in spelling exist: the letter ‘a’ is in some instances exchangeable with ‘e’, the letter ‘o’ with ‘u’, and the letter ‘d’ with ‘t’ (Schimmel 1992). Marked regional preferences of family names have not been reported and are unlikely to persist in view of a massive rural-urban migration. In summary, Turkish family names should be easily identifiable as such.
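The minor spelling variations mentioned above (a/e, o/u, d/t) suggest one simple way to make name matching more tolerant. The Python sketch below is an illustration only and was not part of the study: the function names and the example name pairs are hypothetical, and because the letters are exchangeable only ‘in some instances’, treating them as always equivalent will merge some genuinely different names. Such a key would therefore be suited to proposing candidates for review rather than to automatic assignment.

```python
# Collapse each pair of exchangeable letters onto one representative:
# E -> A, U -> O, T -> D (the choice of representative is arbitrary).
VARIANT_CLASSES = str.maketrans({"E": "A", "U": "O", "T": "D"})


def variant_key(name: str) -> str:
    """Matching key that ignores a/e, o/u and d/t differences."""
    return name.upper().translate(VARIANT_CLASSES)


def same_up_to_variants(name_a: str, name_b: str) -> bool:
    """True when two names differ only by the exchangeable letters."""
    return variant_key(name_a) == variant_key(name_b)


# Hypothetical name pairs, for illustration only.
print(same_up_to_variants("Demir", "Temir"))
print(same_up_to_variants("Kaya", "Kurt"))
```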

Yet a practical problem arises, related to the morphology of the Turkish language. The primary word stem is expanded with different suffixes designating grammatical notions (Anonymous 1994) or attributes (Schimmel 1992). As a consequence, words, and hence also family names, with a considerable number of elements can be constructed. Schimmel (1992) illustrates this with the three Turkish names Mut, Mutlu, and Mutlukul; she cites Uzunagacaltindayataruyaroglu as an example of a family name generated by expanding the word stem with a particularly large number of suffixes. It follows that there is a potentially very large number of Turkish family names. In other words, it will not be possible to construct a complete list of all Turkish family names.

First names common in Turkey are far less specific as they often derive from Arab or central Asian roots (Schimmel 1992); or, like ‘Ali’ and ‘Mustafa’, are common in many Muslim countries. Christian first names, however, are very rarely used.