Application of the Face2Gene tool in an Italian dysmorphological pediatric clinic: Retrospective validation and future perspectives

Neurodevelopmental disorders exhibit recurrent facial features that can suggest the genetic diagnosis at a glance, but recognizing subtle dysmorphisms is a specialized skill that requires very long training. Face2Gene (FDNA Inc) is an innovative computer‐aided phenotyping tool that analyzes patients' portraits and suggests 30 candidate syndromes with similar morphology in a prioritized list. We hypothesized that the software could support even expert physicians in the diagnostic workup of genetic conditions. In this study, we assessed the performance of Face2Gene in an Italian dysmorphological pediatric clinic. We uploaded two‐dimensional face pictures of 145 children affected by genetic conditions with typical phenotypic traits. All diagnoses had previously been confirmed by cytogenetic or molecular tests. Overall, the software's differential included the correct syndrome in most cases (98%). We also evaluated the efficiency of the algorithm according to the rareness of the genetic conditions. All “common” diagnoses were correctly identified, most of them with high diagnostic accuracy (93% in top‐3 matches). Face2Gene performed well even for ultra‐rare genetic conditions (75% within top‐3 matches and 83% within top‐10 matches). Finally, the performance for the most common pediatric syndromes was calculated. Expert geneticists may not need computer support to recognize common syndromes, but our results indicate that the tool can be useful not only for general pediatricians but also in dysmorphological clinics for ultra‐rare genetic conditions.

in faces, as well as other minor anomalies, for example on hands or feet. This skill requires specialized knowledge and long clinical experience in order to be exposed to rare and ultra-rare genetic conditions (Marwaha et al., 2021).
Several genetic databases have been developed to associate each genetic condition with a list of clinical characteristics (for example, congenital anomalies, growth alterations, or neurodevelopmental disorders) described using Human Phenotype Ontology (HPO) terms (Gene Ontology Consortium et al., 2000). The most popular are POSSUM (Pictures Of Standard Syndromes and Undiagnosed Malformations) (Bankier & Keith, 1989), the London Dysmorphology Database (Evans, 1999), the search routines available with the Online Mendelian Inheritance in Man (Hamosh et al., 2005), and Phenomizer (Köhler et al., 2009). Clinicians can enter one or more features of the patient, and the software presents a list of candidate diagnoses (Köhler et al., 2009). However, the use of these applications in dysmorphology is limited by the fact that the choice of HPO terms related to specific facial dysmorphisms is strictly based on the clinician's personal experience.
Technology has tried to provide an answer to this issue as well, and today physicians can be supported in clinical practice by computer-aided facial phenotyping tools (Hsieh et al., 2022). Face2Gene (FDNA Inc., USA) is one of the most popular online applications: it analyzes two-dimensional (2D) frontal facial photographs and automatically produces a prioritized list of 30 syndromes with similar gestalt (Gurovich et al., 2019; Figure 1). The pattern recognition is powered by a deep convolutional neural network (DCNN) referred to as the DeepGestalt algorithm. Currently, Face2Gene can compare a portrait to about 300 different syndromic phenotype models (Gurovich et al., 2019).
Previous studies evaluated Face2Gene performance in different contexts. DeepGestalt was tested in an experiment using 502 pictures collected from clinical cases and publications and achieved 91% accuracy in presenting the correct diagnosis in the top ten list (top-10 accuracy) (Gurovich et al., 2019). Several studies assessed Face2Gene performance for specific genetic conditions, in particular for Cornelia de Lange syndrome (Basel-Vanagaite et al., 2016; Latorre-Pellicer et al., 2020), Pallister Killian syndrome (Liehr et al., 2018), Down syndrome (Mishima et al., 2019; Vorravanpreecha et al., 2018), PMM2-Congenital Disorder of Glycosylation (Martinez-Monseny et al., 2019), and inborn errors of metabolism (Pantel et al., 2020), concluding that Face2Gene can be a valid support in the diagnostic workup. Only a few groups around the world have evaluated the performance of the algorithm in cohorts of patients with different molecular diagnoses to assess the real accuracy of Face2Gene in clinical practice (Elmas & Gogus, 2020; Marwaha et al., 2021; Mishima et al., 2019; Narayanan et al., 2019; Pascolini et al., 2022; Zarate et al., 2019).
In this study, we assessed the performance of Face2Gene in facial gestalt recognition in a large Italian cohort of children with a con-
Genetic conditions were categorized according to the rareness of the syndrome. We classified as "common" diagnoses those rare conditions whose frequency was lower than 5:100,000 but higher than 1:100,000, and as "ultra-rare" diagnoses those conditions whose frequency was lower than 1:100,000 (Richter et al., 2018).
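The rareness thresholds can be written as a small helper. This is only an illustrative sketch: the function name `classify_rareness`, the choice of expressing frequency as cases per 100,000, and the handling of values exactly on a threshold or at 5:100,000 and above are assumptions, since the text does not specify them.

```python
from typing import Optional


def classify_rareness(cases_per_100k: float) -> Optional[str]:
    """Map a syndrome frequency (cases per 100,000) to a rareness label.

    Assumptions not stated in the text: a frequency of exactly 1:100,000
    is treated as "ultra-rare", and frequencies of 5:100,000 or more fall
    outside both categories and return None.
    """
    if cases_per_100k <= 0:
        raise ValueError("frequency must be positive")
    if cases_per_100k <= 1:
        return "ultra-rare"
    if cases_per_100k < 5:
        return "common"
    return None


# Hypothetical frequencies, used only to exercise the thresholds.
for frequency in (0.5, 2.5, 10.0):
    print(frequency, classify_rareness(frequency))
```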
Portraits were uploaded anonymously on Face2Gene and analyzed without additional clinical features.
The tool suggested a prioritized list of 30 syndromes with similar facial morphology; for each condition in the list, the tool provided a "Gestalt score" (high, medium, or low). According to the position of the correct diagnosis in the list, patients were classified into four groups. Group A: patient's diagnosis included in the first three conditions suggested by Face2Gene. Group B: patient's diagnosis included between the 4th and the 10th suggested conditions. Group C: patient's diagnosis included between the 11th and the 30th suggested conditions. Group D: diagnosis absent from the list. We calculated the percentage of cases in which the correct diagnosis was in the first 3, 10, and 30 conditions suggested by the software (top-3, top-10, and top-30 accuracy, respectively).
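The grouping rule and the top-k accuracy can be sketched in a few lines. The function names `assign_group` and `top_k_accuracy`, the use of `None` for a diagnosis absent from the list, and the example ranks are hypothetical and serve only to illustrate the definitions given here; they are not part of the Face2Gene software.

```python
from collections import Counter
from typing import Iterable, Optional


def assign_group(rank: Optional[int]) -> str:
    """Return group A-D for the 1-based rank of the confirmed diagnosis.

    rank=None means the diagnosis was absent from the 30-item list.
    """
    if rank is None:
        return "D"
    if not 1 <= rank <= 30:
        raise ValueError("rank must be between 1 and 30, or None")
    if rank <= 3:
        return "A"
    if rank <= 10:
        return "B"
    return "C"


def top_k_accuracy(ranks: Iterable[Optional[int]], k: int) -> float:
    """Fraction of patients whose confirmed diagnosis is within the first k."""
    ranks = list(ranks)
    if not ranks:
        raise ValueError("at least one patient is required")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Hypothetical ranks for five patients (not study data).
example_ranks = [1, 2, 7, 15, None]
print(Counter(assign_group(r) for r in example_ranks))
for k in (3, 10, 30):
    print(f"top-{k} accuracy: {top_k_accuracy(example_ranks, k):.0%}")
```

Because the groups are nested by rank, top-10 accuracy counts groups A and B together, and top-30 accuracy counts groups A, B, and C.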
We analyzed the tool's performance according to the rareness of the genetic conditions and for the most frequent phenotypes in our group of patients (Cornelia de Lange syndrome, Kabuki syndrome, Williams syndrome, Wolf Hirschhorn syndrome).

3 | RESULTS
Face2Gene presented the correct diagnosis in the list in 142 out of 145 cases (98%; top-30 accuracy). The patient's correct diagnosis was included in the first 3 suggested diagnoses in 130 cases (Group A, 90%; top-3 accuracy), between the 4th and the 10th suggested diagnosis in 9 cases (Group B, 6%), and between the 11th and the 30th suggested diagnosis in 3 cases (Group C, 2%). Face2Gene failed to present the correct diagnosis in only 3 cases (Group D, 2%), although the syndromes were included in the database of the algorithm. These 3 patients were affected by Arboleda-Tham Syndrome (2 patients) and Verheij Syndrome (1 patient).
In group A the Gestalt score was high in 98 out of 130 patients (75%), medium in 22 patients (17%), and low in 10 patients (8%). In group B there were no patients with a high Gestalt score, but 2 out of 9 patients were classified with a medium Gestalt score (22%) and 7 patients with a low Gestalt score (78%). In group C the Gestalt score was low in all patients (100%).

3.1 | Analysis according to the rareness of the syndrome
Results are summarized in Table 1.
Rare diagnoses. Face2Gene suggested the correct diagnosis in the list for all 121 patients affected by a rare condition (100%; top-30 accuracy). The right diagnosis was included in the first 3 suggested diagnoses in 112 cases (Group A, 93%; top-3 accuracy), between the 4th and the 10th suggested diagnosis in 7 cases (Group B, 6%), and between the 11th and the 30th suggested diagnosis in 2 cases (Group C, 2%). In group A 91 out of 112 patients received a high Gestalt score (81%), 16 patients a medium Gestalt score (14%), and 5 patients a low Gestalt score (5%). In group B there were no patients with a high Gestalt score, but 2 out of 7 patients were associated with a medium Gestalt score (29%) and 5 patients with a low Gestalt score (71%). In group C (2 patients) the Gestalt score was low in all cases (100%).
Ultra-rare diagnoses. 24 patients were included in this group. For 21 out of 24 patients (88%, top-30 accuracy) the right diagnosis was included among those suggested by Face2Gene, but for 3 patients (Group D, 13%) the right diagnosis was not suggested. In 18 cases the patient's condition was included in the first 3 suggested diagnoses (Group A, 75%; top-3 accuracy), in 2 cases (Group B, 8%) between the 4th and the 10th, and in only one case between the 11th and the 30th (Group C, 4%). In group A the Gestalt score was high in 8 out of 18 patients (44%), medium in 6 patients (33%), and low in 4 patients (22%). In group B and in group C the Gestalt score was low in 100% of the patients.
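The cumulative top-k figures follow directly from the group counts reported above. As a worked check, the sketch below recomputes them per stratum; the helper name `cumulative_accuracy` and the dictionary layout are hypothetical, while the counts are those given in the text.

```python
def cumulative_accuracy(counts: dict) -> dict:
    """Derive top-3, top-10, and top-30 accuracy from group counts A-D."""
    total = sum(counts.values())
    top3 = counts["A"]
    top10 = top3 + counts["B"]
    top30 = top10 + counts["C"]
    return {"top-3": top3 / total, "top-10": top10 / total, "top-30": top30 / total}


# Group counts as reported in the Results section.
reported = {
    "all": {"A": 130, "B": 9, "C": 3, "D": 3},
    "rare": {"A": 112, "B": 7, "C": 2, "D": 0},
    "ultra-rare": {"A": 18, "B": 2, "C": 1, "D": 3},
}

for stratum, counts in reported.items():
    accuracy = cumulative_accuracy(counts)
    summary = ", ".join(f"{name} {value:.0%}" for name, value in accuracy.items())
    print(f"{stratum} (n={sum(counts.values())}): {summary}")
```

The rare and ultra-rare counts add up to the overall counts in every group, which makes a convenient consistency check on the reported figures.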

3.2 | Analysis for the most frequent phenotypes
Results are summarized in Table 2.
Cornelia de Lange syndrome. In all 45 patients affected by this syndrome the right diagnosis was suggested by Face2Gene (100%, top-30 accuracy). 42 patients (93%, top-3 accuracy) were included in group A, 2 patients were classified in group B (4%), and only one patient was in group C (2%). All 22 patients with a severe phenotype were included in group A, and the Gestalt score was high in 20 of them (91%), medium in one case (5%), and low in one case (5%). 23 patients showed a mild phenotype: 20 of them were in group A (87%), 2 in group B (9%), and only one in group C (4%). Among patients with a mild phenotype, the Gestalt score was high in 17 patients (74%), medium in 2 patients (9%), and low in 4 patients (17%).
Kabuki syndrome. 26 patients affected by this syndrome were included in this study, and in all cases the right diagnosis was suggested by Face2Gene (100%, top-30 accuracy). 24 patients (92%) were included in group A and 2 patients were classified in group B (8%), resulting in 100% top-10 accuracy. In group A the Gestalt score was high in 21 out of 24 patients, medium in 2 patients, and low in only one patient (81%, 8%, and 4% of all 26 patients, respectively). In group B the Gestalt score was low in 100% of patients.
For the patients with Wolf-Hirschhorn syndrome, the Gestalt score in group A was high in 4 cases (50%) and medium in 4 cases (50%). In group B the Gestalt score was medium in one patient (50%) and low in the other (50%).

4 | DISCUSSION
The study evaluated the performance of the Face2Gene tool in the recognition of facial gestalt in a large pediatric cohort.
Overall, the software presented the correct syndrome in most cases (98%) with high diagnostic accuracy, as the correct syndrome was included in the first three suggested conditions in 90% of cases.
Notably, when the tool did not present the correct diagnosis, it never suggested an alternative misdiagnosis with a high Gestalt score.
All common diagnoses were correctly proposed by Face2Gene, most of them within the first 3 suggested syndromes (93%).
In particular, we found that the tool showed excellent recognition ability for Cornelia de Lange syndrome, Kabuki syndrome, Williams syndrome, and Wolf Hirschhorn syndrome, which are among the most frequent pediatric genetic conditions with a typical gestalt. An expert geneticist generally does not need computer support to recognize these syndromes, but the software could be helpful in cases with subtle dysmorphism. Moreover, the ability to recognize common conditions may be especially helpful to other professionals, such as pediatricians, neuropsychiatrists, and therapists: in the presence of a child with peculiar facial features, the software can suggest the usefulness of a genetic clinical evaluation.
On the other hand, in our study Face2Gene also performed well in recognizing the facial gestalt of children affected by ultra-rare conditions (75% within top-3 matches and 83% within top-10 matches). This result suggests that the tool can also be useful to consultant geneticists, as long clinical experience is required to be exposed to rare and ultra-rare genetic conditions. Face2Gene can guide the choice of the most appropriate genetic test, and it can also help clarify the real pathogenicity of variants of uncertain significance identified with broad-spectrum genetic analyses, such as whole exome sequencing.
TABLE 1 Face2Gene diagnostic accuracy in our cohort according to the rareness of the syndrome.
The top-10 accuracy (96%) was higher than in previous studies.
Mishima et al. analyzed Face2Gene performance in a cohort of 74 Japanese patients with 47 congenital dysmorphic syndromes (Mishima et al., 2019), and the correct syndrome was identified within the top 10 suggested list in 86% of cases. Narayanan et al. found that the software predicted the correct diagnosis in 70% of 37 Indian children with a definite molecular or cytogenetic diagnosis and recognizable facial dysmorphism (Narayanan et al., 2019) ), where the tool's performance is higher; in our country the rate of consanguineous marriages is not high and consequently the prevalence of ultra-rare recessive conditions is very low. In addition, our cohort is enriched with patients affected by Cornelia de Lange Syndrome, a condition with very typical facial dysmorphisms, as the study was performed in an Italian reference center for this syndrome.
Finally, our cohort is made up of children of Caucasian ethnicity, a population on which the software has been more extensively trained.

5 | CONCLUSION
Our retrospective validation confirms the high performance of this technology in recognizing the facial gestalt of syndromic conditions and indicates that this tool could be useful both for expert genetic consultants and for "non-professionals." The high accuracy in our Caucasian cohort, especially for common conditions, suggests that the performance of Face2Gene can be improved by training the software to recognize even ultra-rare conditions and to analyze the ethnic traits of different populations. Dysmorphology centers around the world are invited to collaborate toward this objective by collecting and sharing patients' portraits through anonymous and secure systems in compliance with privacy rules.
Further prospective studies are needed to assess the usefulness of Face2Gene in clinical practice, for example by evaluating whether the tool shortens the diagnostic process and, on the other hand, the risk of incurring a wrong diagnosis.
Technological evolution puts new important instruments in our hands: we must use them responsibly for the wellness of our patients.
firmed molecular or cytogenetic diagnosis referred to a dysmorphological pediatric clinic. The efficiency of the algorithm was also evaluated considering the rareness of the genetic syndrome. Finally, the performance of the face analysis technology was assessed for specific conditions with typical dysmorphic features, including Cornelia de Lange syndrome, Kabuki syndrome, Williams-Beuren syndrome, and Wolf-Hirschhorn syndrome.
FIGURE 1 (a) Face2Gene Graphical User Interface. The software asks the user to upload a frontal facial photo, with or without additional phenotypic features. (b) DeepGestalt algorithm. The software converts the patient's photo into mathematical descriptors and compares them with the facial models on which the algorithm was previously trained. The graphical heatmap visualizes the degree of similarity between the photograph of a patient affected by Myhre syndrome and the corresponding composite image. (c) Face2Gene CLINIC's RARE tab. The software elaborates a prioritized list of 30 syndromes with similar gestalt. Myhre syndrome is suggested as the first condition, with a high Gestalt score.

2 | MATERIALS AND METHODS
The retrospective study was conducted at Sant'Anna Hospital's Pediatrics Clinic, a dysmorphological pediatric clinic in San Fermo della Battaglia, Como, Italy. The investigations were carried out in accordance with the principles laid down in the 2013 revision of the Declaration of Helsinki. Pediatric patients affected by genetic conditions whose facial model was available on Face2Gene were enrolled. A confirmed cytogenetic or molecular diagnosis was required. Photographs were taken during the follow-up medical genetic examination, from November 2021 to July 2022, upon parents' informed consent. Portraits had to be of good quality and show the complete frontal face from hairline to chin, with both eyes visible. We selected 145 patients who met all inclusion criteria: 92 (63%) male and 53 (37%) female, aged from 0 to 18 years (mean age 7 years). The included genetic syndromes, with the corresponding number of patients in brackets, were the following: Cornelia de Lange Syndrome (45), Kabuki Syndrome (26), Williams-Beuren Syndrome (21), Wolf-Hirschhorn Syndrome (10), Angelman Syndrome (6), CHARGE Syndrome (5), Rubinstein-Taybi Syndrome (5), Koolen De Vries Syndrome (4), KBG Syndrome (4), Noonan Syndrome (3), Sotos Syndrome (2), Arboleda-Tham Syndrome (2), Mandibulofacial Dysostosis with Microcephaly Syndrome (2), Kleefstra Syndrome (1), 22q11.2 microdeletion Syndrome (1), Cohen Syndrome (1), Silver Russell Syndrome (1), Smith-Magenis Syndrome (1), Alpha-Thalassemia X-linked Intellectual Disability Syndrome (ATRX) (1), Myhre Syndrome (1), Verheij Syndrome (1), Glass Syndrome (1), Xia Gibbs Syndrome (1).
Williams-Beuren syndrome. In all 21 patients affected by this syndrome the right diagnosis was in the list. 20 patients (95%, top-3 accuracy) were classified in group A and only one patient was included in group B (5%), resulting in 100% top-10 accuracy. In group A the Gestalt score was high in 16 out of 20 patients (80%) and medium in 4 patients (20%); the only patient in group B had a low Gestalt score.
Wolf-Hirschhorn syndrome. 10 patients affected by this syndrome were analyzed, and the right diagnosis was proposed by Face2Gene in all cases. 8 patients were in group A (80% top-3 accuracy) and 2 patients in group B (20%), resulting in 100% top-10 accuracy.