Bruise dating using deep learning

Abstract Bruise dating can have important medicolegal implications in cases of family violence and violence against women. However, studies show that medical specialists achieve 50% accuracy in classifying a bruise by age, mainly due to the variability of the images and the color of the bruise. This research proposes a model, based on deep convolutional neural networks, for bruise dating using only images, classified by age ranges from 0–2 days to 17–30 days, plus images of healthy skin. A dataset of 2140 experimental bruise photographs was constructed, for which a data capture protocol and a preprocessing procedure are proposed. In addition, 20 classification models were trained with the Inception V3, Resnet50, MobileNet, and MnasNet architectures, using combinations of transfer learning, cross‐validation, and data augmentation. Numerical experiments show that the classification models based on MnasNet perform best, reaching 97.00% precision and sensitivity and 99.50% specificity, exceeding the 40% precision reported in the literature. It was also observed that the precision of the model decreases with the age of the bruise.

| 337 TIRADO AND MAURICIO
used, and also consider variables such as sex, age, and skin color of the person. However, due to the variability in the evolution of bruises [2], there is still no reliable method to determine their age. This is mainly explained by biological variability, such as the location, size, depth, and degree of the injury, as well as the race of the subject. Health conditions such as diabetes, hemophilia, and leukemia can also affect the appearance and healing of bruises. Likewise, studies in the United States and Europe mainly include white-skinned people in their experiments [7,8,9], a different reality from that of other regions such as Latin America, where miscegenation is characteristic.
An exhaustive search in Web of Science, Scopus, and Google Scholar shows that, to date, there are no publications on bruise dating that use computer science techniques. However, there are artificial intelligence techniques that can process images and differentiate them for classification purposes. Publications show that deep learning achieves an accuracy of 90.16% for the diagnosis of glaucoma [10], 82.95% for psoriasis [11], and 82.3% for lung diseases [12]. In addition, in [13], it is applied to the diagnosis of melanoma, which affects the skin and presents variability in color, like a bruise.
In this work, a deep learning model for bruise dating, based exclusively on images and convolutional neural networks, is proposed, for use on healthy living human beings only. MnasNet gave better accuracy than the other three architectures evaluated. In addition, it is optimized for use on mobile devices, where a model must be small and fast, so it balances accuracy and latency. To validate the model, the TensorFlow, Keras, and OpenCV libraries were used, and tests were run on a dataset of 2140 images. Likewise, a protocol for capturing bruise photographs is proposed to guarantee image quality and high precision in the results.
This work is organized in five sections. Section two reviews the literature on bruise dating. The bruise dating model and the photograph capture protocol are described in section three. The validation of the model, through the implementation of a system considering six classes by age ranges and the "Healthy skin" class, is presented in section four. Finally, the conclusions, limitations, and recommendations are presented in section five.

| RELATED WORKS
There are few works on bruise dating, and these focus on the fields of medicine, biology, and genetics. In forensic medicine, for example, [6] proposes tristimulus colorimetry as a method to objectively determine the color of a bruise in dark-skinned people using the CIELAB color space; the method reaches 95% accuracy for the color of the bruise and could be used for dating. In [15], tristimulus colorimetry is shown to be reliable for evaluating the color of bruises generated experimentally with paintballs fired by compressed air guns. The use of a bilirubin meter as a bruise dating method is evaluated in [16], where it is found that the difference in bilirubin level between healthy skin and bruise peaks between days 4 and 5 and decreases in the following days. An alternate light source, in the visible and ultraviolet spectrum, is used in [17] to evaluate its effectiveness in the detection of bruises, compared to white light.
Detection is a step that precedes bruise dating. In medicine and biology, histological analysis is used for the dating of bruises; however, [4] shows that it is not reliable due to the high variability of the response of human tissue to trauma caused by a blow. In the field of genetics, [5] proposes gene expression signatures as a method to determine the strength of the impact and the age of a bruise in pigs, where differences of ±2 hours are obtained for ages ranging from 1 to 10 hours, and, due to the physiological and immunological similarity of pig skin to human skin, it is suggested that the results of the study be extrapolated to humans.
An exhaustive review up to January 2020, in Web of Science, Scopus, and Google Scholar, based on "bruise dating" search strings, shows that there are no works that address bruise dating through computational techniques such as artificial intelligence and image processing.
However, there are image processing and artificial intelligence works that address problems similar to bruise dating. The "Relative Attribute SVM + Learning" algorithm is proposed in [18] for the estimation of age based on photographs of human faces; it assumes that the presence of certain facial attributes at different ages keeps a relative order between age groups. In [19], the age and gender of a person are determined by analyzing panoramic dental X-ray images using image processing and a multilayer perceptron neural network. In another case, [20] proposes a deep hybrid model for classification by age range of human face images, where deep convolutional neural networks are used. The use of a Deep Belief Network, based on rough set theory, for the classification of medical images of lung scans is proposed in [12]. A new algorithm called "Ensemble Margin Instance Selection" (EMIS), based on Random Forest, is proposed in [21] to select the most informative data to optimize the classification of white blood cells. Finally, [22] proposes the use of a convolutional neural network to detect the gender (male or female) of a person based on a photograph of their eyes taken with the front camera of a smartphone, in everyday conditions with a normal camera. Given the above, image processing and artificial intelligence could be used for bruise dating.
To obtain better results, image processing methods include a preprocessing stage that involves the segmentation of the area of interest. [23] proposes using a deep convolutional neural network (DCNN) for the separation of the foreground and the background of an image, with a mean square error of 3.53%. [10] proposes an approach to the automatic diagnosis of glaucoma called "Super pixels for semi-supervised segmentation" (SP3S) using segmentation, with an F-score of 86.43%. [11] uses a deep convolutional neural network for the segmentation of skin psoriasis biopsy images, differentiating the dermis, epidermis, and non-tissue regions, where 89% accuracy is achieved.
In relation to bruise dating, the medical literature reports the use of temporal scales based on the coloring of the bruise to estimate its age. One of the pioneering scales is that of Camps, which establishes a scale of levels, where the color of the bruise is red immediately after being inflicted, then becomes dark purple or black, turns green between the fourth and fifth day, turns yellow between days seven and 10, and disappears after 14 or 15 days. From there, various color scales have been established for bruise dating.
A literature review of bruise dating scales up to 1991 is presented in [2]. Table 1 shows four color scales for bruise dating that are widely used in the literature. The scales are similar in terms of the sequence of changes in the color of the bruise, but differ in the times of these changes, although they all end with green, then yellow.

| BRUISE DATING MODEL
A bruise dating model using deep learning is proposed, which allows the age of a bruise on a living human being to be determined from a photographic image of it. Its purpose is to determine the age of a bruise more accurately, objectively, and quickly than dating performed by a human specialist (coroner or dermatologist). The main components are the protocol for image capture, image preprocessing, and the trained classification model based on convolutional neural networks.
As shown in Figure 2, the bruise dating model receives as input a photograph of a bruise, obtained with a camera following a protocol. The image is then preprocessed to obtain a clean and segmented image of the bruise. This image is processed by the classifier, which implements a previously trained convolutional network model and determines the estimated age of the bruise, which is the output of the model.
The use of convolutional neural networks is justified because bruise dating by specialists has low accuracy rates, reaching only 40% for bruises less than 48 hours old, a percentage that decreases as the age of the bruise increases [7]. In addition, these networks give good results, comparable to those of medical specialists, for similar problems such as sarcoma [24] and melanoma [13] (conditions that affect the skin and whose diagnosis is visual, based on images). Therefore, the objective of this study is to build a bruise dating model that exceeds the accuracy reported in the literature.

| Protocol for image capture
For the bruise dating model to estimate the correct age of the bruise, the photographs must be captured following a series of steps defined under specific conditions. These steps guarantee the quality of the bruise photographs, which are the only feature used in this study. Table 2 describes the protocol for bruise image capture.

| Data preprocessing
The input, both for the classification process and for the learning process, consists of the photographs of the bruise captured following the protocol and stored in a repository. These images are preprocessed by binarizing the grayscale image to segment the bruise, calculating the centroid of the bruise, and cropping the image to a size of 400 × 400 pixels.
The steps in this process are as follows:
1. Convert a copy of the original image to grayscale.
In this way, the original photographs are preprocessed, and a square-shaped image is obtained with the bruise centered in it. To do this, a script was developed using the Python programming language and the OpenCV library. Figure 3 shows an original and a preprocessed photograph, the latter of 400 × 400 pixels, with the centroid of the bruise in the center of the image.
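The preprocessing described above (grayscale conversion, binarization, centroid calculation, and square crop) can be sketched as follows. This is a minimal illustration in plain Python on a nested-list image, not the authors' script, which used OpenCV. The threshold value, the rule that the bruise is darker than the surrounding skin, the fallback to the image center, and all function names are assumptions made for the example.

```python
def to_grayscale(image):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale intensities."""
    # ITU-R BT.601 luma weights, a common choice for RGB-to-gray conversion.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in image]


def binarize(gray, threshold=100):
    """Mark pixels darker than the threshold as bruise (1), the rest as skin (0).

    The threshold and the 'bruise is darker than skin' rule are assumptions.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def centroid(mask):
    """Return the (row, col) centroid of the foreground pixels, or None if empty."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return None
    return row_sum / total, col_sum / total


def crop_window(center, size, height, width):
    """Return (top, left) of a size x size window centered on `center`,
    shifted where needed so that it stays inside the image."""
    top = int(round(center[0])) - size // 2
    left = int(round(center[1])) - size // 2
    top = max(0, min(top, height - size))
    left = max(0, min(left, width - size))
    return top, left


def preprocess(image, size=400, threshold=100):
    """Grayscale -> binarize -> centroid -> square crop of the original image."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the requested crop")
    mask = binarize(to_grayscale(image), threshold)
    # Hypothetical fallback: use the image center when no bruise pixel is found.
    center = centroid(mask) or (height / 2, width / 2)
    top, left = crop_window(center, size, height, width)
    return [row[left:left + size] for row in image[top:top + size]]
```

With OpenCV, the same stages would typically map to `cv2.cvtColor`, `cv2.threshold`, and `cv2.moments`, followed by array slicing for the crop.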

| Learning
The learning model for bruise classification by age range is based on convolutional neural networks. The input is the photograph, captured following the protocol, and the actual age of the injury. The photographs must be previously preprocessed and organized in folders, according to the age (in days) of the injury. In addition, the classes to be used must be established (see, e.g., the scales in Table 1), and the images must be grouped into folders according to these classes. If the dataset is not balanced, data augmentation is suggested, since it allows greater precision [25].
Table 2 (excerpt). Transfer the photographs from the camera to a folder on the computer where they will be processed by the bruise dating model. P13: Avoid reducing the size of the files or losing the resolution of the images during the transfer of the files.
Beforehand, 10% of the images of each class should be set aside as the "test" dataset to prevent data leakage, that is, to avoid using test data during training.
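A per-class hold-out of this kind could look like the sketch below. It assumes the images are available as a dictionary that maps each class name to a list of file paths; that layout, the fixed random seed, and the function name are illustrative choices rather than details from the study.

```python
import random


def split_test_set(images_by_class, test_fraction=0.10, seed=42):
    """Set aside `test_fraction` of each class as test data before any training.

    `images_by_class` maps a class name to a list of image paths (hypothetical layout).
    Returns (train_val, test) dictionaries with the same keys and disjoint contents.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_val, test = {}, {}
    for label, paths in images_by_class.items():
        shuffled = list(paths)  # copy so the caller's list is not modified
        rng.shuffle(shuffled)
        n_test = max(1, round(len(shuffled) * test_fraction))
        test[label] = shuffled[:n_test]
        train_val[label] = shuffled[n_test:]
    return train_val, test
```

Splitting each class separately keeps the class proportions of the test set close to those of the full dataset, which matters when some classes (such as "Healthy skin") have few images.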
In Figure 4, the experimentation cycle to find the best bruise dating model is shown. This cycle is repeated as many times as

| Classification
The convolutional neural network model that obtained the highest precision during the learning phase can be used to classify new photographs of bruises. The model consists of a file that contains the structure, weights, thresholds, and parameters of the network.
The classification model can be implemented as an API (Application Programming Interface), to be consumed by a mobile or web application, or embedded in an off-line mobile application, for bruise classification.
The input of the classification model is a bruise photograph, and the output is a probability distribution that indicates the probability that the bruise belongs to each of the established classes.
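As a small illustration of how such an output can be turned into a predicted class, the sketch below assumes that the final layer of the network produces raw scores that are normalized with softmax. That is the usual choice for multi-class classifiers but is not stated in the text, and the class names used in any call are placeholders.

```python
import math


def softmax(scores):
    """Turn raw network outputs into a probability distribution."""
    highest = max(scores)
    # Subtracting the maximum avoids overflow without changing the result.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(probabilities, class_names):
    """Return the most probable class and its probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]
```

Reporting the probability together with the label lets an application flag low-confidence predictions instead of presenting every estimate as equally reliable.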

| VALIDATION
The validation process of this study consists of conducting numerical experiments. For this, four DCNN models were trained with the dataset detailed in the Dataset section. Then, the learning models resulting from each neural network were evaluated using the metrics indicated in the Metrics section.
The validation was applied to the Peruvian case, where the level of violence against women (see Figure 5) reaches 68.2%, with 31.7% for physical violence [29], a percentage slightly lower than the world average of 31.9% (see Figure 1), and where the majority of the population is mestizo, with a skin color that is neither black nor white. In addition, most research includes only white-skinned people [2,7,8,9,15,30,31,32], and one study indicates that the yellow coloration of a bruise is not visible in people with dark skin [6].

| Dataset
This study requires the construction of a dataset big enough to train a neural network and classify bruises according to their age. For this, a controlled experiment was carried out using a bruise generation method like the one used in [15] and [17]. Two paintball matches were held, 30 days apart. The game consists of firing paintballs with compressed air guns. In this scenario, players often get bruises, despite safety measures such as helmets, vests, protectors, and power limits on the weapons.
In total, 11 volunteers (one participated in both matches) of mixed skin color (four women and seven men), between 25 and 68 years old, took five daily photographs of their bruises following the data capture protocol. In this way, over a period of 60 days, bruise photographs were collected and a dataset was built with the characteristics detailed in Table 4, which includes the number of images used for training, validation, and testing of the model for each of the classes used by forensic doctors in Peru [33]. The dataset is available on request.
The total number of photographs estimated for the experiment was 5400; however, photographs of bruises on fingers were excluded, since they contain more than one bruise. In addition, despite the established protocol, some participants did not submit photographs daily, or their photographs presented shadows or blur, so a total of 2140 photographs were collected, of which 2021 contain a bruise and 119 show healthy skin. It was also observed that bruises on the two people with the darkest skin were visible until the fifth day, while on the two people with the lightest skin they were visible even until day 30.

| Implementation
The Python programming language and TensorFlow, Keras, pan-
In the case of cross-validation, the dataset was divided into 10 groups, which were used for training and validation. Beforehand, 10% of the photographs of each class were set aside for testing the models. In addition, the stochastic gradient descent algorithm was used to optimize the Inceptionv3, Resnet50, and MobileNet models, with a learning rate of 0.0001 and a batch size of 32.
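The division into 10 groups can be illustrated with a plain k-fold index generator. This sketch assumes the usual scheme in which each group serves once as the validation set while the other nine are used for training; the text does not specify this detail, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, validation_indices) for each of the k folds.

    Sample indices are assumed to have been shuffled beforehand.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size
```

Each sample appears in exactly one validation fold, so averaging the validation metric over the folds uses every training image for evaluation once.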
The models were trained for 100, 200, and 1000 epochs. If the validation accuracy stopped improving for three consecutive epochs, the training was stopped, and another variant was tested.
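The stopping rule can be expressed as a short function over a list of per-epoch validation accuracies. This is a sketch of the rule as described, with "stopped improving" interpreted as "did not exceed the best value seen so far"; that interpretation and the function name are assumptions.

```python
def epochs_until_stop(val_accuracies, patience=3):
    """Return the number of epochs run before early stopping triggers.

    Training stops once validation accuracy has failed to improve on its best
    value for `patience` consecutive epochs; otherwise all epochs are run.
    """
    best = float("-inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        if accuracy > best:
            best, stale = accuracy, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_accuracies)
```

In Keras, comparable behavior is available through the `EarlyStopping` callback with its `monitor` and `patience` arguments, although the text does not say whether that callback was used.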
In summary, 20 bruise dating models were trained using four variants of neural networks, with or without cross-validation and transfer learning, with different numbers of training epochs, and with or without data augmentation (see Table 5). The MnasNet model architecture and its parameters were determined by Google's AutoML Vision service, which performs an automated search for a neural network architecture optimized for mobile devices in terms of accuracy and latency [14]. The data augmentation for M20 was obtained by duplicating the data of the "Healthy skin" class.
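Duplicating the data of one class amounts to simple oversampling. A minimal sketch, assuming the images are held as a dictionary of file-path lists per class (a hypothetical layout, as is the function name):

```python
def duplicate_class(images_by_class, label, copies=2):
    """Return a new mapping in which the samples of `label` appear `copies` times.

    The input mapping is left unchanged; other classes are copied as they are.
    """
    augmented = {name: list(paths) for name, paths in images_by_class.items()}
    augmented[label] = augmented[label] * copies
    return augmented
```

Oversampling of this kind is normally applied to the training data only, after the test images have been set aside, so that no duplicated image appears in both sets.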

| Metrics
The following metrics were used to evaluate and test the learning models:
• Precision (PRE). Rate of instances classified correctly.
• Sensitivity (SEN). True-positive rate, that is, the rate of values classified as positive when they are positive. It measures how well instances within a class are identified.
• Specificity (SPE). True-negative rate, that is, the rate of values classified as negative when they are negative. It measures how well instances that do not belong to a class are identified.
In this work, the goal is to obtain a model with high precision and sensitivity, since the most important thing is to classify a bruise correctly according to its age. Achieving high specificity is not a priority in this case, but it would be convenient to achieve a balance between sensitivity and specificity. Table 6 shows the precision, in training and validation, obtained by the 20 models indicated in Table 5.
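For a multi-class problem, these metrics are commonly computed per class from the confusion matrix in a one-vs-rest fashion. The sketch below follows that convention and takes precision in its usual sense, TP / (TP + FP). Both choices are assumptions, since the text describes precision only as the rate of correctly classified instances and does not give formulas.

```python
def per_class_metrics(confusion, class_index):
    """Compute precision, sensitivity, and specificity for one class (one-vs-rest).

    `confusion[i][j]` is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp  # true class, predicted as another
    fp = sum(confusion[i][class_index] for i in range(n)) - tp  # other class, predicted as this one
    tn = sum(sum(row) for row in confusion) - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return precision, sensitivity, specificity
```

Averaging the per-class values over all classes gives a single figure per model, which is one way to obtain averages like those used to compare models.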

| Results
Models M1-M16 showed overfitting, as evidenced by a validation accuracy much lower than the training accuracy. On the other hand, the models with the greatest precision in validation are M17-M20, based on the MnasNet model (see Table 5). These models were subjected to a more detailed analysis, as shown in Table 7, to determine the best one for bruise dating. Table 7 shows that M19 has the highest average precision, sensitivity, and specificity, so it is the model chosen for bruise dating.
As can be seen in the Confusion Matrix (
Table notes: a The parameters were determined by Google's AutoML Vision service. b All classes except "Healthy skin." c All classes except "More than 17 d" and "Healthy skin."
A limitation of the current results is that they are based on images obtained in a controlled experiment and heterogeneous context, to guarantee the accuracy of the information, given the difficulty of obtaining bruise images from cases of violence. Future work includes extending the proposed model to cover aspects of physical violence such as the object used, the intensity, and the geographical location of the violent event.