Deep learning‐based skin care product recommendation: A focus on cosmetic ingredient analysis and facial skin conditions

Recommendations for cosmetics are gaining popularity, but they are not being made with consideration of the analysis of cosmetic ingredients, which customers consider important when selecting cosmetics.

for personalized recommendations, is widely utilized in the cosmetic industry. Such approaches enable users to receive personalized cosmetic recommendations through an analysis of their past behaviors and preferences.
On the other hand, a number of researchers have focused on the analysis of cosmetic components and the evaluation of user skin conditions to respond more effectively to user demands. Nakajima et al. 6 and Rubasri et al. 7 incorporated cosmetic ingredient information into their cosmetic recommendation systems. They applied document term matrix (DTM) and term frequency-inverse document frequency (TF-IDF) algorithms to find patterns of ingredients in numerous cosmetic products. Li et al., 8 in contrast, directed their attention towards AI skin analysis to enhance personalized recommendation systems. They utilized image processing algorithms and a deep learning model called YOLOv4 9 to identify skin problems and assess the condition of the user's skin. However, they did not present sufficient evaluations of their models, and they did not show how to use ingredients to identify product efficacy. Such approaches helped customers get a better understanding of both the cosmetics and their own skin conditions, but they had limitations in effectiveness and reliability.
In this study, we propose a comprehensive cosmetic recommendation system that analyzes both the ingredients in cosmetics and the facial skin status of each individual. We focused on ingredient analysis because ingredients are the actual components that constitute cosmetics, which allows us to address the effects resulting from the actual contents of a product beyond the marketing tag expressed in a brief phrase. Indeed, since the pandemic, there has been an increased interest in health and safety, leading users to scrutinize ingredients themselves before purchasing cosmetics. 10 This indicates that the ingredient-based approach can provide more precise and reliable recommendations to today's users. The integration with AI skin analysis completes a personalized recommendation system for each user. As far as the authors are aware, existing AI applications for cosmetic recommendations either rely solely on user reviews or do not perform skin analysis. While there are applications that capture facial images, it is not explicitly mentioned whether they analyze the ingredients of cosmetics.
The ingredients are listed by their proportions in the cosmetic product, [11][12][13] and we aim to find patterns in them. Our method treats the list of cosmetic ingredients as sequential data, which enables us to predict the probabilities of various potential effects of the cosmetics. To extract features from a sequentially arranged list of ingredients, we constructed a deep neural network encoder, the Transformer encoder, 14 and attached it to a multi-label classifier to obtain the probability of each efficacy.
For skin analysis, we have designed a deep neural network to estimate grades for six well-known skin concerns: pores, redness, acne, wrinkles, pigmentation, and dark circles, which are shown in Figure 1. We implemented deep learning networks based on the U-Net architecture, and we designed them to simultaneously segment similar concerns. In addition, we defined 12 skin types using user surveys, inspired by the approach of Leslie Baumann, 15 who identified 16 skin types. Figure 2 shows the Baumann skin types, and we referenced her approach to identifying the oiliness, dryness, and sensitivity of human skin. This paper is organized as follows. In Section 2, we describe the approaches and datasets used in this study. Section 3 presents the experiments and results. Evaluations and discussions about the approaches are provided in Section 4.

| MATERIALS AND METHODS
In this section, we describe how we configured our cosmetic dataset and the approaches used for the ingredient analyzer, skin analyzer, and recommendation module. We collected cosmetic data from several cosmetic websites, companies, and associations and refined them for use in our model. We selected cosmetic products that can be matched to our skin analyzer, which analyzes skin pores, redness, acne, wrinkles, pigmentation, and dark circles on human faces. In addition, we included products that fall into one of these categories: cleanser, skin/toner, serum, moisturizers/cream, and special care, to cover the most popular and common products. To ensure the quality and consistency of the data, we standardized the original data into a JSON format. Furthermore, we cross-checked the contents obtained from multiple sources, which reduced the total dataset from 23000 to 8000 products. This data preprocessing has increased the reliability of the data. Facial skin images were collected from LUMINI Kiosk V2® to train each model of the skin analyzer. The skin image analyzer and ingredient analyzer are constructed independently, and their results are combined in the recommendation module. The entire structure of the skin and ingredient analyzer is illustrated in Figure 6.

| Cosmetic dataset
There are numerous cosmetic ingredients and products in the world, estimated to exceed 20,000 and 200,000, respectively. Some cosmetic ingredients are extracted from nature, while others are formulated from chemical compounds. Glycerin, hyaluronic acid, retinol, and ceramides are well-known ingredients, and the efficacy of cosmetics is determined by the inclusion and combination of ingredients in each product. However, it is impossible to conduct experiments to discover the actual effects of all the ingredients and their combinations. In addition, the specific quantities of these ingredients in a product are kept confidential, as they constitute proprietary information for the cosmetic companies. This required us to seek alternative approaches, and we found clues in the ingredient notation rule. Cosmetic products list their components, and most countries have regulations for this notation. There is typically a rule about the order in which ingredients are notated on cosmetics, generally dictating that ingredients should be listed in descending order of their quantities. Figure 3 shows how ingredients are notated on a product.
Our cosmetic dataset consists of 6500 samples in the training set and 1500 samples in the validation set, all of which follow the ingredient notation rules stated above. Each data point corresponds to a cosmetic product and includes elements such as product ID, category, listed ingredients, target skin types, and marketing tags. Our cosmetic dataset contains 11 k distinct ingredients among the 20 k ingredients overall, and most of the ingredient lists are separated by commas (,), slashes (/), or middle dots (•). However, because these special characters also appear inside some cosmetic compound names, which complicates tokenization, we standardized all lists to use semicolons (;) as separators.
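The separator standardization described above can be illustrated with a minimal sketch. The function name `normalize_ingredients` and the splitting rule are assumptions made for illustration, not the authors' preprocessing code: a delimiter is treated as a separator only when it is followed by whitespace, so that compound names containing commas or slashes stay intact.

```python
import re

# Delimiters named in the text: comma, slash, and middle dot ("·" or "•").
# Assumption: a delimiter separates two ingredients only when whitespace follows it,
# so names such as "1,2-Hexanediol" or "Caprylic/Capric Triglyceride" are not split.
_DELIMITER = re.compile(r"\s*[,/·•]\s+")


def normalize_ingredients(raw: str) -> str:
    """Return the ingredient list re-joined with semicolons."""
    parts = [p.strip() for p in _DELIMITER.split(raw.strip())]
    return ";".join(p for p in parts if p)


print(normalize_ingredients("Water, Glycerin, 1,2-Hexanediol, Caprylic/Capric Triglyceride"))
print(normalize_ingredients("Water • Glycerin • Retinol"))
```

Real ingredient labels are noisier than this rule assumes, so a production pipeline would likely need a dictionary lookup in addition to the regular expression.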
We extracted target labels from the marketing tags on the products. These tags indicate the expected cosmetic effects on the consumer's skin; most of these effects are supported by clinical tests and regulated by organizations such as the Food and Drug Administration (FDA), the European Medicines Agency (EMA), or the Ministry of Food and Drug Safety to ensure their effectiveness.
Several products can have the same effects according to their marketing tags, but their ingredient lists may not contain the same components. We selected 18 cosmetic effects from the marketing messages, which resulted in the removal of some effects such as cooling and fragrance. Classes in our cosmetic dataset exhibit an overall imbalance in the distribution of each cosmetic effect, as shown in Figure 4.
To align with the skin analyzer and to compensate for the small size of the cosmetic dataset, we reduced the number of classes from 18 to 6. We directly mapped each marketing tag to the six skin problems; some tags apply to multiple skin problems, and some are ignored.
Tables 1 and 2 provide a comprehensive list of the cosmetic effects derived from the marketing tags and the skin analysis module. In addition, we applied a weighted binary cross-entropy loss because of the relatively large number of classes with imbalanced distributions; this loss can help overcome class imbalance problems. We assigned a different weight to each class according to the number of products it contains, which allows the loss function to focus more on the relatively small classes. To give larger weights to the small classes, we defined a weight inversely proportional to the number of samples in each class and used it in the weighted binary cross-entropy loss. The loss is expressed in terms of the total number of skin concerns, the total number of samples in each class, the ground truth and the predicted probability for each class of skin concern, and a user-defined hyperparameter.
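A minimal sketch of a class-weighted binary cross-entropy is given here for orientation. It is not the paper's exact formula: the normalization of the weights and the role of `beta` (used here as an exponent on the inverse class frequency) are assumptions, and the function names are hypothetical.

```python
import math


def class_weights(class_counts, beta=1.0):
    """Weights inversely proportional to class counts.

    Assumption: beta acts as an exponent that sharpens (beta > 1) or
    softens (beta < 1) the re-weighting; the paper's usage may differ.
    """
    raw = [(1.0 / n) ** beta for n in class_counts]
    total = sum(raw)
    # Normalize so the weights average to 1, keeping the loss scale comparable.
    return [len(raw) * r / total for r in raw]


def weighted_bce(y_true, y_prob, weights, eps=1e-7):
    """Mean weighted binary cross-entropy over all samples and classes."""
    loss, count = 0.0, 0
    for truth_row, prob_row in zip(y_true, y_prob):
        for y, p, w in zip(truth_row, prob_row, weights):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            loss -= w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return loss / count


# Two classes with 100 and 400 products: the smaller class gets the larger weight.
w = class_weights([100, 400])
print(w, weighted_bce([[1, 0]], [[0.3, 0.2]], w))
```

With these weights, a mistake on the rarer class contributes four times as much to the loss as the same mistake on the more frequent class.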

| Skin image analyzer
Facial skin analysis, primarily conducted using frontal face photographs, plays a role in evaluating the skin's health status and diagnosing skin conditions. The advent of deep learning has brought significant performance improvements in facial skin analysis as well.
In terms of segmentation tasks, Convolutional Neural Network (CNN) based approaches allow for accurate segmentation of target attributes when provided with the entire face image as input under various shooting conditions. 16,17 [20][21] In this study, we utilized U-Net-based segmentation networks in LUMINI Kiosk V2® that have been adapted and optimized for skin feature segmentation. 16,17,22 To concurrently analyze multiple skin components, a modified form of U-Net with reduced dimensions was employed. Specifically, the channel count of each block was reduced to 1/4, resulting in a model with roughly 1/16 of the original memory size. Each model was configured to perform segmentation on skin components with similar representation patterns at the same time. For instance, both wrinkles and pores were detected simultaneously, and the input images were cropped to a size of 768 × 640. MSE loss with an L2 regularizer was employed for training. The learning rate was set at 0.0001, and the optimizer used was Adam. Figure 5 shows an example of segmentation results obtained from the skin analyzer used in this study. Figure 5A shows the original frontal face image. The ratio of segmented pixels to ROI pixels takes values between 0 and 1. The ratio value is then converted to a score based on predetermined thresholds derived from the distribution obtained from thousands of result images.
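The ratio-driven scoring described above can be sketched as follows. The threshold values and the direction of the mapping (a larger ratio gives a higher grade index) are illustrative assumptions; the paper derives its thresholds from the distribution of thousands of result images and does not list them.

```python
from bisect import bisect_right


def segmentation_ratio(mask, roi):
    """Fraction of ROI pixels marked as the skin concern (a value in 0..1).

    Both arguments are 2D lists of 0/1 values with the same shape.
    """
    roi_pixels = sum(r for row in roi for r in row)
    if roi_pixels == 0:
        return 0.0
    hits = sum(m * r for mrow, rrow in zip(mask, roi) for m, r in zip(mrow, rrow))
    return hits / roi_pixels


def ratio_to_grade(ratio, thresholds):
    """Map a ratio to a 1-based grade index; thresholds must be sorted ascending."""
    return 1 + bisect_right(thresholds, ratio)


mask = [[1, 0], [1, 1]]  # segmented pixels
roi = [[1, 1], [1, 0]]   # region of interest
ratio = segmentation_ratio(mask, roi)
print(ratio, ratio_to_grade(ratio, thresholds=[0.1, 0.2, 0.4]))  # hypothetical thresholds
```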
We used this ratio-driven score to assess acne or freckles. The score network adopts a fully connected network structure with a bottleneck architecture to reduce model capacity. The score network is suitable for cases where the skin feature exists on all individuals and its morphology is an important factor for grading. For instance, enlarged and elongated pores with a larger radius should be scored as more severe, even if there are fewer of them, than numerous small and closely spaced pores.
We evaluated the performance of these models using the Pearson correlation coefficient, comparing the scores estimated by our AI models with score labels assigned by dermatologists. Table 3 shows the resulting performances of our skin analyzer models used in these experiments. Since a correlation exceeding 0.7 is considered strong, we can infer that our skin analyzer performs similarly to skin experts.

| Ingredient analyzer
The ingredient analyzer performs the task of finding suitable cosmetic effects by analyzing cosmetic ingredients. We regarded cosmetic ingredients as a kind of sequential data based on their notation rules. With the advent of deep neural networks such as Long Short-Term Memory (LSTM), 23 Gated Recurrent Units (GRUs), 24 and other variants of Recurrent Neural Networks (RNNs), complex patterns and dependencies of sequential data have been analyzed.
Recently, transformer models have gained popularity due to their potent ability to capture long-range dependencies, efficient parallelization, and superior understanding of sequence context. We implemented transformer encoder layers based on the BERT 27 model, as similar approaches have proven their effectiveness in analyzing sequential data across various domains. 28,29 We used six transformer encoder layers, which consist of two multihead attention blocks and feed-forward blocks with normalization.
The attention block enables the model to assign higher scores to relevant ingredients in the list, allowing it to analyze sequentially written ingredients. By attaching three fully connected layers and a sigmoid function after the transformer encoder, the model can perform multi-label classification by calculating the probability of each efficacy. We used the weighted binary cross-entropy loss to diminish the effects of class imbalance and trained the model with the Adam optimizer.
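The multi-label output stage can be illustrated with a small sketch. It covers only the final sigmoid step, not the transformer encoder or the fully connected layers; the logits are made-up numbers, and the 0.5 decision threshold is an assumption that the paper does not state.

```python
import math

CONCERNS = ["pores", "redness", "acne", "wrinkles", "pigmentation", "dark circles"]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_efficacy(logits, threshold=0.5):
    """Turn one logit per skin concern into independent probabilities and labels.

    Unlike softmax, the sigmoid outputs do not need to sum to 1, so a product
    can be predicted as effective for several concerns at once.
    """
    probs = {c: sigmoid(z) for c, z in zip(CONCERNS, logits)}
    labels = [c for c, p in probs.items() if p >= threshold]
    return probs, labels


probs, labels = predict_efficacy([-1.2, 0.3, -3.0, 2.1, 4.0, -2.5])  # illustrative logits
print(labels)
```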
Our transformer architecture is illustrated in Figure 6B.

| Matching strategy
The results of the skin analyzer and ingredient analyzer are presented as vectors, and the recommendation module utilizes these two vectors to identify suitable products. Scores from the skin analyzer range from a minimum of 1.1 to a maximum of 9.9. These are then converted to reversed normalized values to align with the results from the ingredient analyzer. As a result, both the skin score vector and the product efficacy probability vector contain values ranging from 0 to 1 across six skin concerns. In the score vector, a higher value denotes a more severe skin issue, while the probabilities in the ingredient analyzer's result vector indicate whether the product has positive effects on each specific skin problem. The recommendation module calculates the similarity of these two vectors and selects the top N products based on these similarity scores. We used cosine similarity to match the user and product vectors, which uses the angle between two vectors as a degree of similarity between them. The equation of cosine similarity is as follows:

$$\mathrm{similarity}(S, I) = \frac{S \cdot I}{\lVert S \rVert \, \lVert I \rVert}$$

where $S$ denotes the reversed normalized score vector from the skin analyzer, and $I$ denotes the probability vector of efficacy from the ingredient analyzer. Here, $\cdot$ represents the inner product and $\lVert \cdot \rVert$ denotes the magnitude of a vector.
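The matching step can be sketched in a few lines of Python. The linear form of the reversed normalization is an assumption (the text gives only the score range of 1.1 to 9.9 and states that higher normalized values denote more severe issues), and the product names and vectors are invented for illustration.

```python
import math

SCORE_MIN, SCORE_MAX = 1.1, 9.9  # score range stated in the text


def reversed_normalize(scores):
    """Assumed linear mapping: a low raw score (worse skin) maps to a value near 1."""
    return [(SCORE_MAX - s) / (SCORE_MAX - SCORE_MIN) for s in scores]


def cosine_similarity(s, i):
    dot = sum(a * b for a, b in zip(s, i))
    norm = math.sqrt(sum(a * a for a in s)) * math.sqrt(sum(b * b for b in i))
    return dot / norm if norm else 0.0


def top_n(user_scores, products, n=3):
    """Rank products (name -> efficacy probability vector) by similarity to the user."""
    s = reversed_normalize(user_scores)
    ranked = sorted(products, key=lambda name: cosine_similarity(s, products[name]), reverse=True)
    return ranked[:n]


# Order of concerns: pores, redness, acne, wrinkles, pigmentation, dark circles.
user = [8.5, 9.0, 2.0, 7.5, 3.0, 9.0]  # low acne and pigmentation scores
products = {
    "hypothetical-acne-serum": [0.1, 0.2, 0.9, 0.1, 0.7, 0.0],
    "hypothetical-wrinkle-cream": [0.2, 0.1, 0.0, 0.9, 0.2, 0.1],
}
print(top_n(user, products, n=1))
```

Because cosine similarity ignores vector length, a user with uniformly mild concerns and a user with uniformly severe concerns would receive the same ranking; only the relative pattern across the six concerns matters.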
In addition, we filtered products based on skin types. If a user's skin type is not compatible with a product's suitable skin type, that product is removed from the recommendation list. We defined 12 skin types determined through user surveys. The list of skin types is oily-resistant (OR), normal-resistant (NR), dry-resistant (DR), oily-sensitive (OS), normal-sensitive (NS), dry-sensitive (DS), oily-resistant-allergic (OR-A), normal-resistant-allergic (NR-A), dry-resistant-allergic (DR-A), oily-sensitive-allergic (OS-A), normal-sensitive-allergic (NS-A), and dry-sensitive-allergic (DS-A). Our focus is primarily on the oiliness, dryness, and sensitivity of the skin, as actual skin problems are assessed by the skin analyzer. Figure 7A illustrates how to match a user to a product based on skin and ingredient analysis, while Figure 7B shows the product recommendation process based on similarity scores and skin types.
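The skin-type filter can be sketched as below. The 12 type codes come from the text; the `suitable_types` field and the rule that compatibility means membership in that set are assumptions, since the paper does not specify how compatibility is decided.

```python
# The 12 survey-based types: oiliness (O/N/D) x sensitivity (R/S) x allergy flag.
SKIN_TYPES = [f"{o}{s}{a}" for a in ("", "-A") for s in "RS" for o in "OND"]


def filter_by_skin_type(products, user_type):
    """Keep products whose (hypothetical) 'suitable_types' field lists the user's type."""
    if user_type not in SKIN_TYPES:
        raise ValueError(f"unknown skin type: {user_type}")
    return [p for p in products if user_type in p["suitable_types"]]


catalog = [
    {"id": "toner-1", "suitable_types": {"OR", "OS", "OS-A"}},
    {"id": "cream-7", "suitable_types": {"DR", "DS"}},
]
print(SKIN_TYPES)
print([p["id"] for p in filter_by_skin_type(catalog, "OS")])
```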

| RESULTS
In this section, we present the performance of the ingredient analyzer and show the results of the proposed recommendation system on images from multiple face image datasets. Tested skin images were selected from LUMINI Kiosk V2® and two public face image databases.

| Ingredient analyzer
We have conducted tests on the ingredient analyzer. We evaluated whether the analyzer can predict the effects of products from their ingredient lists and compared the results to the actual efficacy of the cosmetic products.
Table 4 shows several examples of the test results. The ingredient list for product A exclusively includes ascorbic acids. This compound is widely recognized for its ability to brighten the skin and offer antiaging benefits, traits that are well reflected in the product's marketing tag. The ingredient analyzer predicts with 99.8% confidence that it is beneficial for pigmentation and with 20.7% confidence for wrinkles. However, as seen in the case of product C, the analyzer sometimes does not perform well enough to predict cosmetic efficacy.
To evaluate the overall performance of the ingredient analyzer and to assess its generalization ability, we conducted a four-fold cross-validation test on the model. Table 5 shows the performance of the cosmetic ingredient analyzer. We achieved approximately 90% accuracy, precision, recall, and F1 scores on the training set. On the other hand, the validation set has about 84% accuracy and about 58% precision, recall, and F1 scores. These results indicate that the ingredient analyzer estimates cosmetic effects well and could be utilized for predicting a product's efficacy solely based on its ingredient list. In addition, to verify the generalization capability, we tested our model on a new dataset, which contains 286 cosmetic products.
The model has about 81.7% accuracy, 44.6% precision, 47.3% recall, and 43.6% F1-score on the test dataset. We ran the same test on the pretrained BERT model fine-tuned on our dataset, which produced similar results on the same evaluation metrics.
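The gap between accuracy and the other metrics is typical of imbalanced labels, which a short sketch can make concrete. The counts below are invented for illustration and are not taken from the paper; the function computes the standard per-class definitions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for one class from 0/1 label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative rare class: 4 positives among 100 products.
y_true = [1] * 4 + [0] * 96
y_pred = [1, 0, 0, 0] + [1, 1] + [0] * 94  # 1 hit, 3 misses, 2 false alarms
print(binary_metrics(y_true, y_pred))
```

In this toy case the accuracy is 95% even though only one of four positives is found, because the many true negatives dominate the accuracy term.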

| Recommendation
We have validated the recommendation results across multiple user cases. These tests were conducted using frontal face images taken from LUMINI Kiosk V2® and from the public datasets FFHQ 30 and CelebA-HQ. 31 The images of the subjects captured with the LUMINI kiosk were collected with informed consent. In this evaluation, we excluded dark circles from consideration due to their limited representation in our cosmetic test dataset. Figure 8 presents recommendation results for LUMINI Kiosk V2®, and Figures 9 and 10 are results of the FFHQ and CelebA-HQ datasets, respectively. The images were selected to represent diverse facial skin conditions. For five cosmetic categories (cleansers, toners, moisturizers, serums, and special care), we selected the product with the highest user-product similarity score in each category, calculated using the user score vector and the ingredient probability vector.
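The per-category selection can be sketched as a single pass over the candidate products. The record fields (`id`, `category`) and the precomputed `similarity` lookup are hypothetical names used only for this illustration.

```python
def best_per_category(products, similarity):
    """Return {category: product id} with the highest similarity score per category.

    `products` is a list of dicts with hypothetical 'id' and 'category' fields;
    `similarity` maps product id -> user-product similarity score.
    """
    best = {}
    for p in products:
        score = similarity[p["id"]]
        current = best.get(p["category"])
        if current is None or score > current[1]:
            best[p["category"]] = (p["id"], score)
    return {category: pid for category, (pid, _) in best.items()}


candidates = [
    {"id": "c1", "category": "cleanser"},
    {"id": "c2", "category": "cleanser"},
    {"id": "s1", "category": "serum"},
]
scores = {"c1": 0.62, "c2": 0.88, "s1": 0.75}
print(best_per_category(candidates, scores))
```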
In Figure 8A, the recommendation system selected suitable products for a user who has low acne and pigmentation scores. The marketing tags support that these products are effective for acne and brightening. In Figure 8B, skin concerns about wrinkles and pores are observed, and the system recommends the cleanser for pore reduction and the moisturizer and serum for wrinkle management.
The performance of the recommendation system has also been evaluated on the FFHQ and CelebA-HQ datasets. Experiments on the public datasets had restrictions compared to the LUMINI Kiosk V2®. For public datasets, we did not apply filters by skin type since surveys were not possible. Figure 9 shows the results using the FFHQ dataset. In Figure 9A, the system suggests products that are beneficial for pores and wrinkles, for which relatively low skin scores are assigned. In Figure 9B, anti-inflammatory and depigmentation products were recommended, also related to low skin scores.
FIGURE 7 The skin score vector from the skin analyzer is transformed into a normalized score vector, and the score vector is matched to the efficacy probability vector of each product. (B) The recommendation process from user and product similarities.
Figure 10 represents the results of the CelebA-HQ dataset. Although CelebA-HQ does not provide high enough resolution to assess pores, it demonstrates sufficient analytical performance for other skin concerns. In Figure 10A, the recommendation system proposes appropriate cosmetics to take care of aging and inflammatory issues. This is because a low wrinkle score implies aging, and a low redness score indicates skin trouble. Similar results were also obtained in Figure 10B, because the scores of wrinkles and redness were also low.

| DISCUSSION
In our system, customers can predict cosmetic effects through ingredient analysis and assess their skin status using AI skin analysis. These domain-specific analyses provide more personalized and analytical information about users' skin and cosmetics, which helps in finding better-suited products than conventional recommendation approaches. While there are existing recommendation systems that have attempted to use AI for skin analysis and cosmetic ingredient analysis, our proposed method differs from them.
For example, Li et al. 8 applied a deep learning-based skin analysis using YOLOv4 to detect skin problems and determined product efficacy by directly correlating it with ingredient efficacy. Chaurasia et al. 32 identified product features by analyzing cosmetic ingredients using t-SNE and machine learning methods. However, these systems did not quantitatively analyze their performance using proper evaluation metrics, nor did they consider the order of ingredients in cosmetic analysis. In contrast, our system features specialized AI models for each skin problem and, instead of using the efficacy of ingredients as a direct measure of product efficacy, analyzes cosmetic data to estimate a product's efficacy based on its ingredients while considering the order of their compositions.
To enhance the generalization of our system, we are currently developing new AI models to analyze a broader range of skin problems, including skin firmness, radiance, cancer, and textures. We anticipate that this expanded research and development will mitigate potential biases in our skin analyzer, which has been focused on specific skin problems. In addition, we will continuously update our datasets to improve our ingredient analyzer. Because transformer models depend strongly on the amount of data, we believe that adding cosmetic datasets will improve the model performance and reduce potential biases. Consequently, we expect that cosmetic ingredient analysis and skin analysis by artificial intelligence will provide more objective results on the cosmetic effects and offer personalized cosmetics suitable for the user's skin status.
The current area for improvement in our system lies in the relatively low precision, recall, and F1 scores of the ingredient analyzer. This is largely due to the imbalanced class distribution in our cosmetic dataset, which prevents the model from predicting well for certain classes with few samples. For instance, as shown in Figure 4, our dataset contains more cosmetic products for anti-aging or depigmentation than for anti-acne or dark circles. To examine the impact of imbalanced data more closely, we conducted experiments to determine which specific minority class is most affected. Upon analyzing the performance of each skin category, we observed that while the accuracy for acne was 97.2%, the precision and recall fell below average, at 33.3% and 14.3%, respectively.

FIGURE 8
Recommendation results for LUMINI Kiosk V2® skin images. User skins are analyzed by the skin analyzer, and the products are recommended based on the similarities between the user score vector and the ingredient analysis probability vector. User (A) had skin problems with pigmentation and acne, and user (B) had skin problems with pores and wrinkles. The system recommends products for five categories, and recommended products have marketing tags that are similar to skin problems.

FIGURE 9
Recommendation results on the public FFHQ dataset. User (A) had relatively low skin scores on pores, acne, and wrinkles, and user (B) had skin problems on redness and pigmentation. The system recommended products that had efficacy for each user's skin problems.

FIGURE 10
Recommendation results on the public CelebA-HQ dataset. User (A) has skin problems with wrinkles and redness. User (B) has these problems as well. The products recommended primarily have anti-inflammatory and anti-aging properties, making them suitable for the care of wrinkles and redness.
There is a possibility that the recommendation system may overlook some of the efficacy or mistakenly judge that a product has an efficacy that it does not. Although several tests on various datasets, including FFHQ and CelebA-HQ, demonstrate the system's reliability, these uncertain cases could potentially lead to customer skepticism and adversely affect their perception of the system's reliability. Addressing this issue primarily involves expanding the dataset with a more diverse range of cosmetic samples, which we aim to continuously update and improve.
In addition, the ground truth in our cosmetic dataset is largely derived from marketing tags on the products. Although the ground truth was collected from sources after their efficacy and components were certified under the regulations of the Ministry of Food and Drug Safety of the Republic of Korea (MFDS) and the FDA, there might remain a minor influence of the manufacturers' marketing strategies. To achieve a more accurate and objective result, we plan to gather actual test reports of cosmetics directly from clinical testing centers in our future work. This approach will allow us to obtain precise enhancement rates for each skin concern.
To make our recommendations more practical, it may be beneficial to include considerations for individualized treatment, which incorporate factors such as age, gender, and hormonal changes.
Skin attributes such as wrinkles and pigmentation vary based on age and gender, and the frequency of acne occurrences could fluctuate with hormonal cycles. Such individual information can be collected through surveys incorporated into the system and utilized as evidence for treatments along with the analysis results. Additional surveys may be considered for users who have undergone nonsurgical facial rejuvenation. This is crucial because the skin is in a recovery phase after the procedure, making it potentially different from the usual condition of the skin. For example, after procedures that may cause surface wounds on the skin, such as TCA peeling or laser treatments, it is advisable to avoid alcohol-based products and instead use regenerating creams. By incorporating details about such nonsurgical procedures and their history into surveys, users can receive a safer and more personalized recommendation service.

| CONCLUSION
In this study, we have presented a new approach for analyzing cosmetic ingredients and a novel recommendation system combined with skin analysis. We have demonstrated that we can infer the effectiveness of skin care products through ingredient analysis using deep learning. Additionally, by integrating skin analysis, we have shown that users can receive personalized cosmetic recommendations tailored to their specific skin condition. The limitations of our model, which relies on the transformer, stem from the insufficient quantity of reliable data needed to fully harness the transformer's capabilities.
Furthermore, the model's performance and generalization are hindered by the restricted number of analyzed skin conditions in our skin analysis model. Our future work focuses on continually gathering validated databases from reputable institutions to enhance the completeness of the recommendation system proposed in this study.
We also plan to augment the system by incorporating additional skin analysis items, including elasticity and radiance.

FIGURE 1
Appearance of skin concerns.

FIGURE 2
Baumann skin types. Baumann identified 16 human skin types, defined by combinations of dry (D) or oily (O), resistant (R) or sensitive (S), pigmented (P) or nonpigmented (N), and tight (T) or wrinkled (W).

FIGURE 3
An example of an ingredient list on a cosmetic product. Ingredients are listed in order of content.

FIGURE 4
Class distribution of the cosmetic dataset.

of samples in each class and used a weighted binary cross entropy loss as follows:

Figure 5B displays the resulting images with each skin concern overlaid on different ROIs. To assess skin status, we used a method utilizing deep learning to segment the skin concern in the image and then assign a grade based on the segmented results, as shown in Figure 6A. To infer skin scores from segmentation results, we incorporated two approaches. One is the pixel count-based ratio scoring method, and the other is the score learning network based on expert evaluations. Ratio-driven score estimation calculates the ratio of the number of segmented pixels over the ROI, which is calculated as follows:

$$\mathrm{ratio} = \frac{\text{number of segmented pixels}}{\text{number of ROI pixels}}$$

The Pearson correlation coefficient compares the scores estimated by our AI models with score labels assigned by dermatologists, as in Equation (4). Two hundred different frontal face photos were selected for each skin component, and experts assigned scores ranging from one to five based on the severity of the skin component. The Pearson correlation coefficient is represented as follows:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{4}$$

where $r_{xy}$ is the Pearson correlation coefficient between samples $x$ and $y$. Here, $n$ is the sample size, $x_i$ and $y_i$ are the individual sample points indexed with $i$, and $\bar{x}$ and $\bar{y}$ represent the means of the samples $x$ and $y$, respectively.
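The Pearson correlation defined above translates directly into code. The score lists in the example are invented for illustration; in the paper's setting, `x` would hold the model's estimated scores and `y` the dermatologists' labels for the same photos.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


model_scores = [1.2, 2.1, 2.9, 4.2, 4.8]  # illustrative model outputs
expert_scores = [1, 2, 3, 4, 5]           # illustrative expert grades (one to five)
print(pearson_r(model_scores, expert_scores))
```

The function divides by zero when either sample is constant, which matches the mathematical definition being undefined in that case.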
TABLE 1
Cosmetic effects, which are classified into 18 classes. Classes are obtained from the marketing tags on the skincare products.
Eighteen cosmetic effects from marketing tags: appearance of wrinkles, lightening, anti-aging, sebum control, moisturizing, skin barrier, antioxidants, anti-inflammatory, antiacne, smoothing, firming, pore reducing, exfoliation, brightening, sunscreen, pH balance, dark circles, nourishing

TABLE 2
Cosmetic effects are classified into six classes. Classes are mapped from marketing tags to match the skin analyzer.
Six cosmetic effects from skin analyzer: pores, redness, acne, wrinkles, pigmentation, dark circles

cosmetic ingredients as a kind of sequential data based on their notation rules. There have been many studies on capturing patterns from sequential information, and most of them aim to capture not only the characteristics of each instance but also its correlations with other instances. Stochastic approaches based on the Markov process have been widely used since the early stages of such studies. Furthermore, with the advent of deep neural networks like Long

FIGURE 5
Skin image segmentation and heat maps for six skin concerns. (A) is an original skin image, and (B) shows heat maps of pores, redness, acne, wrinkles, pigmentation, and dark circles from the upper left position. The original facial skin image was taken from the FFHQ dataset.

FIGURE 6
The structure of the recommendation system (A) and ingredient analyzer (B). (A) includes a skin analyzer, an ingredient analyzer, and a recommendation module. (B) illustrates the ingredient analyzer model architecture.

| 2071 LEE et al.

potent ability to capture long-range dependencies, efficient parallelization, and superior understanding of sequence context. In this study, we utilized the transformer architecture to capture features from the ingredients and attached a multi-label classifier to effectively predict the effects of cosmetics on various skin concerns. To utilize the raw text of ingredients in a transformer model, the model needs to convert them into numerical vectors. Typically, methods such as Word2Vec 25 or GloVe, 26 which are pretrained on large corpora, are used for text vectorization. However, cosmetic ingredients often use unique words specific to their chemical composition, which are not commonly found in general documents. As a result, we created a tokenizer using a cosmetic ingredient dictionary. Each ingredient in the tokenizer is assigned a unique index to differentiate it from others, and the ingredient list of a product is converted into a list of these indices. To ensure consistency in the size of ingredient vectors, any additional spaces are padded with zeros. Additionally, to provide positional information to the transformer encoder, we added a positional embedding layer before feeding data into the model.
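The dictionary-based tokenizer and zero padding described above can be sketched as follows. The lowercasing, the truncation to `max_len`, and the extra index reserved for ingredients missing from the dictionary are assumptions added for the example; the paper states only that each ingredient receives a unique index and that unused positions are padded with zeros.

```python
PAD_INDEX = 0  # zero is reserved for padding, as described in the text


def build_vocab(ingredient_lists):
    """Assign each distinct ingredient a unique index starting at 1."""
    vocab = {}
    for ingredients in ingredient_lists:
        for name in ingredients:
            key = name.strip().lower()
            if key not in vocab:
                vocab[key] = len(vocab) + 1
    return vocab


def encode(ingredients, vocab, max_len):
    """Convert an ordered ingredient list into a fixed-length list of indices."""
    unknown = len(vocab) + 1  # assumed index for ingredients not in the dictionary
    ids = [vocab.get(name.strip().lower(), unknown) for name in ingredients[:max_len]]
    return ids + [PAD_INDEX] * (max_len - len(ids))


vocab = build_vocab([["Water", "Glycerin"], ["Water", "Retinol"]])
print(encode(["Retinol", "Water", "Squalane"], vocab, max_len=5))
```

The list order is preserved by `encode`, so the position of each index still reflects the descending-quantity order that the positional embedding is meant to capture.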
wrinkles. Product B, which contains caprylic/capric triglyceride, silica, polyisobutene, glycine soja (soybean) oil, tocopheryl acetate, retinol, and BHT, is marketed as beneficial for wrinkles and firmness. The ingredient analyzer predicts that this product is good for wrinkles with 87.3% confidence, for pigmentation with 73.7% confidence, and for redness with 73.2% confidence. These results demonstrate that the model can suitably predict a product's efficacy from a sequentially arranged ingredient list.

Figure 8 represents recommendation results for LUMINI Kiosk V2®, and Figures 9 and 10

3 and 14.3, respectively. This discrepancy can be attributed to the fact that out of the total 8759 ground truth labels, only 161 products were identified as acne, indicating a bias in the model's training. In contrast, for wrinkles (with 2648 products) and redness (with 3207 products), we obtained accuracy scores of 76.2 and 64.3, along with precision values of 60.6 and 61.2, and recall values of 67.4 and 50.4, respectively. This result might reduce user trust in the ingredient analysis. There is a possibility that the recommendation system

TABLE 4
Example results of ingredient analysis. Product A is marketed for brightening and wrinkles, Product B for wrinkles and firmness, and Product C for anti-aging. The ingredient analyzer predicts similar efficacy as the marketing tags.