Revolutionizing Infection Risk Scoring: An Ensemble “From Weak to Strong” Deduction Strategy and Enhanced Point‐of‐Care Testing Tools

The COVID‐19 pandemic exacerbates challenges faced by human immunodeficiency virus patients, who are at heightened risk for infection due to compromised immune systems. This study aims to develop a reliable, home‐based point‐of‐care testing (POCT) tool for early screening of acquired immunodeficiency syndrome (AIDS) coinfected with Talaromyces marneffei infection. Employing a “From weak to strong” deduction strategy for feature selection, data from 464 AIDS patients across four cohorts between February 5th, 2014, and January 8th, 2022, are analyzed. The top three features consistently observed are D‐dimer, cluster of differentiation 4+, and aspartate transaminase. Based on these features, the simplest risk‐scoring model is constructed, with the area under the receiver operating characteristic curve values of 0.91, 0.80, and 0.69 in the hold‐out cohort, external cohort 1, and external cohort 2, respectively. This “From weak to strong” deduction strategy identifies advantageous clinical features, enabling the development of simplified clinical risk scores with multiple biomarkers. To facilitate practical implementation, enhanced POCT tools are introduced, specifically a strip with segmented testing capabilities that demonstrates sensitivity and strong correlation with clinical scoring models. Furthermore, an open‐access website and a free Android mobile app are created to support community utilization. The findings underscore the effectiveness of the innovative deduction strategy and enhanced test strips, which enable bedside measurements without laboratory dependency.


Introduction
The COVID-19 pandemic has imposed unprecedented impediments on clinical diagnostics and therapeutic paradigms, attributable to its enhanced transmissibility and diverse symptomatology. [1] This scenario intensifies the quandary faced by patients with common infectious diseases competing for healthcare resources and also amplifies the risk of nosocomial infections. [2][5] Consequently, the construction of reliable, portable point-of-care testing (POCT) devices for home use is paramount in mitigating infection risks within outpatient care settings.
However, the progress of POCT technologies relies on the presence of specific antigen-antibody markers, as demonstrated by COVID-19 and syphilis POCT assays. [6,7] Regrettably, not all infectious diseases exhibit easily identifiable, unambiguous antigen-antibody markers. In HIV-related infections caused by Talaromyces marneffei, the quantity of relevant antigens in the bloodstream of HIV patients is strikingly low due to weakened immune function, thereby hindering the production of swift POCT screening tools. [8] Moreover, Talaromyces marneffei, a dimorphic fungus, presents nonspecific clinical signs, often manifesting as rashes, respiratory symptoms, fever, anemia, and other systemic signals. 10] A multitude of severe illnesses face similar obstacles, such as nonspecific clinical symptoms and challenges in obtaining diagnostic gold standards, as noted in sepsis, which typically employs the SOFA and APACHE IV scoring systems for screening and diagnosis. [10,11] As a result, the creation of a disease screening method that is not dependent on specific antigen-antibody markers and is applicable to the early detection of critical illnesses carries significant clinical importance and the possibility of broad application.
The core objective of devising an early screening tool for diseases that avoids reliance on specific antigen-antibody markers is to identify a set of biomarkers that demonstrate high sensitivity and disease specificity. 14][15][16] Nonetheless, a large proportion of studies lack sound guidelines for choosing the most suitable analytical method for their datasets, with selection outcomes often dependent on the performance of different methods on a single dataset. This introduces a degree of randomness into the final biomarker selection. [17,18] To lessen the irregularity in such research findings, our study proposes a more equitable feature selection process, subsequently quantifying the selected features for integration into POCT assays.
Present clinical POCT strips primarily use colloidal gold assays, concentrating on specific antigen-antibody interactions, and are therefore designed for qualitative responses. [19] However, our ambition is to construct a disease-screening approach that overcomes dependence on specific antigen-antibody interactions, incorporating quantitative assessments of clinical scores and disease-related indices. Basic qualitative classification outcomes are insufficient to meet our needs. As such, we have devised a series of colloidal gold assay strips capable of semiquantitative measurements, allowing their smooth integration with clinical scoring systems. [17]
Using a multicenter cohort of HIV-infected patients with Talaromyces marneffei infection, our research proposes a selection strategy for biomarkers of diseases that lack specific antigen-antibody markers. By implementing a "from weak to strong" feature selection approach, we have mitigated biases arising from different selection methods and applied the resulting selections to the fabrication of semiquantitative colloidal gold assay strips. This integration enables the combination of POCT detection and clinical scoring. Additionally, we have established a dedicated website and a mobile app through which AIDS patients and healthcare professionals can access early screening for Talaromyces marneffei infection at no charge.
Patient selection and distribution among the cohorts were designed to optimize model performance and provide unbiased validation, thereby strengthening the credibility of the study's conclusions. Heatmaps of the data distributions for the four cohorts are shown in Figure S2a-d, Supporting Information, while the data tables and intergroup statistical tests can be found in Table S1, Supporting Information. The correlations among the features within each cohort are illustrated in Figure S2e, Supporting Information. The overall workflow and pipeline of the experiments are illustrated in Figure S3, Supporting Information.

Classical Standardized Logistic Regression
After implementing Z-standardization, we conducted univariate regression analyses on all 32 features within the logistic regression model to pinpoint variables exhibiting significant p-values. Subsequently, we carried out multivariate regression analyses on these variables utilizing a forward selection method. The performance at each stage of the stepwise forward logistic regression is shown in Figure 1a. Results from both univariate and multivariate regression analyses can be found in Table S2, Supporting Information.
A ranking of features, based on the optimal model's standardized logistic regression coefficients (β), is shown in Figure 1b. The receiver operating characteristic (ROC) curves for the four cohorts, derived from the optimal standardized logistic regression model, are displayed in Figure 1c.

Shapley Additive Explanations and Permutation Importance of Machine Learning Models
The 12 machine learning models underwent a thorough process of 50 iterations of grid hyperparameter tuning. Based on the optimal parameters derived from this tuning, the models were subsequently trained on the primary cohort. The area under the curve (AUC) results for the 50 bootstrap resampling iterations are detailed in the supplementary material, specifically Table S3, Supporting Information, along with the optimal hyperparameter combinations for the models in Table S4, Supporting Information. The ROC curve outcomes for the three test cohorts using the 12 machine learning models are provided in Figure S4, Supporting Information.
Utilizing the 12 optimally tuned machine learning models, we assessed and ranked the relative importance of all 32 predictor variables in deductions 2 and 3. This evaluation employed the SHapley Additive exPlanations (SHAP) and permutation importance (PI) methodologies. The specific results for SHAP and PI can be inspected in Figures 2 and 3. Within each figure, subplot a presents the feature importance results for the 32 features across the 12 machine learning models under the SHAP and PI frameworks; subplot b showcases the structural diagrams for SHAP and PI; subplot c illustrates the fluctuation of feature importance relative to the overall average level after adjusting the feature importance results of the 12 machine learning models under the SHAP and PI frameworks by external cohort 1; and subplot d portrays the variations in the importance ranking of the final three features among the 32 features across the 12 machine learning models under both frameworks.
We executed a comprehensive evaluation of the three independent deductions, revealing that D-dimer, CD4+, and rash consistently ranked within the top four across the three deductions, thereby exhibiting high importance regardless of the data fitting method employed. For instance, D-dimer accounted for 28.3%, 23.8%, and 24.5% of the total proportion of the top seven features in standardized logistic regression, SHAP, and PI, respectively, as depicted in Figure 4d-f. D-dimer, CD4+, and rash together constituted 68.8%, 28.8%, and 27.7% of the overall feature importance in standardized logistic regression, SHAP, and PI, respectively, as illustrated in Figure 4g.
Consequently, we selected these three features to streamline our model. We refitted logistic regression using these three features in the development cohort and tested it on the validation, external cohort 1, and external cohort 2 datasets. The resulting AUC values were 0.883, 0.822, and 0.728, indicating that the discriminative performance of this streamlined model was not markedly inferior to that of the 32-feature model. The specific results can be examined in Figure 4h. Therefore, developing the most streamlined scoring system employing these three features represents a robust approach.

The Establishment of the Simplest Scoring System and the Evaluation of System Performance
To devise a robust risk scoring system founded on the three selected features, we used the beta coefficients of the logistic regression model. [22][23] Our risk scoring system showed good discrimination across datasets, with AUC values of 0.91, 0.80, and 0.69 in the hold-out cohort, external cohort 1, and external cohort 2, respectively, as shown in Figure 5h.
Furthermore, we examined all threshold points to ascertain the model's precision, sensitivity, and specificity. These findings are shown in Figure 5a-e, while the data are accessible in Table S5, Supporting Information. To validate our scoring system's distinctiveness within the data, we implemented unsupervised clustering through the K-means algorithm on the entire cohort of 464 patients, opting for the optimal cluster number k = 2 as recommended by the elbow plot. Our analysis revealed statistically significant differences between the scores of the two automatically derived clusters. Detailed outcomes are shown in subplots f and g of Figure 5, and the elbow plot results are available in Figure S5, Supporting Information.
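The clustering step can be illustrated with a minimal sketch of Lloyd's K-means algorithm for k = 2, written with the Python standard library only. The function names, the random initialization, and the two-dimensional toy points are assumptions made for illustration; the features actually supplied to K-means and the software used for it are not stated here.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k=2, seed=1, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]  # assumed initialization
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical, well-separated toy points (not patient data).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_centroids = k_means(toy_points, k=2)
```

In a setting like the one described, the risk scores of the patients in each resulting cluster would then be compared with a two-group test.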

The Reformation of POCT Test Strips
Our POCT test strip reflects concentrations of D-dimer and CD4+ through the number of red T-lines displayed on each strip, while the presence or absence of a rash can be ascertained via visual examination. This POCT test strip enables clinicians and home-based patients to directly obtain a risk score for Talaromyces marneffei infection. Patients with scores of 7 or above are deemed high-risk candidates for Talaromyces marneffei infection. This design supports prompt bedside and community-based testing. The application protocol of the test suite, along with the completed prototype experiments, is depicted in Figure 6.
Figure 6a shows the workflow of the POCT test strip's clinical application, Figure 6b shows the colorimetric scoring interpretation method for the test strip, Figure 6c shows the fundamental principle of the POCT test strip, and Figure 6d,e shows the intermediate procedures and outcomes of our test strip fabrication. The specific chemical manufacturing process can be found in the supplementary material.

The Establishment of a Free Internet Platform
To aid clinicians and patients in making well-informed diagnostic assessments, a free open platform has been constructed, which can be accessed at www.aoidsdiagnosis.com, as shown in Figure 7b. This website enables both physicians and patients to input relevant clinical data, from which it returns a risk prediction. Concurrently, we have developed a mobile application for the Android platform, ensuring that users can access the system irrespective of location, as demonstrated in Figure 7c. Figure 7a outlines the development process of the website, application, and accompanying software.

Discussion
Dimorphic fungi are distinguished by their capacity to undergo morphological transitions in response to environmental stimuli, which facilitates adaptation throughout their life cycles. [24] They transform from filamentous structures to yeast-like forms within host organisms, thus initiating pathogenic infections. The complex morphological transformations exhibited by fungi such as Histoplasma, Talaromyces marneffei, and Sporothrix pose substantial challenges to the host immune system. This complexity not only complicates the diagnosis and treatment of infections but also significantly hinders accurate and efficient therapeutic intervention. [24] Talaromyces marneffei epitomizes this group of fungi, demonstrating intricate morphological changes, immunological hurdles, diverse infection sites, and a broad range of infection severity, leading to a myriad of atypical clinical manifestations that obstruct precise diagnosis and effective therapy. [8]
In the present investigation, we combined conventional statistical methods, machine learning techniques, biomarker screening, clinical scoring, and POCT approaches into a complete development process for an early screening tool. This process was specifically designed for AIDS patients during the COVID-19 pandemic, with the goal of facilitating community and home-based screening.
In clinical settings, when procuring gold standard pathogen cultures is unfeasible, the identification of pertinent biomarkers for infection screening becomes paramount. However, differentiating high-diagnostic-value features from a multitude of complex and variable characteristics is a significant challenge. Predominant feature selection methods include logistic regression and machine learning. [25,26] Logistic regression, given its superior interpretability, is frequently utilized in the creation of clinical diagnostic models. Nonetheless, its linear fitting model might overlook nonlinear and fundamental latent features. [27] In contrast, machine learning provides flexibility and adaptability in feature selection, exploring optimal fits through a diverse array of algorithms. [25] However, during the development of machine learning models, feature weights for a particular disease often exhibit variability depending on the algorithms utilized. Moreover, different feature weight calculation methods within a single algorithm can lead to divergent rankings of feature importance. This inconsistency necessitates attention in current research, as the lack of standardization and distinct outcomes arising from varying algorithms and feature importance ranking methods compromise the reliability of these models.
To overcome these challenges, we propose an integrative strategy that combines logistic regression with multiple machine learning models, allowing for the ranking of feature importance from multiple perspectives. Our method analyzes feature importance through three separate aspects: standardized logistic regression coefficients, SHapley Additive exPlanations (SHAP), and permutation importance (PI). This "from weak to strong" inductive process ultimately produces highly consistent results.
In logistic regression, the regression coefficients indicate the extent to which a feature influences the dependent variable. However, direct comparisons between the regression coefficients of different features are confounded by unit effects. [28,29] To mitigate this issue, we utilize standardized regression coefficients, derived from standardizing both the features and the dependent variable. This standardization process eliminates disparities in magnitude and scale, thereby facilitating the comparison of weight magnitudes across variables. [26]
Furthermore, specific machine learning models typically possess their own feature weight analysis methods, such as the coefficients of Support Vector Machine (SVM) class models and the feature importance of tree models like Random Forest. Nonetheless, these methods are primarily designed for internal model analysis and prove inadequate for comparisons between distinct algorithms. Directly comparing different algorithms using disparate weight analysis methods risks introducing biases into the feature importance analysis. [20,21,29] To rectify this issue, we employ two model-external feature evaluation techniques: SHAP and permutation importance. The SHAP framework serves as an interpretive tool for machine learning model predictions, ascertaining the contributions of individual input features to prediction outcomes based on the Shapley value concept from cooperative game theory and providing an inclusive explanation of the model's decision-making process. [29,30] Permutation importance, a model-agnostic method, calculates feature importance by assessing the extent to which performance scores decrease when features are randomly permuted. [31] By implementing these two independent feature evaluation methods, we attain a unified perspective for evaluating feature importance across a multitude of models.
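To make the Shapley value concept concrete, the sketch below computes exact Shapley values by enumerating every feature subset for a toy linear model. It is an illustration of the underlying game-theoretic formula only: the weights, instance, and baseline are hypothetical, and the SHAP software itself relies on model-specific approximations rather than on this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a value function defined on sets of feature indices."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical linear model and inputs (illustrative values only).
weights = [2.0, -1.0, 0.5]
instance = [3.0, 1.0, 4.0]
baseline = [1.0, 1.0, 2.0]  # e.g. feature means in a reference population

def model_output(present):
    """Prediction with 'present' features at instance values, others at baseline."""
    return sum(w * (instance[j] if j in present else baseline[j])
               for j, w in enumerate(weights))

phi = shapley_values(model_output, len(weights))
```

For a linear model each value reduces to the weight times the feature's deviation from baseline, and the values sum to the difference between the prediction for the instance and the prediction for the baseline.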
The combination of classical statistics, machine learning algorithms, and the "from weak to strong" inductive process resulted in findings that highlight D-dimer, CD4+ T-cell count, and rash as the most significant features in both development and validation sets. These three features received weight contributions of 68.8%, 28.8%, and 27.7% from standardized, PI, and SHAP evaluation methods, respectively. Ultimately, the simplest scoring system attained 0.91, 0.80, and 0.69 AUROC in the hold-out cohort, external cohort 1, and external cohort 2, thus corroborating the reliability and stability of our feature selection approach.
CD4 cells perform a pivotal role within the human immune system. Gradual or irregular decreases in CD4 cells among HIV-infected individuals denote serious immune system impairment. [21] When CD4 cell counts fall below 200 cells mm⁻³, numerous opportunistic infections or tumors can arise, particularly in HIV-infected patients with CD4 cell counts below 100 cells mm⁻³, where T. marneffei activation becomes highly probable. [20] In addition, T. marneffei frequently compromises blood vessels, creating fungal emboli and disseminating throughout the body. D-dimer, a product of cross-linked fibrin clot dissolution by plasmin, primarily reflects fibrinolytic function. It can assist in predicting thrombosis, infection, tissue necrosis, and AIDS-related complications, and to a certain degree, disease progression and mortality rates. [28,32] Simultaneously, T. marneffei infection can provoke inflammatory responses in the skin and soft tissues, resulting in the development of rashes. Rashes often present as erythema, papules, blisters, or ulcers and may be accompanied by itching or pain. In cases of AIDS coinfected with T. marneffei, the presence of a rash may suggest localized infection dissemination and deep tissue involvement. Therefore, monitoring rash occurrence is vital for screening AIDS patients coinfected with T. marneffei. [21]
Based on this feature consensus, and to circumvent complex model computations, the study transforms the features into a clinical scoring system that healthcare professionals can calculate directly, promoting the dissemination of early screening indicators. However, as the serum markers incorporated in this model necessitate laboratory testing, the required facilities and testing times preclude rapid home testing for HIV patients. By determining D-dimer and CD4 cell concentrations, we convert existing coronavirus POCT test strips from qualitative to semiquantitative testing. A scoring model directly related to semiquantitative testing estimates the risk of T. marneffei infection in HIV patients based on test outcomes. The score displayed on the POCT test strip can indicate whether an individual is part of a high-risk group for T. marneffei infection. This semiquantitative POCT test strip, in conjunction with clinical scoring incorporating serological markers (e.g., Alvarado score, International Prognostic Index (IPI), Systemic Inflammatory Response Syndrome (SIRS) score), enables swift disease risk assessment without dependence on laboratory testing. This development holds the potential for extension to other studies.
Despite the promising results of this study, several limitations must be acknowledged. The model, founded on a retrospective cohort, specifically targets HIV-infected patients in China, potentially resulting in variations in disease prevalence and clinical characteristics in other countries. Further research is required to establish the generalizability of our findings to diverse populations and to validate the model's performance in a prospective setting.
In conclusion, our research presents a novel risk assessment system, developed through a feature selection strategy termed "from weak to strong" deduction. The performance and generalizability of our model were evaluated using multicenter cohorts, revealing promising outcomes for the early diagnosis of opportunistic Talaromyces marneffei infections in AIDS patients. By incorporating logistic regression and machine learning models, our methodology streamlines biomarker screening "from weak to strong" and facilitates the development of a composite biomarker scoring system. Finally, to foster community outreach, we integrated the clinical scoring system into semiquantitative test strips, which have potential applications across a range of clinical assessment domains. Such test strips can be produced for any score composed of segmented serological biomarkers.

Experimental Section
Ethical Principles and Group Access Criteria: The current investigation adheres to the stipulations of the Declaration of Helsinki, obtaining ethical committee consent from Wenzhou Central Hospital (L2021-03-082), Hangzhou Xixi Hospital (2020 Ethics Approval No. 34), and the First Affiliated Hospital of Zhejiang University School of Medicine (2021-II T-599). All participating investigators signed agreements to maintain the confidentiality of medical information, thereby ensuring patient privacy. Personal data were anonymized through established deidentification protocols. The study's flowchart, detailed in the supplementary materials, demarcates the inclusion and exclusion criteria for participant selection and provides a comprehensive guide for the research process.
Diagnostic Criteria and Clinical Data Acquisition: Diagnoses of AIDS were in strict accordance with the 2021 edition of the World Health Organization (WHO) AIDS Diagnosis and Treatment Guidelines. Classifications and diagnoses of HIV opportunistic infections were performed following the recommendations of the Centers for Disease Control and Prevention (CDC). Patients in each cohort were stratified based on the presence or absence of a confirmed Talaromyces marneffei infection. The experimental group consisted of individuals with positive Talaromyces marneffei culture results. The standards for data management are outlined in the supplementary materials. Clinical data were systematically collected from the inpatient diagnosis and treatment system databases of each participating medical center.
A standard numerical form was designed to consolidate data regarding patient characteristics, including demographic factors (age and sex), clinical manifestations (fever, cough, rash, gastrointestinal hemorrhage), laboratory parameters, and radiological features. The data, collected from each medical center, were systematically entered into the standard form.
Statistical Analysis and Standardized Logistic Regression Coefficients: Data processing was conducted using Microsoft Excel 2019. To preserve data integrity, missing values for continuous variables in both positive and negative patients in the development cohort were filled using imputation via the series mean method in Statistical Package for the Social Sciences (SPSS) (version 24.0, IBM). The compositional differences of variables between the experimental and control groups were investigated using the Mann-Whitney U test for continuous variables and the Pearson chi-squared test for categorical variables. Univariate and multivariate logistic regression analyses were executed in SPSS, utilizing forward selection for feature selection and the likelihood ratio test for model fit assessment. Two-tailed p-values were computed, with a statistical significance threshold of 0.05. Z-standardization was implemented to normalize discrepancies in dimensions and scales, thereby facilitating the comparison of diverse variables. The standardized regression coefficients, or beta coefficients, were generated by the logistic regression equation.
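The two preprocessing steps named above, series mean imputation and Z-standardization, can be sketched in a few lines of standard-library Python. The analysis itself was carried out in SPSS; the function names, the use of None for missing entries, the population standard deviation, and the example readings are assumptions made for illustration.

```python
from statistics import mean, pstdev

def impute_series_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def z_standardize(values):
    """Centre to mean 0 and scale to unit standard deviation.

    Uses the population standard deviation; this choice is an assumption.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("a constant feature cannot be standardized")
    return [(v - mu) / sigma for v in values]

# Hypothetical D-dimer readings with one missing value (not patient data).
d_dimer = [0.4, 1.2, None, 2.0]
filled = impute_series_mean(d_dimer)
z_scores = z_standardize(filled)
```

After this transformation every feature is unitless, which is what allows the fitted logistic regression coefficients to be compared in magnitude across features.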
Machine Learning Models and Feature Importance Ranking Methodologies: To address potential bias from dataset-specific characteristics during feature selection with a single machine learning algorithm, 12 representative classifiers were integrated. Additionally, two external evaluation methods for assessing feature importance were adopted, thus providing a balanced evaluation of different machine learning selection outcomes.
The 12 classifiers comprised AdaBoost Classifier, Gradient Boosting Classifier, Random Forest Classifier, Decision Tree Classifier, Extra Tree Classifier, Stochastic Gradient Descent Classifier (SGDC), Passive Aggressive Classifier (PAC), Perceptron, Support Vector Classifier (SVC), Linear Support Vector Classifier (Linear-SVC), Bernoulli Naive Bayes Classifier, and Ridge Classifier. All algorithms were implemented using the Scikit-learn package (version 1.2.0) in Python 3.9.13. A five-fold cross-validation technique was applied to assess the model's performance. This cross-validation procedure was conducted 50 times to ensure robust estimates of model performance, with the random seed set to 1. A grid search method was employed to determine the optimal hyperparameter combination for the model. The area under the receiver operating characteristic curve (AUROC) of the cross-validated model served as the objective for hyperparameter optimization.
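As an illustration of the tuning objective, the following standard-library sketch computes the AUROC from its rank-based definition and builds class-balanced folds for one round of five-fold cross-validation. It is not the Scikit-learn code used in the study; the function names, the fold assignment scheme, and the toy labels and scores are assumptions.

```python
import random

def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Assign sample indices to k folds while preserving the class balance."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Hypothetical labels and model scores (not study data).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.1]
toy_auc = auroc(toy_labels, toy_scores)

cv_labels = [1] * 5 + [0] * 10
cv_folds = stratified_folds(cv_labels, k=5, rng=random.Random(1))
```

Repeating the fold assignment with fresh shuffles and averaging the held-out AUROC over all repeats gives the kind of repeated cross-validation estimate described above.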
Two external feature importance ranking methods were employed: Shapley additive explanations (SHAP) and permutation importance (PI). SHAP, a method grounded in cooperative game theory, provided a measure of a feature's relative influence over the entire prediction process, represented by the mean SHAP value. Conversely, PI measured the contribution of a feature to the model prediction error by randomly permuting one feature while maintaining the order of all others.
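The permutation procedure can be sketched as follows, assuming accuracy as the performance score and a hand-written toy classifier in place of a trained model. Both choices, along with the function names, the number of repeats, and the toy data, are illustrative assumptions rather than the study's configuration.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=1):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical classifier that depends on feature 0 only."""
    return 1 if row[0] > 0.5 else 0

# Hypothetical data: feature 0 determines the label, feature 1 is noise.
X = [[0.1, 5.0], [0.9, 1.0], [0.2, 3.0], [0.8, 2.0], [0.7, 4.0], [0.3, 6.0]]
y = [0, 1, 0, 1, 1, 0]
pi_scores = permutation_importance(toy_model, X, y)
```

Because the toy classifier ignores feature 1, shuffling that column leaves every prediction unchanged and its importance is exactly zero, whereas shuffling feature 0 lowers accuracy.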
To address the issue of differing scales across machine learning models, MinMaxScaler was applied to the SHAP and PI outputs of all models, constraining the range for each feature between 0 and 100. Scores for each feature across the models were subsequently computed and accumulated to derive an overall ranking for each feature. However, all information sources in this process originated solely from the development cohort and did not improve the model's performance on other cohort datasets. Therefore, a calibration method employing the AUROC performance of an independent external validation dataset (Xixi Hospital) was proposed. The original feature scores were multiplied by the AUROC index of the external dataset to derive the adjusted scores. This method uses the external dataset's AUROC to calibrate the model weights derived from the development cohort, thereby enhancing the generalizability of the final model.
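One possible reading of this scaling and calibration step is sketched below. It assumes that importances are min-max scaled to 0-100 within each model, that each model's scaled scores are multiplied by that model's AUROC on the external cohort, and that the weighted scores are then summed per feature. The function names, model names, importance values, and AUROC values are all hypothetical.

```python
def min_max_0_100(values):
    """Linearly rescale a list of numbers to the range 0 to 100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]

def aggregate_importance(importance_by_model, external_auroc_by_model):
    """Sum AUROC-weighted, min-max scaled importances per feature across models."""
    totals = {}
    for model_name, importance in importance_by_model.items():
        names = list(importance)
        scaled = min_max_0_100([importance[name] for name in names])
        weight = external_auroc_by_model[model_name]  # assumed per-model weight
        for name, score in zip(names, scaled):
            totals[name] = totals.get(name, 0.0) + weight * score
    # Highest accumulated score first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))

# Hypothetical raw importances on different scales (not study results).
raw_importance = {
    "model_a": {"D-dimer": 3.0, "CD4+": 2.0, "age": 1.0},
    "model_b": {"D-dimer": 4.0, "CD4+": 5.0, "age": 1.0},
}
external_auroc = {"model_a": 0.8, "model_b": 0.6}
ranked = aggregate_importance(raw_importance, external_auroc)
```

Under these assumptions a model that transfers poorly to the external cohort contributes less to the final ranking than one that transfers well.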
From Weak to Strong Deduction and Feature Selection: To circumvent limitations and biases associated with feature selection via a single data-fitting method, a "From weak to strong" deduction-based feature selection strategy was devised, ensuring the stability of the final feature selection results across different fitting methods and data distributions. This strategy integrated feature selection outcomes from three independent derivation methods to obtain highly reliable and generalizable representative features. Initially, standardized regression coefficients were employed to rank feature importance within the logistic models. Subsequently, PI was utilized to rank feature importance within the machine learning models. Lastly, SHAP was employed to rank feature importance within the machine learning models. The ranking outcomes from these three methods were analyzed, deriving higher levels of evidence and informing feature selection.
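The integration of the three rankings can be sketched as a simple consensus rule: keep the features that appear in the top k of every ranking and order them by mean rank. This rule, the cutoff k, the function name, and the example orderings are assumptions for illustration; the exact rule used to combine the three deductions is not specified here.

```python
def consensus_features(rankings, k):
    """Features in the top k of every ranking, ordered by mean rank (then name)."""
    shared = set.intersection(*(set(order[:k]) for order in rankings.values()))

    def mean_rank(feature):
        return sum(order.index(feature) + 1
                   for order in rankings.values()) / len(rankings)

    return sorted(shared, key=lambda feature: (mean_rank(feature), feature))

# Hypothetical orderings, most important feature first (not study results).
example_rankings = {
    "standardized_lr": ["D-dimer", "CD4+", "rash", "AST", "age"],
    "shap": ["CD4+", "D-dimer", "AST", "rash", "age"],
    "pi": ["D-dimer", "rash", "CD4+", "age", "AST"],
}
stable = consensus_features(example_rankings, k=4)
```

A feature that ranks highly under only one method is excluded by the intersection, which is the sense in which agreement across methods supplies stronger evidence than any single ranking.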
Simplified Model Construction: A streamlined clinical scoring model was devised based on the stable features identified through the selection process. To generate scores, each feature's beta coefficient value was divided by the smallest constant and rounded to the nearest whole number. This scoring methodology was then employed to predict the risk of Talaromyces marneffei infection and validated within an external validation cohort at the First Affiliated Hospital of Zhejiang University.
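The conversion from coefficients to integer points can be sketched as below, assuming that the divisor is the smallest absolute beta coefficient among the selected features. That interpretation, the function names, and the coefficient values are hypothetical and are not the fitted values from this study.

```python
def points_from_betas(betas):
    """Divide each coefficient by the smallest absolute one and round to an integer.

    Python's round() resolves exact halves to the nearest even integer; the
    example values below avoid that case.
    """
    smallest = min(abs(beta) for beta in betas.values())
    return {name: round(beta / smallest) for name, beta in betas.items()}

def total_score(points, present):
    """Sum the points of the risk factors present in one patient."""
    return sum(value for name, value in points.items() if name in present)

# Hypothetical beta coefficients (illustrative only).
example_betas = {"D-dimer": 2.0, "CD4+": 1.3, "rash": 0.7}
example_points = points_from_betas(example_betas)
example_total = total_score(example_points, present={"D-dimer", "rash"})
```

A patient's total is then compared with a chosen cutoff; a score built from segmented marker levels, such as the number of T-lines on a strip, would assign points per level in the same way.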
Chemicals and Materials: The absorbent pad used is an AN3 absorbent pad (0.6 mm thick), and both the sample pad and gold label pad consist of glass fiber membrane number 8,975 (0.5 mm thick).These materials were procured from Hangzhou Bulus Trading Co., Ltd (www.bulus.com.cn).The nitrocellulose membrane used is a pure nitrocellulose blotting membrane P/N 66 485 (0.22 μm thick), purchased from Nanyang Ruitai Biotechnology Co., Ltd.
Colloidal Gold Solution Preparation: An aqueous solution of chloroauric acid (HAuCl₄) at 0.01% concentration was prepared using 200 mL of water, heated to boiling, and maintained for 3 min. An equimolar amount of 1% trisodium citrate was rapidly added to the solution, and the mixture was heated and stirred for an additional 30 min. Once the solution turned wine-red, it was purified and concentrated through multiple centrifugation steps to obtain the gold nanoparticle solution.
Colloidal Gold Modification with Antibodies: The colloidal gold solution was adjusted to pH 9.0 using a 0.2 mol L⁻¹ Na₂CO₃ solution, and 1 mL was transferred into a centrifuge tube. Concurrently, 0.1 mL of a 200 μg mL⁻¹ antibody solution was added to the tube, and the mixture was agitated for 40 min. Subsequently, a 6% BSA solution was added to achieve a 1% concentration, and the mixture was shaken for an additional 20 min. The solution was then centrifuged at 12 000 rpm for 10 min, the supernatant was discarded, and the concentrate was resuspended in 0.1 mL of 0.01 mol L⁻¹ Tris-HCl solution (containing 1% BSA and 1% sucrose). This resuspension produced the immunocolloidal gold solution.
Strip Preparation: The sample pad, gold label pad, and nitrocellulose membrane were pretreated by soaking in a solution of 1% BSA, 2% glucose, pH 7.2, 0.1 mol L⁻¹ PBS buffer for 12 h. They were then dried in a 37 °C incubator for 12 h and set aside.
An antigen solution was prepared by combining 50 μL of antigen (Native Human D-Dimer protein ab35949-100ug) with 450 μL of 1% PBS to achieve a 1% concentration of Solution A. Solution B was prepared by combining 100 μL of Solution A with 300 μL of 1% PBS solution to create a 1.25% concentration. T1, T2, and T3 lines were drawn on the pretreated nitrocellulose membrane using 0.1 μL of Solution A, maintaining a 3 mm interval between each line. The quality control line was drawn using 0.1 μL of secondary antibody (AffiniPure Rabbit Anti-Mouse IgG (H + L) (min X Hu Sr Prot) 315-005-045), positioned 3 mm after T3. After completing the lines, the membrane was dried in a 37 °C incubator for 2 h. This process was repeated three times to produce three identical test strips.
Web Platform and Mobile Application Development: A web-based platform was built to facilitate use of the clinical scoring model. The structure, esthetic, and interactivity of the platform were designed using HyperText Markup Language 5 (HTML5), CSS3, and JavaScript. Asynchronous JavaScript and XML (AJAX) techniques were implemented, enabling real-time communication between client-side and server-side components.
Data storage and processing were managed with PHP and MySQL. Optimization strategies were incorporated to ensure the performance of the website, and web security best practices were followed throughout the development process. Cross-browser compatibility testing was conducted to ensure a consistent user experience across various platforms.
In parallel, the Clinical Scoring Model Android application was developed. The application was built using Java, Android Jetpack, and the Gradle toolkit, with the aim of providing an efficient, visually appealing, and user-friendly interface. Components such as Responsive Design, LiveData, and ViewModel were integrated to enhance accessibility and handle user input effectively.
Communication with the back-end server was handled through HTTP client libraries via a RESTful API. Server-side data storage, processing, and retrieval were managed using PHP and MySQL. Performance optimization strategies were implemented, and Android security best practices were followed throughout the development process. Compatibility testing was conducted across various Android versions and devices to ensure a consistent user experience. The result was a user-friendly platform designed to predict the risk of Talaromyces marneffei infection.

2.1. Patient Clinical Data from Three Medical Centers
A total of 464 patients diagnosed with AIDS were assessed, spanning from February 5th, 2014, to January 8th, 2022, across three medical centers in China (Figure S1, Supporting Information). These patients were stratified into four distinct cohorts: a model development cohort (n = 318) and a hold-out validation cohort (n = 45) from Wenzhou Central Hospital; an external cohort designated for model optimization (n = 49) from Hangzhou Xixi Hospital (external cohort 1); and a final external validation cohort (n = 52) from the First Affiliated Hospital of Zhejiang University School of Medicine (external cohort 2). The development cohort's timeline encompassed the years 2014 through 2022, while the remaining cohorts spanned 2016 to 2021.

Figure 1. Logistic regression experimental group visualization. a) The radar chart of the performance of each step of stepwise logistic regression. At step = 1, all metrics are at their lowest values. When the stepwise forward method automatically concludes at step = 9, the AUC, accuracy, F1, and sensitivity metrics all reach their peak values, which are 0.946, 0.896, 0.922, and 0.915, respectively. b) Feature ranking of β coefficients based on standardized logistic regression. The red and blue colors represent risk factors and protective factors, respectively, for predicting Talaromyces marneffei infection. Among the risk factors, the variable "D-dimer" has the highest β coefficient value of 3.112. Conversely, among the protective factors, the variable "CD4+" has the largest coefficient of −2.720. c) Four-cohort ROC curves of the optimal model based on standardized logistic regression.

Figure 2. Shapley additive explanations experimental group visualization. a) The heat map of the feature importance results of 32 features under SHAP interpretation in 12 machine learning models. Colored bars indicate the varying magnitudes of feature values. We present the results for the top seven features, as ranked by their cumulative SHAP values, from 12 machine learning models. b) The schematic diagram of SHAP. c) The comparative chart of fluctuations of SHAP results after correction by external cohort 1. Comparison of SHAP results for 12 machine learning models before and after calibration on external cohort 1. In comparison to the average AUC for each model on external cohort 1, blue signifies a relative decrease in model scores after calibration, while red indicates a relative increase in model scores after calibration. The larger the circle, the greater the degree of change. d) The ranking changes of the final three main features on 12 machine learning models under the SHAP explanation.

Figure 3. Permutation importance experimental group visualization. a) The heat map of the feature importance results of 32 features under PI interpretation in 12 machine learning models. b) The schematic diagram of PI. c) The comparative chart of fluctuations of PI results after correction by external cohort 1. Comparison of PI results for 12 machine learning models before and after calibration on external cohort 1. In comparison to the average AUC for each model on external cohort 1, blue signifies a relative decrease in model scores after calibration, while red indicates a relative increase in model scores after calibration. The larger the circle, the greater the degree of change. d) The ranking changes of the final three main features on 12 machine learning models under the PI explanation.

Figure 4. From Weak to Strong Deduction Group visualization. a) Histogram of weight for first seven features with weak Deduction I (standardized logistic regression). b) Histogram of weight for first seven features with weak Deduction II (Shapley additive explanations). c) Histogram of weight for first seven features with weak Deduction III (permutation importance). d) Pie chart of proportion of weight for first seven features with weak Deduction I (standardized logistic regression). e) Pie chart of proportion of weight for first seven features with weak Deduction II (Shapley additive explanations). f) Pie chart of proportion of weight for first seven features with weak Deduction III (permutation importance). g) Bar chart of the percentage distribution of three factors and the remaining factors. h) Boxplots of the AUC of machine learning algorithms and logistic regression. The red dot denotes the AUC of the three-feature logistic regression. The AUC of the logistic regression with three features demonstrates considerable competitiveness across four distinct cohorts.

Figure 5. Simplest scoring system experimental group visualization. a) 3D visualization of sensitivity, specificity, and accuracy on four cohorts with different cut-off values. b) 3D visualization of sensitivity, specificity, and accuracy on development cohort with different cut-off values. c) 3D visualization of sensitivity, specificity, and accuracy on hold-out cohort with different cut-off values. d) 3D visualization of sensitivity, specificity, and accuracy on external cohort 1 with different cut-off values. e) 3D visualization of sensitivity, specificity, and accuracy on external cohort 2 with different cut-off values. f) K-means plot. The patients were clustered by unsupervised learning based on 32 features. The number of clusters K = 2, as recommended by the elbow graph, was used. g) Boxplots of the scores for the two clusters. Cluster 1 has statistically significantly higher scores than cluster 2 (n = 464, Mann-Whitney U test, p < 0.001). h) AUC of the scoring system in the development cohort, hold-out cohort, external cohort 1, and external cohort 2.

Figure 6. POCT test strips experimental group visualization. a) Flowchart of the procedure for using the POCT strip. b) Method for calculating the POCT strip scores. When only a distinct purple band appears on the control line (C line), D-dimer can be read as 0 points, and CD4 can be read as 6 points. When both the control line (C line) and the test line (T line) display distinct purple bands simultaneously, D-dimer can be read as 1 point, and CD4 can be read as 4 points. Proceeding in this manner, the scores for D-dimer and CD4 can be determined. The presence or absence of the rash is scored as 2 points and 0 points, respectively. c) Schematic diagram of the POCT strip. d) A sample of colloidal gold solution. e) Results of the POCT strip. We present the colloidal gold test strip samples produced, categorized as: one control line (C line); one control line (C line) and two test lines (T lines); one control line (C line) and three test lines (T lines).

Figure 7. Web and app development result diagram.a) The workflow of creating the web and app page.b) Display of our free web platform.Building upon our preliminary infection screening model, we have devised a complimentary web-based platform to facilitate early detection of Talaromyces marneffei infection among medical professionals and patients alike.c) Display of our free app platform.To overcome constraints related to time, location, and equipment, our team has developed a versatile mobile application.This application is specifically designed for seamless integration and use within the Android operating system.