Predicting factors associated with anxiety by patients undergoing treatment for infectious diseases using a random-forest machine learning approach

Study Site

A cross-sectional survey was conducted from March to April 2022 among COVID-19 patients at the Ruijin Jiahe mobile cabin hospital, the first cabin hospital used in Shanghai and a representative one with a large number of patients.

Ethics and inclusion statement

The research adhered to the principles of the Declaration of Helsinki. The study was approved by the Ethics Committee of Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, under protocol code LL202070. All experiments were performed in accordance with the relevant guidelines and regulations. Procedurally, we followed the principle of data minimization, collecting only necessary data to avoid unnecessary privacy risks. Participants were required to provide explicit, voluntary, and informed consent before data collection and could withdraw consent at any time. Data retention policies specified appropriate retention periods and secure disposal or anonymization upon expiration. Anonymization techniques were applied to de-identify personal data so that collected information could not be linked back to individuals. Access control mechanisms, including authentication and role-based permissions, restricted data access to authorized personnel. During the analysis phase, all data were de-identified to further protect participant anonymity. Regular audits and security checks were conducted to ensure that data security measures remained effective and up to date. The data were desensitized by a dedicated person and analyzed only by the research team. Study participant names and other personally identifiable information were removed from all text, figures, tables, and images. Collectively, these measures protect participant privacy while supporting responsible research and data use.

Participants

The inclusion criteria were: (1) aged over 18 years; (2) testing positive for SARS-CoV-2 in a nucleic acid test of a nasopharyngeal specimen and being identified as having asymptomatic COVID-19 infection; (3) being willing to participate and able to use a smartphone; (4) being willing to provide informed consent. The exclusion criteria were: (1) having a significant psychiatric condition or a cluster of severe conditions that might lead to heightened levels of anxiety or depression; (2) being pregnant or breast-feeding; (3) refusing to participate or being unable to ensure compliance. The detailed recruitment process is described in another published study18.

Procedure

A prospective observational study was conducted via the widely used online questionnaire survey platform “Questionnaire Star”. Logical design and validation checks were carried out to ensure that all necessary questions were answered. COVID-19 status was confirmed with a nasopharyngeal swab test. We used an RNA isolation kit (Biogerm, Shanghai, China) to extract SARS-CoV-2 RNA from each patient’s nasopharyngeal swab specimen. SARS-CoV-2 RNA was then detected with a nucleic acid detection kit based on real-time reverse transcriptase-polymerase chain reaction (RT-PCR). All testing procedures followed the manufacturer’s instructions.

Measures

A self-reported questionnaire was used to gather participants’ information about sociodemographic characteristics, stigma, social support, and anxiety. Prior research has confirmed the psychometric adequacy of the Chinese-adapted versions of these instruments, with empirical evidence demonstrating satisfactory reliability coefficients and construct validity across multiple Chinese-speaking samples19,20,21. These culturally adapted measures have consistently exhibited strong factorial invariance and predictive validity within Mainland Chinese populations, supporting their contextual appropriateness for assessing relevant constructs in this linguistic and cultural milieu.

Sociodemographic characteristics

The characteristics collected were age, gender, educational level, marital status, COVID-19 vaccination status, diagnosis time (number of days since the first abnormal nucleic acid result), and psychological counseling experience in the past week.

Anxiety

The Zung Self-Rating Anxiety Scale (SAS) comprises 20 items designed to measure emotional states experienced during the preceding week22. Participants rate each item on a 4-point frequency scale (1 = rarely/never, 4 = always/nearly every day), and the raw score is the sum of the item responses. The raw score is multiplied by 1.25 to give the standard score. The cut-offs for the SAS standard score were defined as: less than 50, no anxiety; over 50, anxiety. A validated Chinese adaptation of the SAS was used in this study; prior research has demonstrated satisfactory reliability and construct validity for this translated version across Chinese-speaking populations19,23.
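The scoring rule described above can be illustrated with a minimal Python sketch. It is not the study's code (the analysis was run in SPSS and R). The function names are hypothetical, the sketch assumes that any reverse-keyed items have already been recoded, and it treats a standard score of exactly 50 as anxiety, which is an assumption because the text only names scores below and above 50.

```python
from typing import Sequence

SAS_ITEM_COUNT = 20    # the scale has 20 items
SAS_MULTIPLIER = 1.25  # raw-to-standard conversion factor
SAS_CUTOFF = 50        # threshold named in the text


def sas_standard_score(item_responses: Sequence[int]) -> float:
    """Sum the 20 item ratings (each 1-4) and multiply the raw sum by 1.25."""
    if len(item_responses) != SAS_ITEM_COUNT:
        raise ValueError("expected 20 item responses")
    if any(r not in (1, 2, 3, 4) for r in item_responses):
        raise ValueError("each response must be an integer from 1 to 4")
    return sum(item_responses) * SAS_MULTIPLIER


def has_anxiety(standard_score: float) -> bool:
    """Assumption: a standard score of exactly 50 is classified as anxiety."""
    return standard_score >= SAS_CUTOFF
```

With these definitions, a participant who answers 2 on every item has a raw score of 40 and a standard score of 50.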

Stigma

Stigmatization is commonly reported in patients with serious infectious diseases and is significantly associated with poorer psychological well-being in this group24,25,26. The Social Impact Scale (SIS) was used to gauge perceptions of COVID-19-related stigma27. The scale is a widely accepted 24-item assessment tool for individuals with significant medical conditions and infectious diseases. The SIS consists of four subscales: social rejection (nine items), financial insecurity (three items), internalized shame (five items), and social isolation (seven items). Respondents rated each SIS item on a scale of 1 (strongly disagree) to 4 (strongly agree), giving total scores from 24 to 96. Higher scores denote more severe perceived stigma. The Chinese version of the SIS has been rigorously validated and exhibits robust psychometric properties28,29 (Cronbach’s α = 0.970).
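Cronbach’s α, the reliability coefficient reported for this and the following scales, follows the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The Python sketch below is illustrative only: the function name and the toy data in the usage note are hypothetical and are not taken from the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row per participant and one column per item."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the participants' total scores.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When every item ranks participants identically, as in `[[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]`, the coefficient reaches its maximum of 1.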

Entrapment

A meta-analysis that quantitatively synthesized the existing literature revealed substantial associations between perceptions of defeat and entrapment and the presence of anxiety disorders30. The Entrapment Scale, a unidimensional instrument, comprises sixteen items with five response choices11. Total scores on this scale range from 0 to 64. Higher scores indicate stronger feelings of entrapment (Cronbach’s α = 0.973).

Defeat

A comprehensive review has yielded convergent evidence, across multiple designs, participant groups, and measurement instruments, that perceptions of defeat and entrapment are closely associated with anxiety disorders31. Gilbert and Allen’s Defeat Scale, which captures the sense of failed struggle and loss of rank, is commonly used to evaluate feelings of defeat over the past 7 days; it comprises sixteen items across two dimensions. Higher scores indicate a greater tendency to feel defeated in daily life. Positively worded items contribute positive scores, whereas negatively worded items contribute negative scores (Cronbach’s α = 0.912).

Coping styles

Research has demonstrated that individuals with elevated anxiety levels are more inclined to adopt emotion-focused coping mechanisms, such as avoidance behaviors, idealization, and self-blame. Conversely, problem-focused strategies such as systematic planning and seeking instrumental social support show significant inverse correlations with anxiety severity. The Simplified Coping Style Questionnaire (SCSQ) is a self-report measure of coping styles, with a particular focus on problem-focused and emotion-focused coping. The SCSQ assesses both state and trait coping styles, which are relevant to our research questions. The questionnaire consists of 20 items divided into two dimensions: positive coping and negative coping. The positive coping dimension comprises items 1 to 12 and covers effective coping strategies, such as “trying to see the good side of things as much as possible” and “finding several different ways to solve problems”. The negative coping dimension comprises items 13 to 20 and covers negative coping strategies, such as “relieving worry through smoking and drinking”. Each item is rated on four levels (0, 1, 2, and 3): never used, occasionally used, sometimes used, and frequently used (Cronbach’s α = 0.922).
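The item layout described above (items 1 to 12 for positive coping, items 13 to 20 for negative coping, each rated 0 to 3) can be sketched in Python as follows. The function name is hypothetical, and summing the ratings within each dimension is an assumption: the text does not state whether sums or means were used.

```python
from typing import Dict, Sequence

POSITIVE_ITEMS = range(1, 13)   # items 1-12
NEGATIVE_ITEMS = range(13, 21)  # items 13-20


def scsq_subscale_scores(responses: Sequence[int]) -> Dict[str, int]:
    """Return summed positive and negative coping scores from 20 ratings (each 0-3)."""
    if len(responses) != 20:
        raise ValueError("expected 20 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    # Item numbers are 1-based in the questionnaire, list indices are 0-based.
    positive = sum(responses[i - 1] for i in POSITIVE_ITEMS)
    negative = sum(responses[i - 1] for i in NEGATIVE_ITEMS)
    return {"positive_coping": positive, "negative_coping": negative}
```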

Social support

Social networks provide both affective and instrumental resources that can fortify individuals’ self-concept and diminish emotional distress. These supportive connections, characterized by the exchange of emotional solace and practical aid within interpersonal relationships, serve as critical buffers against negative affectivity while promoting adaptive self-perceptions. The Multidimensional Scale of Perceived Social Support (MSPSS)32 was used to evaluate participants’ perceived social support from personal relationships (family, friends, and significant others). Respondents indicated their level of agreement on a 5-point Likert scale. Higher scores indicate greater social support (Cronbach’s α = 0.932).

Data analysis

Questionnaire data were collected through the online survey platform “Questionnaire Star”, and variables were exported in .csv format. Descriptive statistical analysis, variable selection, model fitting, and regression analysis were conducted with IBM SPSS Statistics 26.0 and R (version 3.6.3). To ensure that the version of the custom code, software, or algorithm described in the publication is maintained, we will publish it as a Supplementary document.

Descriptive statistical analysis

Continuous data were summarized as mean and standard error. Categorical data were described by frequency (percentage). We evaluated the association between anxiety and participants’ categorical sociodemographic characteristics using Fisher’s exact test. Differences in continuous variables between participants with and without anxiety were evaluated using the Mann–Whitney U test. All statistical tests were two-sided, with a significance threshold of P ≤ 0.05.
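For a 2 × 2 table, the two-sided Fisher’s exact test can be written out directly from the hypergeometric distribution. The Python sketch below is illustrative and is not the study's code. It sums the probabilities of all tables that are no more likely than the observed one, which is one common definition of the two-sided P value, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties with the observed table are included.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)
```

For the table [[3, 1], [1, 3]] the result is 34/70, about 0.486, so no association would be declared at the 0.05 threshold.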

Variable selection method: Boruta

The Boruta algorithm can efficiently process high-dimensional datasets with many features and nonlinear relationships. In this study, several independent variables were strongly correlated with one another. Unlike methods that only rank or exclude individual features, the Boruta algorithm aims to find all features related to the dependent variable, which gives a more comprehensive picture of the variable relationships in the dataset. In addition, the Boruta algorithm can indirectly mitigate collinearity problems to some extent and offers higher interpretability.
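The core idea behind Boruta is to compare each real feature with “shadow” features, which are shuffled copies that keep a feature’s distribution but break any link with the outcome. The Python sketch below illustrates only that comparison. It deliberately swaps in absolute Pearson correlation as a stand-in importance measure, whereas Boruta itself uses random-forest importance, and it omits Boruta’s statistical test over the hit counts and its iterative removal of rejected features. All names are hypothetical.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence


def abs_corr(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_hits(features: Dict[str, List[float]], y: Sequence[float],
                n_iter: int = 50, seed: int = 0) -> Dict[str, int]:
    """Count how often each feature beats the best shadow feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        shadow_scores = []
        for values in features.values():
            shadow = list(values)
            rng.shuffle(shadow)  # permuting destroys any association with y
            shadow_scores.append(abs_corr(shadow, y))
        threshold = max(shadow_scores)
        for name, values in features.items():
            if abs_corr(values, y) > threshold:
                hits[name] += 1
    return hits
```

A feature that beats the best shadow in nearly every iteration is a candidate for retention, while one that rarely does behaves like noise.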

Univariate regression

In univariate selection, variables with a P value ≤ 0.1, indicating that they help distinguish between the anxiety and non-anxiety groups, were entered into the univariate logistic regression analysis. When establishing the predictive model, the significance level was set at P ≤ 0.05. The Hosmer–Lemeshow test was used to evaluate the suitability of the logistic regression model. A Hosmer–Lemeshow P value above 0.05 indicates satisfactory goodness of fit.
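The Hosmer–Lemeshow statistic groups participants by predicted probability and compares observed with expected event counts in each group. The Python sketch below is illustrative and is not the study's code. The function names are hypothetical, groups are formed by sorting and splitting into equal-sized chunks, and the P value uses a closed-form chi-square tail that is valid only when the degrees of freedom (groups − 2) are even, as with the conventional 10 groups.

```python
from math import exp
from typing import Sequence, Tuple


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k!
        total += term
    return exp(-half) * total


def hosmer_lemeshow(probs: Sequence[float], outcomes: Sequence[int],
                    groups: int = 10) -> Tuple[float, float]:
    """Return (statistic, P value) with groups - 2 degrees of freedom."""
    n = len(probs)
    if n < groups:
        raise ValueError("need at least one observation per group")
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(o for _, o in chunk)
        pbar = expected / size
        denom = size * pbar * (1 - pbar)
        if denom == 0:
            raise ValueError("a group has mean predicted probability 0 or 1")
        # Each group contributes (O - E)^2 / (n_g * pbar * (1 - pbar)).
        stat += (observed - expected) ** 2 / denom
    return stat, chi2_sf_even_df(stat, groups - 2)
```

A P value above 0.05 from this function corresponds to the criterion for satisfactory fit described above.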

Development and validation of nomogram

Once the best prediction model was identified, a nomogram was constructed. A nomogram integrates multiple predictors and transforms a complex regression equation into a visual graph, making the results of the prediction model more intuitive and understandable. It provides an individualized predicted probability, here used to assess the risk of anxiety among COVID-19 patients. The performance of the nomogram was evaluated using the receiver operating characteristic (ROC) curve. Discrimination was measured using the concordance index (C-index), which ranges from 0.5 to 1.0. Internal validation was conducted with the bootstrap technique, which draws random samples with replacement from the original dataset; 1000 bootstrap samples were fitted to obtain a more reliable estimate of the C-index.
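For a binary outcome, the C-index is the proportion of case and non-case pairs in which the case receives the higher predicted probability, with ties counted as one half. The Python sketch below is illustrative and is not the study's code. The function names are hypothetical, and the bootstrap shown simply resamples fixed predictions, which is a simplification: the text does not say which bootstrap variant was used, and a fuller internal validation would refit the model in each resample.

```python
import random
from typing import List, Sequence


def c_index(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance index for predicted probabilities and 0/1 outcomes."""
    cases = [p for p, o in zip(probs, outcomes) if o == 1]
    controls = [p for p, o in zip(probs, outcomes) if o == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                concordant += 1.0
            elif pc == pn:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(controls))


def bootstrap_c_index(probs: Sequence[float], outcomes: Sequence[int],
                      n_boot: int = 1000, seed: int = 0) -> List[float]:
    """C-index in each bootstrap resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(probs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            estimates.append(c_index([probs[i] for i in idx], ys))
    return estimates
```

The mean or percentile range of the returned list gives a rough picture of how stable the C-index is under resampling.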
