Assessing mental health in individuals near thermal power plants and development of depression predictive model
Selection of a thermal power plant
Guru Gobind Singh Super Thermal Power Plant (GGSSTPP), located in Rupnagar and commissioned in the 1980s, is one of the oldest and largest operational thermal power plants in Punjab. Its long history of operation has led to sustained and cumulative emissions, making it an ideal site to study the effects of long-term exposure on nearby residents. The densely populated villages within a 5 km radius of GGSSTPP enabled us to enroll a large, stable, and diverse group of study participants. These villages experience limited migration, ensuring consistent environmental exposure among the residents. Furthermore, government and media reports have highlighted ongoing community complaints regarding air quality and health concerns in this region, emphasizing the relevance and urgency of this investigation.
Study population
This cross-sectional study was conducted from October 2018 to March 2019 around the GGSSTPP in Rupnagar, Punjab, India. The target population consisted of residents of villages located between 1 and 5.5 km from the power plant. The selected distance range was based on existing literature and environmental health guidelines, which suggest that populations within a 5 km radius of a coal thermal power plant are at higher risk of exposure to air pollutants and associated health risks13,14. The lower limit of 1 km was set to exclude areas within the immediate industrial zone, where population density is low and direct occupational exposure might confound community-level findings. The upper limit was slightly extended to 5.5 km to ensure adequate sample size and to capture populations on the fringe of the commonly studied 5 km zone, allowing for a more robust assessment of spatial exposure gradients. Data were collected from the following villages: Alipur, Ghanouli, Kotbala, Majri, Begumpura, Saini Majra, Jatt Patti, Rattanpura, Singhpura, Inderpura, Rawal Majra, and Doburji. The study included individuals aged 19–60 years who had lived in the area for more than 5 years.
Exclusion criteria
Pregnant women and adults who had relocated within the past 2 years were excluded. Long-term exposure to environmental stressors frequently results in mental health issues such as stress, anxiety, and depression. Individuals who had lived in the area for only a short time would have introduced variability in exposure levels, potentially complicating the results. To strengthen the validity of the findings, the study focused on long-term residents, ensuring consistent and significant exposure histories among participants.
Survey design
Two-stage random sampling was used for data collection, with the Rupnagar district as the unit of analysis. The first stage involved selecting villages, followed by a random selection of households within those villages. The sample area was determined using municipal wards, census blocks, or community development blocks as units. Field investigators and medical social workers approached the village sarpanch (head) or other community leaders to seek their help in establishing the survey’s boundaries and encouraging community participation.
Quota sampling was used to ensure adequate representation of the population across age groups, genders, and socioeconomic statuses, reflecting the demographic diversity of the area15. We selected communities within 5 kilometers of the power plant to capture data from different exposure levels. This approach aligns with standard practices in environmental health research for assessing pollution impacts16. By using villages/colonies as the primary unit and households as the secondary unit, this method minimizes selection bias and improves representativeness17.
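The two-stage selection described above (villages first, then households within them) can be sketched as follows. This is only an illustration under stated assumptions: the household identifiers, the frame sizes, and the function name `two_stage_sample` are hypothetical and do not come from the study, and the quota step is not modelled.

```python
import random

def two_stage_sample(households_by_village, n_villages, n_households, seed=0):
    """Stage 1: draw villages at random; stage 2: draw households within each."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    villages = rng.sample(sorted(households_by_village), n_villages)
    sample = {}
    for village in villages:
        households = households_by_village[village]
        k = min(n_households, len(households))  # cannot draw more than exist
        sample[village] = rng.sample(households, k)
    return sample

# Hypothetical sampling frame: household IDs are invented for illustration.
frame = {
    "Alipur": [f"ALP-{i:03d}" for i in range(1, 41)],
    "Ghanouli": [f"GHN-{i:03d}" for i in range(1, 61)],
    "Kotbala": [f"KTB-{i:03d}" for i in range(1, 26)],
    "Majri": [f"MJR-{i:03d}" for i in range(1, 31)],
}
selected = two_stage_sample(frame, n_villages=2, n_households=5)
```

Drawing without replacement at both stages keeps any household from being selected twice.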
Sample size
To obtain a representative sample size, we applied the following formula:
$$n\,=\,\left({Z}^{2}* p\left(1-p\right)\right)/\,{{\rm{e}}}^{2}$$
where
- n = sample size,
- Z = z-score for the selected level of confidence,
- p = estimated proportion of the population with the attribute of interest (0.5 is used when unknown, as in the calculation that follows),
- e = margin of error.
For a 95% confidence level and 5% margin of error, which are commonly used in research, the calculation would be
$${n}=({1.96}^{2}* 0.5(1-0.5))/0.0{5}^{2}=384.16$$
The final sample size for our study was 359 participants from 12 villages within a 5.5 km radius of the coal thermal power plant, slightly below the calculated sample size of 385 (384.16 rounded up). This sample size was chosen to ensure population representativeness and sufficient statistical power. A sample size of 359 is nearly equivalent to the sample size of 385 for most population sizes, providing sufficient statistical power to ascertain distinguishing sample differences18. To account for a 10% anticipated non-response rate, adjustments were incorporated into sample size estimation. This approach ensures that analyses have the statistical power necessary to identify significant impacts.
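The sample size calculation can be reproduced with a few lines of Python. The function names are illustrative. The text does not say how the 10% non-response adjustment was applied, so dividing by the expected response fraction is an assumption (a common convention), not the authors' stated method.

```python
import math

def sample_size_for_proportion(z=1.96, p=0.5, e=0.05):
    """n = Z^2 * p * (1 - p) / e^2 for an estimated proportion p."""
    return (z ** 2) * p * (1 - p) / (e ** 2)

def inflate_for_nonresponse(n, rate):
    """Assumed convention: divide by the expected response fraction."""
    return n / (1 - rate)

n = sample_size_for_proportion()        # 95% confidence, 5% margin of error
n_required = math.ceil(n)               # round up to whole participants
n_adjusted = math.ceil(inflate_for_nonresponse(n, 0.10))
```

With the defaults, `n` is 384.16 and `n_required` is 385, matching the values in the text. Under the assumed adjustment, `n_adjusted` is 427.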
Socioeconomic stratification
To explore the relationship between economic disparities, environmental risks, and mental health, participants were divided into two income groups: those earning less than one lakh (~US$1,200) and those earning one lakh or more19.
Household air pollution assessment
Based on known associations between solid fuel use and health risks in low-income countries, participants were categorized by the type of fuel used in their household: solid/mixed fuels vs. clean fuels (LPG)20. This multimodal sampling aims to ensure that the sample accurately reflects the diverse population affected by pollution from coal thermal power plants. It enables broader applicability of results and a comprehensive understanding of how environmental and socioeconomic factors influence psychological symptoms in populations affected by pollution.
Study questionnaire
The study area was divided into buffer zones ranging from 1 to 2 km around the centroid of the power plant. The Depression, Anxiety and Stress Scale (DASS-21) questionnaire was used to assess psychological symptoms. The DASS-21 is a reliable and user-friendly tool, widely used in both clinical and research settings. It includes 21 self-rated questions covering three scales: depression, anxiety, and stress, with 7 items per scale. The depression scale assesses feelings of hopelessness, self-deprecation, and anhedonia, while the anxiety scale measures autonomic arousal and anxious affect. The stress scale evaluates chronic nonspecific arousal, including nervousness and irritability18.
Participants were categorized based on age, gender, income, and cooking fuel types (LPG or mixed fuels). A total of 359 individuals were included in the study, divided into normal BMI and overweight BMI21. Of the participants, 145 had normal BMI, while 214 were classified as overweight BMI. Psychological symptoms (depression, anxiety, stress) were further categorized based on DASS-21 guidelines into two groups: those with psychological symptoms and those without, after adding all the scores.
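A minimal sketch of DASS-21 scoring and dichotomization is given here for illustration. The item-to-subscale mapping, the doubling of subscale sums, and the "normal" cutoffs follow the commonly published DASS-21 scoring key and should be checked against the manual. The exact rule the study used to split participants into groups with and without symptoms is not spelled out, so the threshold logic in `has_symptoms` is an assumption.

```python
# DASS-21 item numbers (1-based) for each subscale, per the published scoring key.
SUBSCALE_ITEMS = {
    "depression": (3, 5, 10, 13, 16, 17, 21),
    "anxiety": (2, 4, 7, 9, 15, 19, 20),
    "stress": (1, 6, 8, 11, 12, 14, 18),
}
# Upper bound of the "normal" band on the doubled (DASS-42 equivalent) score.
NORMAL_MAX = {"depression": 9, "anxiety": 7, "stress": 14}

def score_dass21(responses):
    """responses: 21 integers in 0..3, in questionnaire order."""
    if len(responses) != 21 or any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("expected 21 item responses, each 0-3")
    scores = {}
    for scale, items in SUBSCALE_ITEMS.items():
        # Sum the seven items of the subscale, then double the sum.
        scores[scale] = 2 * sum(responses[i - 1] for i in items)
    return scores

def has_symptoms(scores):
    """Assumed rule: a score above the normal band counts as symptomatic."""
    return {scale: scores[scale] > NORMAL_MAX[scale] for scale in scores}

# Hypothetical respondent who answers 1 on every item.
scores = score_dass21([1] * 21)
flags = has_symptoms(scores)
```

For this hypothetical respondent every subscale scores 14, which is above the normal band for depression and anxiety but not for stress.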
Statistical analysis
Data analysis was conducted using binary logistic regression to examine the association between psychological symptoms and factors such as household air pollution, income, age, and gender. Odds ratios were calculated to assess the impact of each factor on psychological symptoms. For BMI, participants were categorized into two groups: normal and overweight, based on established criteria22. Age was divided into three groups: 19–29 years, 30–49 years, and 50+ years15. Income was classified into two categories: less than one lakh and one lakh or more19. Gender was categorized as male or female, and household air pollution was classified into two categories: solid/mixed fuels and clean fuels (LPG). For the depression predictive model, age and income were treated as continuous variables, with the goal of assessing how changes in age or income affect the odds of depression using binary logistic regression.
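For a single binary predictor such as fuel type, the odds ratio estimated by binary logistic regression coincides with the cross-product ratio of the corresponding 2x2 table, which makes the quantity easy to illustrate without a statistics package. The sketch below uses a Wald confidence interval; the cell counts are hypothetical and are not study data.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """2x2 table: a = exposed with symptoms, b = exposed without,
    c = unexposed with symptoms, d = unexposed without."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Wald standard error
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Hypothetical counts, invented for illustration (not study data).
or_, lower, upper = odds_ratio_with_ci(a=40, b=60, c=20, d=80)
```

An interval that excludes 1 indicates an association at the chosen confidence level. Adjusted odds ratios from a multivariable model require fitting the full regression and are not reproduced by this table-based shortcut.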
A total of 359 participants were included to develop a predictive model for depression using binary logistic regression. Binary logistic regression is used when the dependent variable is binary (e.g., Yes vs. No), with independent variables that can be categorical, continuous, or both23,24. The analysis was performed using SPSS Statistics 26 and RStudio (R v4.2.2).
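For reference, the general form of a binary logistic regression model can be written as follows. The notation is assumed for illustration and is not taken from the study: p is the probability of the outcome (for example, depression symptoms present), x_1 to x_k are the predictors, and β_0 to β_k are the fitted coefficients.

$$\log \left(\frac{p}{1-p}\right)={\beta }_{0}+{\beta }_{1}{x}_{1}+\cdots +{\beta }_{k}{x}_{k},\qquad {{\rm{OR}}}_{j}={e}^{{\beta }_{j}}$$

Under this form, the odds ratio for predictor j is the multiplicative change in the odds of the outcome per one-unit increase in x_j, holding the other predictors fixed.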
Additionally, the association between depression and independent variables (age, gender, cooking fuel types, income, stress, and anxiety symptoms) in the general population was analyzed using binary logistic regression. A depression predictive model was also developed in RStudio (R v4.2.2). To compare normal and overweight participants, statistical differences were assessed using t-tests and ANOVA in the SciPy library for Python.
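The two-group comparison can be illustrated with a pooled-variance (Student's) t statistic. The study used SciPy; the sketch below relies only on the Python standard library to stay dependency-free and omits the p-value. `scipy.stats.ttest_ind` with its default equal-variance setting computes the same statistic and also returns the p-value. Which t-test variant the authors ran is not stated, so the pooled form is an assumption, and the scores are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(x, y):
    """Student's two-sample t statistic with pooled variance, plus its df."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical subscale scores, invented for illustration (not study data).
normal_bmi = [4, 6, 8, 10, 12]
overweight_bmi = [8, 10, 12, 14, 16]
t, df = pooled_t_statistic(normal_bmi, overweight_bmi)
```

A negative t here means the first group has the lower mean score.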
Ethics approval
The study was approved by the Institutional Ethics Committee, PGIMER, Chandigarh vide letter No. PGI/IEC/2017/97.
Consent to participate
Participants were fully informed about the study’s purpose and procedures. They signed a consent form indicating their voluntary participation and acknowledging that all information provided would be kept confidential.
Assumptions
The applicability of our findings to a larger population is limited by the small sample size of 359 individuals, which may affect subgroup analyses, such as age or gender differences. Small sample sizes can lead to type II errors, obscuring important correlations25,26. Additionally, the cross-sectional design makes it challenging to establish causality between environmental factors and mental health outcomes. While correlations between fuel consumption and mental health symptoms were observed, we cannot determine the directionality of these relationships or rule out confounding factors, such as pre-existing mental health conditions. As highlighted in a previous study23, longitudinal studies would be better suited for exploring these causal links.
Gender-related biases may also influence our findings, as females are more likely to report psychological issues than males27. Furthermore, our study was limited to neighborhoods near a single power plant; this might affect generalizability, as results may not apply to areas with different pollution levels or socioeconomic conditions. Environmental health impacts can vary widely across regions28. Finally, while we used fuel type as a proxy for pollution exposure, we did not measure individual-level exposure. This limitation may have led to less precise estimates of the relationship between pollution and mental health outcomes. Hence, as mentioned by Reuben et al.29, more accurate exposure assessment techniques could provide deeper insights into this connection. For the logistic regression models, we also assumed the absence of multicollinearity among the predictors and linearity of the continuous predictors in the logit.
Another limitation of this study is the time elapsed since data collection (2018–2019). While sociodemographic and environmental variables in the region have largely remained unchanged, there may have been shifts in healthcare access, environmental factors, or public health policies that could affect the applicability of our findings to the current situation. However, we believe the findings still highlight the ongoing mental health burden in communities near thermal power plants, especially in the absence of major infrastructure improvements or interventions.
