In the United States, a pregnant individual will attend approximately 15 prenatal visits with a medical provider for the monitoring of an uncomplicated pregnancy. During these visits, a vast amount of demographic and clinical information will be collected and entered into the electronic health record (EHR). Much of this information is related to monitoring the pregnancy, such as weight gain, blood pressure, and urinalysis; however, the medical record also contains information that could be used to predict risk for perinatal depression. Several recent studies have sifted through the information in the EHR, using machine learning to generate algorithms that could be used to estimate risk for postpartum depression (PPD).

Estimation of Risk of PPD in First-Time Mothers

In a retrospective cohort study analyzing data from the NIH Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be, Wakefield and colleagues examined the medical records of 10,038 first-time mothers. They tested the performance of four different models for predicting risk:

  • Model 1 utilized only readily obtainable sociodemographic data. 
  • Model 2 also included data on maternal mental health prior to pregnancy.
  • Model 3 utilized recursive feature elimination to construct a parsimonious model.
  • Model 4 further titrated the input data to simplify prepregnancy mental health variables. 

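The analysis included 8,454 births; 338 (4%) received treatment for depression (as documented in the EHR) during the postpartum period. In terms of predicting women who would later require treatment, model 3 performed the best, with an area under the receiver operating characteristic curve (AUC) of 0.91 (±0.02). An AUC of 0.91 means that, given one randomly chosen woman who went on to require treatment for depression and one who did not, the model would assign the higher risk score to the woman who required treatment about 91% of the time. It does not mean that the model identifies 91% of the women who will require treatment; that proportion (the sensitivity) depends on the risk threshold chosen.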
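To illustrate what an AUC value measures, here is a minimal Python sketch that computes it directly from its pairwise definition. The function name `auc_from_scores` and the six risk scores are hypothetical and chosen only for illustration; they are not data or code from the Wakefield study.

```python
def auc_from_scores(scores, labels):
    """Compute AUC as the fraction of (case, non-case) pairs in which
    the case receives the higher risk score; ties count as half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical risk scores for six women (illustrative only);
# label 1 = later treated for depression, 0 = not treated.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]

auc = auc_from_scores(scores, labels)
print(f"AUC = {auc:.2f}")
```

In this toy example, eight of the nine case/non-case pairs are ranked correctly, giving an AUC of about 0.89. No risk threshold appears anywhere in the calculation, which is why AUC cannot be read as the percentage of cases a model would catch.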

The models identified nine variables that were the most robust predictors of postpartum depression treatment: maternal history of depression (highest), any current mental health condition, recent psychiatric medication use, BMI, income, age, history of anxiety, education level, and preparedness for pregnancy (lowest). 

Estimation of Risk for PPD with Model Including EPDS Scores

In another study, Amit and colleagues analyzed EHR data from 266,544 women in the United Kingdom who gave birth to their first child between 2000 and 2017. A subset of 5,959 women also had Edinburgh Postnatal Depression Scale (EPDS) scores recorded in the EHR. The researchers extracted multiple sociodemographic and medical variables and constructed a machine learning model to predict the risk of PPD during the year following childbirth. 

In this cohort, the prevalence of PPD was 13.4%. PPD was defined based on the occurrence of one of the following documented in the EHR during the first year postpartum: (1) diagnosis of depression; (2) new treatment with antidepressant; or (3) non-pharmacological treatment for depression.

In a model using only data derived from the EHR, the area under the curve (AUC) of the prediction model ranged from 0.72 to 0.74. Interestingly, the model worked fairly well when only prepregnancy data was used to predict risk; even when applied before pregnancy, the EHR-based prediction model retained a fair ability to distinguish women who were later diagnosed with PPD from those who were not.

When the model combined EHR-based data with EPDS scores, the area under the receiver operating characteristic curve (AUC) increased from 0.805 to 0.844, with a sensitivity of 0.76 at a specificity of 0.80. In other words, at this threshold the best predictive model would flag about 76% of women who would later be diagnosed with PPD, while correctly classifying about 80% of women who would not.
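To make the sensitivity and specificity figures concrete, the following Python sketch works through what they would imply for a group of 1,000 women. It assumes that the 13.4% prevalence reported for the full cohort also applies to the women being screened, which the study does not guarantee; the function name `screening_outcomes` is hypothetical.

```python
def screening_outcomes(prevalence, sensitivity, specificity, n=1000):
    """Expected screening results for n women, given the prevalence of
    PPD and the model's sensitivity and specificity at one threshold."""
    with_ppd = prevalence * n
    without_ppd = n - with_ppd
    true_pos = sensitivity * with_ppd            # cases correctly flagged
    false_neg = with_ppd - true_pos              # cases missed
    false_pos = (1 - specificity) * without_ppd  # non-cases flagged
    # Positive predictive value: share of flagged women who are true cases
    ppv = true_pos / (true_pos + false_pos)
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "ppv": ppv,
    }


# Sensitivity and specificity from the combined EHR + EPDS model;
# prevalence is the cohort-wide figure (an assumption, see above).
result = screening_outcomes(prevalence=0.134, sensitivity=0.76, specificity=0.80)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, about 102 of the 134 women with PPD would be flagged and about 32 would be missed, while about 173 women without PPD would also be flagged. Roughly 37% of the women flagged as high risk would go on to meet the study's definition of PPD.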

The factors most strongly associated with risk of PPD included history of antidepressant use, history of depression, number of antidepressant prescriptions filled, younger age, BMI, smoking status, and deprivation index.

Can We Use the Electronic Medical Record to Predict Risk for PPD?

The answer is YES. Both studies indicate that machine learning can be used to construct a model, based on data collected from the EHR, that predicts risk for depression within the first year after childbirth. In these two studies, the best-performing models achieved AUCs between 0.84 and 0.91, meaning that they were good at ranking women who went on to develop depression after delivery above those who did not. Statisticians get pretty excited about screening tools when the AUC is greater than 0.8.

Now for the caveats. Both of these studies document the diagnosis or treatment of PPD using documentation in the electronic health record: either documentation of the diagnosis itself or treatment (antidepressant or non-pharmacologic treatment). While the prevalence of PPD in the Wakefield study carried out in the US was 4%, the Amit study from the UK reported that the prevalence of PPD was 13.4%. 

Based on previous epidemiologic studies, the prevalence of PPD is typically around 15%. There are no studies indicating that the prevalence of PPD is lower in the US than in the UK. It should be noted, however, that the US study is looking only at the prevalence of treatment for PPD, whereas the UK study is looking at the prevalence of diagnosis and/or treatment. This discrepancy is consistent with previous studies indicating low rates of treatment among women with PPD in the US, reflecting both underdiagnosis of perinatal mood and anxiety disorders and barriers to accessing treatment.

The models described in these studies are probably identifying individuals at risk for more severe PPD and not women with less severe depressive symptoms. While it is essential to identify women with the most severe symptoms in order to limit morbidity in both the mother and the child, we may be missing an opportunity to support other mothers who are also struggling during the postpartum period. Further studies are needed to test these predictive models in more diverse populations, including multiparous mothers, and using a broader definition of perinatal depression.

Ideally, we would like to be able to identify women at risk for postpartum depression before it occurs. This would not only allow us to increase monitoring when needed and to treat early if PPD emerges, but it may also provide an opportunity to initiate preventative interventions. Currently, our strongest predictors of risk include a history of depression prior to pregnancy and depressive symptoms during pregnancy. These models build on these robust risk factors and include other risk factors (e.g., age, BMI) to improve our ability to predict and quantify risk. 

Ruta Nonacs, MD PhD


Wakefield C, Frasch MG. Predicting Patients Requiring Treatment for Depression in the Postpartum Period Using Common Electronic Medical Record Data Available Antepartum. AJPM Focus. 2023 Apr 27;2(3):100100. 

Amit G, Girshovitz I, Marcus K, Zhang Y, Pathak J, Bar V, Akiva P. Estimation of postpartum depression risk from electronic health records using machine learning. BMC Pregnancy Childbirth. 2021 Sep 17;21(1):630.
