Personalised estimation of a woman's most fertile days.

Abstract
Objectives: We propose a new, personalised approach to estimating a woman’s most fertile days that only requires recording the first day of menses and can use a smartphone to convey this information to the user so that she can plan or prevent pregnancy.
Methods: We performed a retrospective analysis of two cohort studies (a North Carolina-based study and the Early Pregnancy Study [EPS]) and a prospective multicentre trial (World Health Organization [WHO] study). The North Carolina study consisted of 68 sexually active women with either an intrauterine device or tubal ligation. The EPS comprised 221 women who planned to become pregnant and had no known fertility problems. The WHO study consisted of 706 women from five geographically and culturally diverse settings. Bayesian statistical methods were used to design our proposed method, Dynamic Optimal Timing (DOT). Simulation studies were used to estimate the cumulative pregnancy risk.
Results: For the proposed method, simulation analyses indicated a 4.4% cumulative probability of pregnancy over 13 cycles with correct use. After a calibration window, this method flagged between 11 and 13 days per cycle when unprotected intercourse should be avoided. Eligible women should have cycle lengths between 20 and 40 days with a variability range less than or equal to 9 days.
Conclusions: DOT can easily be implemented by computer or smartphone applications, allowing women to make more informed decisions about their fertility. This approach is already incorporated into a patent-pending system and is available for free download on iPhone and Android devices.


Introduction
Surveys from numerous countries have shown that many women lack knowledge of when during their menstrual cycle they are most fertile. [1] Lacking this information, women who want to avoid pregnancy may have unprotected intercourse on their fertile days, potentially leading to unplanned pregnancy. [2,3] Women wanting to achieve pregnancy may be unsuccessful because of mistimed intercourse. Thus, knowing which days are most fertile is critical for these women.
Several fertility awareness methods exist to help women identify fertile days. These methods may require that users monitor fertility signs including: (1) cervical secretions (e.g., the Billings ovulation method [4] and Creighton method [5]); (2) basal body temperature [6]; or (3) a combination of these (e.g., the symptothermal method [7]). While some women may be willing to track these variables (via smartphone applications or standard charting) throughout their cycles and have access to the equipment required, e.g., a basal temperature thermometer, others may prefer a simpler approach, especially for ongoing pregnancy prevention. A number of smartphone applications are also currently available as 'period trackers', though some acknowledge that they are not designed for pregnancy prevention. [8,9] An exception to the above is the Standard Days Method® (SDM), a simple approach that is 95% effective for women with self-reported cycles lasting 26-32 days. [1] SDM users avoid unprotected intercourse during cycle days 8-19. [10] The SDM algorithm was derived from calculations of the variability of cycle length, timing of ovulation and fecundability in relation to ovulation. SDM accounts for these factors while maximising pregnancy protection and minimising the number of flagged days. Based on SDM, an existing application called CycleBeads requires the user to enter the first day of her period each cycle, and it provides information on fertile days and other pertinent topics.
But SDM and CycleBeads have certain limitations. Women eligible to use the method must have menstrual cycles that last between 26 and 32 days. Regardless of cycle variability within that range, 12 days each cycle (8-19) are considered fertile. [10] A method that allows shorter or longer cycles and adapts to an individual woman's cycle lengths and variability has the potential to increase access to and use of a fertility awareness method.
With advancements in technology, it is now possible to implement more sophisticated algorithms to develop such a method. Our proposed method, Dynamic Optimal Timing (DOT), uses modern Bayesian statistical methods and incorporates information from various fertility studies. Women only need to record their first day of menses, and DOT will flag the days with the highest estimated probabilities of pregnancy. As more data are collected, DOT updates these estimates. Below, we describe this new method and estimate its efficacy in preventing pregnancy in simulation studies.

Data
We used data from a study based in North Carolina, the Early Pregnancy Study (EPS) and a World Health Organization (WHO) study of the ovulation method of natural family planning. In the North Carolina study, 68 sexually active women (171 cycles) with either an intrauterine device or tubal ligation provided data for up to three menstrual cycles. [11,12] The EPS comprised 221 North Carolina women (696 cycles) who planned to become pregnant by discontinuing any use of birth control and had no known fertility problems. [12,13] Most of the participants in these studies were white, college-educated and in their late twenties or thirties. Every day, women from each study collected a first-morning urine sample and recorded menstrual bleeding and unprotected intercourse. These data did not contain any identifiable patient information.
The WHO study consisted of 706 women (8118 cycles). The purpose of this study was to determine the effectiveness of the ovulation method of natural family planning. The study followed women from Auckland, Bangalore, Dublin, Manila and San Miguel (New Zealand, India, Ireland, Philippines and El Salvador, respectively) for up to 18 cycles. [14,15] Women admitted to the study were between 20 and 39 years old, had at least one child, and had menstrual cycle lengths between 23 and 35 days. The women were taught to monitor their cervical mucus secretions. Recorded variables included cycle lengths, timing of peak days and pregnancy outcomes. [16,17] These data are publicly available and have existed for over 30 years.

Identifying the day of ovulation
In the North Carolina study and the EPS, the day of ovulation was estimated using serial changes in daily urinary hormones. [18] The day of ovulation was defined using an algorithm based on the change in the ratio of estrogen-to-progesterone metabolites around ovulation. [19,20] This algorithm has been validated as an accurate marker of ovulation which performs favourably relative to more direct measures based on luteinising hormone surge. [21,22] In the North Carolina study, diary data and urinary samples were missing for only 2% of days. In the EPS data, a clear day of ovulation was identifiable in 696 cycles from 213 women.
In the WHO study, the day of ovulation was approximated by the peak day. The peak day was defined as the last day on which slippery, raw egg-white-like mucus was recognised or the last day on which a wet or lubricated sensation was felt. [15]

Overview of the DOT method
We modelled the observed cycle lengths from all three datasets above using a Bayesian hierarchical model, and modelled the day of ovulation (predicted by cycle length) from the North Carolina study and EPS data using Bayesian linear regression. For a woman using DOT, we first incorporated her cycle length history into the cycle length model to estimate her next cycle length. We then used the regression parameters from the linear regression model to estimate her next day of ovulation. Using results from previous studies, we calculated the probability of conception given intercourse on each upcoming cycle day. Last, we flagged the highest risk days. These methods are described below in more detail. Supplementary Digital Content may be found online.

DOT step 1: estimating the next cycle length
We characterised the distribution of women's cycle length across the different datasets using a log t hierarchical model. This model uses a t distribution for the natural logarithm of the observed cycle lengths; the t distribution is similar to a normal distribution but includes an extra parameter deemed degrees of freedom (df). For small values of df, the t distribution allows for unexpectedly long or short cycle lengths; this flexibility is needed to fit the data. Each woman has her own specific mean and standard deviation to allow for the fact that average cycle length and variability in cycle lengths vary significantly between women. The model is 'hierarchical' because these woman-specific parameters are given a common population distribution, and the parameters of this distribution are estimated from the data. Related models have been used in previous analyses of cycle lengths. [23,24] An appealing aspect of the proposed hierarchical model is automatic updating (learning) of the woman-specific and population-level parameters as more cycle length data are added. Cycle length history is included in the model to obtain estimates of a woman's personal mean parameter, personal scale parameter and the overall mean, scale and df parameters. Using these parameters, we can estimate a woman's next cycle length.
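As a sketch, the model described above can be written as follows. The notation ($y_{ij}$, $\mu_i$, $\sigma_i$, $\nu$, $G$, $\theta$) is introduced here for illustration and is not taken from the original analysis; the specific form of the population distribution is not stated in this section, so it is left generic.

```latex
\begin{align*}
\log y_{ij} \mid \mu_i, \sigma_i, \nu &\sim t_{\nu}(\mu_i, \sigma_i), & j &= 1, \dots, n_i, \\
(\mu_i, \sigma_i) \mid \theta &\sim G(\theta), & i &= 1, \dots, N.
\end{align*}
```

Here $y_{ij}$ is the length in days of cycle $j$ for woman $i$, $\mu_i$ and $\sigma_i$ are her personal mean and scale parameters on the log scale, $\nu$ is the shared degrees-of-freedom parameter, and $G(\theta)$ stands for the common population distribution whose parameters $\theta$ are estimated from the data. Each newly recorded cycle length adds one more $y_{ij}$ term, which is how the woman-specific and population-level estimates are updated.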

DOT step 2: estimating the next day of ovulation
Using linear regression, we modelled the observed days of ovulation from the North Carolina and EPS data using the corresponding cycle length as a covariate. We only modelled the day of ovulation from the North Carolina and EPS data because ovulation was more accurately estimated in these data than in the WHO data. To estimate the day of ovulation from cycle length, a common approach is to subtract 13-14 days. However, one advantage of linear regression is that it allows a more flexible characterisation of the relationship between cycle length and the day of ovulation. Using estimates of our linear regression parameters and the results from our cycle length model, we then estimated the probability of ovulation on each day of the next cycle.
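The following Python sketch illustrates one way the two pieces could be combined; it is not the authors' R implementation. It assumes a discrete predictive distribution for the next cycle length and a normal regression residual, and it ignores posterior uncertainty in the regression parameters. The function names and all numerical values (the intercept, slope, residual standard deviation and cycle length probabilities) are hypothetical placeholders, not fitted estimates from the paper.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ovulation_day_probs(cycle_length_probs, intercept, slope, resid_sd, max_day=60):
    """Return {cycle day: P(ovulation on that day)}.

    cycle_length_probs: {cycle length in days: predictive probability}.
    The expected ovulation day for a given length is intercept + slope * length
    (hypothetical regression form); the residual is assumed normal.
    """
    probs = {day: 0.0 for day in range(1, max_day + 1)}
    for length, p_length in cycle_length_probs.items():
        mean_day = intercept + slope * length
        for day in probs:
            # Normal probability mass falling in [day - 0.5, day + 0.5)
            mass = (normal_cdf(day + 0.5, mean_day, resid_sd)
                    - normal_cdf(day - 0.5, mean_day, resid_sd))
            probs[day] += p_length * mass
    total = sum(probs.values())
    # Renormalise to absorb the small mass truncated outside 1..max_day
    return {day: p / total for day, p in probs.items()}


# Hypothetical inputs: a woman whose next cycle is most likely 28 days long,
# with slope 1 and intercept -13 mimicking the "subtract 13 days" rule.
example_probs = ovulation_day_probs(
    {27: 0.2, 28: 0.6, 29: 0.2}, intercept=-13.0, slope=1.0, resid_sd=2.5
)
most_likely_day = max(example_probs, key=example_probs.get)
```

With these placeholder inputs the most likely ovulation day is day 15, but the probability is spread over neighbouring days, which is the uncertainty the later steps carry forward.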

DOT step 3: calculating fertility probabilities
A previous study has found the probability of a clinical pregnancy with a single act of intercourse to be 0.04, 0.13, 0.08, 0.29, 0.27 and 0.08 for the six consecutive days ending with ovulation, [23] and <0.01 outside this interval. As many previous studies have done, we fixed the probability outside the six consecutive days ending with ovulation at zero. [23,24] To calculate a woman's probability of pregnancy given intercourse on each day of her next cycle, we multiplied each of these pregnancy probabilities by the corresponding estimated ovulation probability and summed the products. For example, the probability of pregnancy given intercourse on day 10 was estimated to be the probability of ovulation on day 10 multiplied by the probability of conception given intercourse on the day of ovulation (0.08), plus the probability of ovulation on day 11 multiplied by the probability of conception given intercourse on the day before ovulation (0.27), and so on up to ovulation on day 15 (0.04).
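The sum described above can be sketched in Python as follows. The conception probabilities are those quoted in the text; the function name and the example ovulation distribution are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Per-act probabilities of clinical pregnancy quoted in the text, keyed by
# day relative to ovulation (0 = day of ovulation, -1 = the day before, ...).
CONCEPTION_BY_OFFSET = {-5: 0.04, -4: 0.13, -3: 0.08, -2: 0.29, -1: 0.27, 0: 0.08}


def pregnancy_prob_by_day(ovulation_probs):
    """Return {cycle day: P(pregnancy | intercourse on that day)}.

    ovulation_probs: {cycle day: P(ovulation on that day)}.
    Days outside the six-day window ending with ovulation contribute zero.
    """
    last_day = max(ovulation_probs)
    result = {}
    for day in range(1, last_day + 1):
        total = 0.0
        for offset, p_conception in CONCEPTION_BY_OFFSET.items():
            ovulation_day = day - offset  # ovulation day placing `day` at this offset
            total += ovulation_probs.get(ovulation_day, 0.0) * p_conception
        result[day] = total
    return result


# Hypothetical ovulation distribution concentrated on days 14-16.
example = pregnancy_prob_by_day({14: 0.25, 15: 0.50, 16: 0.25})
```

For this hypothetical distribution, intercourse on day 15 gives 0.50 × 0.08 + 0.25 × 0.27 = 0.1075, and day 8 gives zero because it lies more than five days before the earliest possible ovulation day.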

DOT step 4: flagging high-risk days
We looked at various methods of flagging 'high-risk' days. Our goal was to obtain an estimated pregnancy rate comparable to that of SDM (4.75% after 13 cycles), while minimising the number of flagged days when a woman should avoid unprotected intercourse. The methods used to estimate the pregnancy rate are described in the next section.
In one such flagging method, an eligible woman's past 12 cycle lengths had to: (1) be between 20 and 40 days and (2) have a range less than or equal to 9 days (or 7 days after excluding the most extreme cycle length). For eligible women, we flagged days with a risk above a threshold of 1% when no cycles had been observed, and increased that threshold in a roughly linear manner up to 2.25% after six or more cycle lengths had been observed. For ineligible women, we flagged days with a risk above 1.00% for all cycles. Alternative flagging methods are considered in the discussion.
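A minimal Python sketch of this variable threshold is given below. The text describes the increase only as 'roughly linear', so the exact linear interpolation here is an assumption, as are the function names and the example risks.

```python
def flagging_threshold(n_cycles_observed, start=0.01, end=0.0225, ramp_cycles=6):
    """Risk threshold for an eligible woman.

    Assumes an exactly linear ramp from `start` (no cycles observed) to `end`
    (six or more cycles observed); the source says only "roughly linear".
    """
    if n_cycles_observed < 0:
        raise ValueError("n_cycles_observed must be non-negative")
    fraction = min(n_cycles_observed, ramp_cycles) / ramp_cycles
    return start + (end - start) * fraction


def flag_days(pregnancy_probs, n_cycles_observed, eligible=True):
    """Return the sorted cycle days whose estimated risk exceeds the threshold.

    Ineligible women keep the fixed 1% threshold for all cycles.
    """
    threshold = flagging_threshold(n_cycles_observed) if eligible else 0.01
    return [day for day, p in sorted(pregnancy_probs.items()) if p > threshold]


# Hypothetical per-day risks for part of one cycle.
risks = {9: 0.012, 10: 0.020, 11: 0.050, 12: 0.015, 13: 0.005}
flag_days(risks, n_cycles_observed=0)  # -> [9, 10, 11, 12]
flag_days(risks, n_cycles_observed=6)  # -> [11]
```

The example shows the intended effect: with no history the method is conservative and flags more days, and the flagged set shrinks as cycle lengths accumulate.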

Estimating theoretical efficacy
We randomly selected the data of 506 women in the WHO study to act as training data and used the remaining data of 200 women as test data. Women from the North Carolina study and EPS were excluded because they were followed for fewer cycles. The training data were used to calibrate our cycle length models, and the test data were used to estimate efficacy.
For each cycle, we compared our flagged days with the observed peak days. A previous study found that ovulation occurs within 3 days of the peak day, that 97% of ovulation occurs within 2 days of the peak day, and that 38% of ovulation occurs on the peak day. [25] Following these results, we set the probability of ovulation to be 0.015, 0.1475, 0.1475, 0.380, 0.1475, 0.1475 and 0.015 for the 3 days before to the 3 days after the observed peak day, and 0 for all other days. We also accounted for the fact that women do not engage in intercourse on every day of their cycle. One study of North Carolina women found the probability of engaging in a random act of intercourse to be 0.27, 0.34, 0.32, 0.32, 0.34, 0.33, 0.37 and 0.32 from 6 days before ovulation to the day after ovulation, and 0.25 for all other days. [22] We combined these probabilities with the observed peak days to obtain an unadjusted pregnancy risk for each day.
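The section does not spell out how these probabilities were combined, so the Python sketch below shows one plausible reading rather than the authors' calculation: for each candidate ovulation day around the observed peak day, multiply the probability of ovulation, the probability of intercourse and the per-act conception probability, then sum. The three lookup tables use the values quoted in the text; the function name and the way the terms are combined are assumptions.

```python
# P(ovulation on peak day + k), for k = -3..3, as set in the text.
OVULATION_BY_PEAK_OFFSET = {-3: 0.015, -2: 0.1475, -1: 0.1475, 0: 0.380,
                            1: 0.1475, 2: 0.1475, 3: 0.015}
# P(intercourse) by day relative to ovulation; 0.25 on all other days.
INTERCOURSE_BY_OFFSET = {-6: 0.27, -5: 0.34, -4: 0.32, -3: 0.32,
                         -2: 0.34, -1: 0.33, 0: 0.37, 1: 0.32}
INTERCOURSE_OTHER = 0.25
# Per-act conception probability by day relative to ovulation; zero elsewhere.
CONCEPTION_BY_OFFSET = {-5: 0.04, -4: 0.13, -3: 0.08, -2: 0.29, -1: 0.27, 0: 0.08}


def unadjusted_daily_risk(day, peak_day):
    """Unadjusted pregnancy risk on `day` of a cycle with the given peak day.

    Assumed combination: sum over candidate ovulation days of
    P(ovulation) * P(intercourse) * P(conception | intercourse).
    """
    risk = 0.0
    for k, p_ovulation in OVULATION_BY_PEAK_OFFSET.items():
        relative_day = day - (peak_day + k)  # day relative to candidate ovulation
        p_conception = CONCEPTION_BY_OFFSET.get(relative_day, 0.0)
        p_intercourse = INTERCOURSE_BY_OFFSET.get(relative_day, INTERCOURSE_OTHER)
        risk += p_ovulation * p_intercourse * p_conception
    return risk


# Risk profile for a hypothetical cycle whose observed peak day is day 14.
profile = {day: unadjusted_daily_risk(day, peak_day=14) for day in range(1, 29)}
```

Under this reading, nonzero risk for a peak day of 14 spans days 6 to 17: from five days before the earliest candidate ovulation day (day 11) to the latest candidate ovulation day itself.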
Following Schwartz et al., [26] we then calibrated a 'cycle viability' parameter so that applying the SDM to our test data women with cycle lengths between 26 and 32 days led to a 4.75% cumulative pregnancy rate. [10] We assumed that women avoiding pregnancy would not engage in unprotected intercourse on flagged days, and that women desiring pregnancy would only engage in one act of intercourse on each flagged day. All analyses were conducted using R software (www.r-project.org).
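The calibration step can be illustrated with the Python sketch below (the analyses themselves were run in R). It assumes that the viability parameter acts as a multiplicative scale between 0 and 1 on each cycle's unadjusted risk and finds it by bisection; the exact parametrisation used in the analysis is not given in this section. The function names and the example risks are hypothetical.

```python
def cumulative_risk(per_cycle_risks):
    """Probability of at least one pregnancy across the given cycles."""
    no_pregnancy = 1.0
    for p in per_cycle_risks:
        no_pregnancy *= 1.0 - p
    return 1.0 - no_pregnancy


def calibrate_viability(unadjusted_cycle_risks, target=0.0475, tol=1e-10):
    """Find the scale a in [0, 1] such that scaling every unadjusted per-cycle
    risk by a yields the target cumulative pregnancy rate (assumed form)."""
    if cumulative_risk(unadjusted_cycle_risks) < target:
        raise ValueError("target is not reachable with a scale of at most 1")
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = (low + high) / 2.0
        scaled = [mid * p for p in unadjusted_cycle_risks]
        # Cumulative risk increases with the scale, so bisection applies.
        if cumulative_risk(scaled) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical unadjusted risk of 1% in each of 13 cycles under SDM.
unadjusted = [0.01] * 13
viability = calibrate_viability(unadjusted)
calibrated_rate = cumulative_risk([viability * p for p in unadjusted])
```

Once calibrated against SDM's 4.75% rate, the same scale would be applied to the risks obtained under any other flagging rule, so that the methods are compared on a common footing.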

Results
For a woman whose past six cycle lengths were each 28 days, we found that the next day of ovulation was most likely to occur on day 15 (13.6%), which would be expected if we had fixed the luteal phase to be 13 days long. [27] Figure 1 presents the probability of conception after intercourse on a given cycle day. We see that this woman is most likely to become pregnant if she has intercourse on day 13 (10.9%). There is also a greater than 1.0% chance of pregnancy anywhere from day 9 to day 20. Table 1 illustrates how DOT is able to adjust the flagged days as more data are collected. For a woman with a greater than normal average cycle length of 35 days, as more information is gathered, the flagged days occur later in the cycle, and the number of flagged days decreases from 16 to 12.
To compare the accuracy of DOT with that of SDM, we changed the flagging criteria to flag the 12 highest risk days. Figure 2 shows that even in our test data for women with cycle lengths between 26 and 32 days, DOT produced a pregnancy risk less than or equal to that of SDM from cycle 1 to 13. The cumulative pregnancy rate using our method was 3.95% in these women (compared with 4.75% using SDM). This improvement is a result of DOT's more personalised estimates. The example flagging method described above involves a variable threshold increasing from 1% to 2.25%. This was motivated by the observation that a fixed threshold results in highly elevated pregnancy risk when few cycles have been observed (Figure 3). This elevated risk is even more pronounced for particular groups of women, such as those having very long or highly variable cycles (data not shown). Using the variable threshold, the cumulative pregnancy rate was 4.4% in eligible women (Figure 3). After a calibration period in which at least five cycles were observed, between 11 and 13 days are flagged per cycle. Approximately 80% of the women in the WHO data were eligible.

Findings and interpretation
DOT can be made available through various software platforms and is currently available as a smartphone application. For women who do not want to track symptoms such as the quality of cervical secretions and basal body temperature, DOT provides a simple alternative for estimating fertile days. Users only need to record their first day of menses, and their predicted most fertile days for the next cycle will then be displayed. All calculations are automatically performed by DOT, so this easy usability may help decrease errors and increase the rate of perfect use.
Unlike other calendar-based methods such as SDM, DOT is also effective for women with shorter or longer than average cycle lengths. Initial results suggest a cumulative 4.4% pregnancy rate over 13 cycles with perfect use for women whose cycle lengths are between 20 and 40 days and have a range less than or equal to 9 days. This is in contrast to SDM, which is only effective for women with cycle lengths between 26 and 32 days.
DOT also better informs users about the uncertainty in estimating their fertile days when only cycle lengths are recorded. Some existing smartphone applications will only flag 6 days per cycle as the 'fertile window', [8,9] even if users are only recording their first day of menses. However, this is potentially misleading because of natural variability in the length of the follicular phase and timing of ovulation. [27] DOT accounts for the small probability that ovulation might occur earlier or later than expected and incorporates this into its estimates. As illustrated in Figure 1, DOT can let a user know when her conception risk is 1% or greater; in this example, as early as day 7 or as late as day 20.

Strengths and weaknesses of the study
As mentioned above, there are several advantages to DOT. DOT is applicable to a larger group of women compared with other calendar-based methods, and DOT more rigorously calculates the probability of conception on each cycle day. Our study has a number of strengths: one is the quality of the data we used to design and evaluate DOT. The North Carolina study and EPS are rich datasets that have been extensively analysed in the past, and the WHO data comprise a larger dataset that includes information about the menstrual cycles of many women across the world. We also used sophisticated but appropriate statistical models for a woman's cycle length and day of ovulation. These factors allow us to better understand when a woman's next period will arrive and predict her most fertile days.
An important limitation of DOT is that it is most effective for women with regular cycles. DOT may not be appropriate for women with irregular or missed periods. This may include some athletes, patients with certain medical conditions (e.g., polycystic ovary syndrome, thyroid disorders, diabetes), adolescents just starting menses and women near the menopause. DOT does not account for the effects of acute or chronic stress on a woman's hormonal balance and period regularity, and DOT assumes that all bleeding is menstrual bleeding (it does not account for mid-cycle or anovulatory bleeding). Although the three datasets used estimates of the day of ovulation, the uncertainty in estimating ovulation day was properly taken into account in the analysis. Minor gains in prediction accuracy may be possible if the ovulation day could be measured perfectly, but even direct measures based on the luteinising hormone surge have errors.
Figure 2. A plot comparing the risk of pregnancy in our test data women with observed cycle lengths between 26 and 32 days using SDM versus a modified version of DOT. In this example, DOT was calibrated to flag the 12 highest risk days. Women using DOT were at a lower or equal risk of pregnancy compared with SDM for all cycles.
Figure 3. A variable flagging threshold helps limit increased pregnancy risk at the beginning of method use (cycle numbers 1-6). Error bars indicate the standard error in risk for the test data women. The cumulative pregnancy risk is over the 13 cycles and scaled to be equivalent to the per-cycle risks.
Another limitation of DOT is that most of the women in the North Carolina study and EPS were white, well educated, between the ages of 25 and 35, in a stable sexual relationship and from the same geographical area. The relationships observed between cycle lengths and the day of ovulation in these data may not be generalisable to women of different backgrounds, and it is unclear how this may affect our estimates. The women in these studies also were not actively avoiding pregnancy. However, besides these demographic differences, there are no obvious biological differences between the North Carolina and EPS women and those who may be interested in using DOT that may limit the application of our estimates.
There are also limitations to our study, such as in how we assessed efficacy for women avoiding pregnancy. We assumed there would be no unprotected intercourse on flagged days for pregnancy avoiders, but couples are unlikely always to abide by this. This would lead to an overestimate of efficacy. The intercourse behaviours from the North Carolina study also may not be representative of the women who want to use DOT, and it is unclear how this would affect efficacy. However, the North Carolina couples were mostly young and healthy and tended to have relatively high intercourse frequencies compared with many other groups, which would lead to an underestimate of efficacy. A prospective multicentre study is needed to estimate more accurately the cumulative pregnancy rate using DOT. Such a trial is currently underway.

Relevance of the findings: implications for clinicians and policy-makers
There are many existing smartphone applications that help women track their cycles. However, unlike DOT, none of these applications is specifically designed to help prevent pregnancies using just the first day of menses. DOT only requires users to record their first day of menses and have access to a smartphone. DOT accounts for natural variability in predicting cycle length and day of ovulation, it allows users to see their probability of conception on each cycle day, and it minimises both the number of flagged days and the cumulative risk of pregnancy.

Unanswered questions and future research
A study to determine both the efficacy of DOT (the proportionate reduction in the probability of pregnancy during perfect use compared with use of no method) and its effectiveness (reduction in probability during typical use, including both perfect and incorrect use) is being conducted by the Institute for Reproductive Health. This longitudinal, prospective study is designed according to the guidelines developed by Trussell and Kost [28] for studies of contraceptive methods. It will also explore the ability of women to understand and use the application.
In the future, we may also consider enabling user interaction so that each woman can set her own unique pregnancy risk versus flagged day preferences. For example, instead of flagging all days with a pregnancy risk above a certain threshold, we could create a loss function with weights on the estimated pregnancy risks and weights on the number of flagged days per cycle. Depending on a woman's personal preferences, we could appropriately adjust these weights. This may help to further reduce the number of flagged days and improve adherence.
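As a rough illustration of this idea, the Python sketch below assumes the simplest possible loss: a weighted sum of the pregnancy risk left on unflagged days plus a weighted count of flagged days. The loss form, the function names, the weights and the example risks are all hypothetical; no such loss function is specified in this work.

```python
def preference_loss(pregnancy_probs, flagged, risk_weight, day_weight):
    """Assumed additive loss: weighted residual risk plus weighted flagged days."""
    residual_risk = sum(p for day, p in pregnancy_probs.items() if day not in flagged)
    return risk_weight * residual_risk + day_weight * len(flagged)


def choose_flagged_days(pregnancy_probs, risk_weight, day_weight):
    """Return the set of days minimising the additive loss.

    Because the loss separates by day, flagging a day reduces the loss exactly
    when risk_weight * p exceeds day_weight.
    """
    return {day for day, p in pregnancy_probs.items() if risk_weight * p > day_weight}


# Hypothetical per-day risks and two preference settings.
risks = {9: 0.012, 10: 0.020, 11: 0.050, 12: 0.015, 13: 0.005}
cautious = choose_flagged_days(risks, risk_weight=100.0, day_weight=1.0)
relaxed = choose_flagged_days(risks, risk_weight=100.0, day_weight=3.0)
```

Under this additive assumption the rule reduces to a woman-specific threshold of day_weight / risk_weight, so a user who weights flagged days more heavily (the 'relaxed' setting) sees fewer flagged days in exchange for a higher residual risk. A loss that is not additive across days would not reduce to a threshold in this way.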

Conclusion
DOT is a simple way for women to track the progression of their menstrual cycle. It is currently available for free download as an application on Apple and Android devices. Because of improvements in computational technology, we are able to implement more sophisticated statistical methods without sacrificing usability to identify a woman's risk of pregnancy. This provides women with information that can be effectively used to prevent or plan pregnancy.

the United States Agency for International Development (Washington, DC, USA) for their support in this work.

Disclosure statement
Cycle Technologies provided funding for analyses and helped with the design of the study and preparation of the report.