Prediction of psychosis using neural oscillations and machine learning in neuroleptic-naïve at-risk patients

Objectives: This study investigates whether abnormal neural oscillations, which have been shown to precede the onset of frank psychosis, could be used towards the individualised prediction of psychosis in clinical high-risk patients. Methods: We assessed the individualised prediction of psychosis by detecting specific patterns of beta and gamma oscillations using machine-learning algorithms. Prediction models were trained and tested on 53 neuroleptic-naïve patients at clinical high risk for psychosis. Of these, 18 later transitioned to psychosis. All patients were followed up for at least 3 years. For an honest estimation of the generalisation capacity, the predictive performance of the models was assessed in unseen test cases using repeated nested cross-validation. Results: Transition to psychosis could be predicted from current-source density (CSD; area under the curve [AUC] = 0.77), but not from lagged phase synchronicity data (LPS; AUC = 0.56). Combining both modalities did not improve the predictive accuracy (AUC = 0.78). The left superior temporal gyrus, the left inferior parietal lobule and the precuneus most strongly contributed to the prediction of psychosis. Conclusions: Our results suggest that CSD measurements extracted from clinical resting-state EEG can help to improve the prediction of psychosis on a single-subject level.


Introduction
Schizophrenic psychoses are increasingly acknowledged as neurodevelopmental disorders whose signs and symptoms can sometimes be observed as early as childhood (Insel 2010). The delay between the diagnosis and the treatment of these disorders ranges from 1 to 3 years (Riecher-Rössler et al. 2006) and can have severe negative ramifications such as a worse functional outcome (Insel 2010), loss of grey-matter volume (Fusar-Poli et al. 2011a), greater cognitive deterioration (Amminger et al. 2002), a higher required dosage of neuroleptics (McGorry et al. 1996) and higher overall treatment costs (Moscarelli 1994). By contrast, therapeutic actions in the earliest phases of the disease could considerably improve the prognosis of these individuals (Stafford et al. 2013).
In the last two decades, there has been increased interest in the early detection of psychosis, and reliable criteria have been established internationally to detect an at-risk mental state (ARMS) for psychosis. As only about one-third of ARMS patients eventually develop psychosis (Fusar-Poli et al. 2012), and about one-third remit from their risk state (Simon et al. 2013), further risk stratification is required to identify subgroups with specific needs and response patterns that could improve the cost-benefit ratio of preventive interventions. Although individualised prediction models for psychosis based on structural magnetic resonance imaging (MRI) have achieved promising predictive accuracies (Koutsouleris et al. 2014), it has not been investigated whether the same could be obtained using more cost-efficient measures, such as clinical resting-state EEG.
Although both increases and decreases in gamma activity have been noted in patients with schizophrenia (Herrmann and Demiralp 2005), heightened activity has consistently been reported in unmedicated patients suffering from positive symptoms, while the opposite has largely been found in those suffering from negative symptoms (Herrmann and Demiralp 2005; Lee et al. 2003). While the gamma band tends to be associated with the precise timing of neural interactions in localised small networks (Uhlhaas and Singer 2013), the beta band is known to be predominantly involved in modulating neural communication, albeit with reduced precision, amongst broadly distributed cortical networks (Kopell et al. 2000). For instance, it has been shown that beta oscillations mediate interactions of distributed functional networks involved in multimodal sensory processing and sensory-motor coordination during normal brain functioning (Uhlhaas and Singer 2013). Interestingly, all these attentional processes are deeply perturbed in patients with schizophrenia (Morris et al. 2012), suggesting disturbed long-range neural communication.
Given this accumulated evidence on the associations between abnormal high-frequency oscillations and schizophrenic psychoses, high-frequency oscillations may also be altered in ARMS patients with later transition to full-blown psychosis (ARMS-T) and could therefore serve as neurophysiological biomarkers for predicting psychosis. In fact, in a recent study Ramyead et al. (2014) demonstrated, using a univariate approach, that ARMS-T patients are characterised by atypical gamma current-source density (CSD) in the medial prefrontal cortex along with abnormal lagged phase synchronisation (LPS) of beta1 oscillations across brain areas. However, they did not investigate whether these group differences could be exploited to make accurate predictions on a single-subject level, thereby potentially contributing to informed clinical decision-making. Thus, the aim of this study was to investigate whether an accurate prediction of psychosis can be achieved by CSD, LPS or both measures at the same time in order to detect potential signatures associated with later transition to psychosis. To this end, we applied state-of-the-art machine-learning algorithms to source-localised clinical EEG measures.
In the past few years, there has been increased interest in applying multivariate pattern-recognition algorithms in various fields, ranging from genomics (Liu et al. 2013) to cybersecurity (Dua and Du 2011), with substantial success. Although these techniques have been successfully applied to neuroanatomical (Koutsouleris et al. 2014) and neuropsychological data (Koutsouleris et al. 2012) in order to predict psychosis in ARMS patients, no study has applied them to clinical resting-state EEG data. This is surprising, as in many early detection centres for psychosis resting-state EEG is routinely used in ARMS patients to screen for signs of organic brain disorders, such as limbic encephalitis or epilepsy, at relatively low cost. Moreover, neuronal oscillations have been strongly associated with the pathophysiology of schizophrenic psychoses (Uhlhaas and Singer 2013). Furthermore, previous studies using clinical EEG for the prediction of psychosis are limited as they included at-risk patients already medicated with antipsychotics (up to about 40% in one instance; van Tricht et al. 2014). These medications have been shown to change neural oscillations (Centorrino et al. 2002) and are likely to alter the natural trajectory of psychosis, therefore potentially yielding misleading biomarkers.
To overcome these weaknesses, we set out to employ a machine-learning algorithm, the least absolute shrinkage and selection operator (LASSO), that detects multivariate patterns of high-frequency oscillations across different brain regions, on the same sample of patients as in our previous study (Ramyead et al. 2014). However, compared to the previous sample, we excluded the patients who were already medicated with neuroleptics for the reasons mentioned earlier. To make use of three-dimensional CSD and LPS of high-frequency (beta1, beta2, gamma) oscillations as input variables, we applied the inverse solution technique exact low-resolution electromagnetic tomography (eLORETA), which allows for a reliable source localisation of brain activity at individual frequencies. Finally, we conducted our analysis on a group of ARMS patients who were followed up over the course of at least three years to determine whether they later made a transition to psychosis (ARMS-T) or not (ARMS-NT). We hypothesised that, based on their specific pattern of CSD and/or LPS in the high frequencies at 19 cortical regions of interest (ROIs), ARMS-T individuals could be separated from ARMS-NT individuals with good accuracy.

Setting and recruitment
The EEG data analysed in this study were collected as part of the Basel Früherkennung von Psychosen (FePsy) project, a prospective multilevel study aiming to improve the early detection of psychosis (Riecher-Rössler et al. 2007; Riecher-Rössler et al. 2009). The study was approved by the ethics committee of the University of Basel. All participants provided written informed consent. Patients recruited for this study were help-seeking consecutive referrals to the FePsy Clinic at the University Psychiatric Clinics Basel, which was specifically set up to identify, assess, and treat individuals in the early stages of psychosis. Most participants were referred to the early detection clinic via the University Psychiatric Outpatient Department of Basel or a psychiatrist in private practice. Some individuals were also referred from other physicians, including general practitioners, or came on their own.

Screening procedure
The Basel Screening Instrument for Psychosis (BSIP; Riecher-Rössler et al. 2008) was applied to identify ARMS individuals. The BSIP is largely based on the PACE inclusion/exclusion criteria (Yung et al. 1998) and has been shown to have a high predictive validity and a good interrater reliability (Riecher-Rössler et al. 2008). Exclusion criteria for patients were age <18 years, insufficient knowledge of German, IQ <70, previous episode of schizophrenic psychosis treated with antipsychotics, psychosis clearly due to organic reasons or substance abuse, or psychotic symptoms within a clearly diagnosed depression or borderline personality disorder. For this study, we included all ARMS patients who were recruited for the FePsy study between March 2000 and August 2012 and had a clinical EEG session of at least 15 min at baseline assessment. They were followed up at regular intervals in order to distinguish those who later transitioned to frank psychosis (ARMS-T) from those who did not (ARMS-NT). During the first year of the follow-up, ARMS individuals were assessed for transition to psychosis monthly, during the second and third years they were assessed every 3 months, and thereafter annually using the transition criteria of Yung et al. (1998). In this study, individuals were only classified as ARMS-NT if they had a follow-up duration of at least 3 years and did not develop frank psychosis.

Assessment of positive and negative psychotic symptoms
The Brief Psychiatric Rating Scale Expanded (BPRS-E; Lukoff et al. 1986; Ventura et al. 1993) was used to assess positive and negative psychotic symptoms. The positive psychotic symptom scale was based on the four items hallucinations, suspiciousness, unusual thought content, and conceptual disorganisation, and the negative psychotic symptom scale was based on the items blunted affect, psychomotor retardation and emotional withdrawal, as defined by Velligan et al. (2005).

EEG recordings
EEG data were recorded at the University Hospital of Basel. Patients sat in a quiet room in an eyes-closed resting-state condition for about 20 min. Every 3 min, subjects were asked to open their eyes for a period of 5-6 s. At any signs of behavioural and/or EEG drowsiness, the patients were verbally asked to open their eyes. EEG data were sampled at a rate of 250 Hz by 19 gold cup electrodes (Nicolet Biomedical Inc., Madison, Wisconsin) referenced to linked ears. Electrode impedances were kept below 5 kΩ.

Artefact rejection
EEG pre-processing was performed using Brain Vision Analyzer® 2.0 software (Brain Products GmbH, Munich, Germany). We processed each EEG in two parallel branches, one high-pass filtered at 0.5 Hz and one at 1 Hz. We did so in order to apply the Independent Component Analysis (ICA) matrix from the most stable signal (1 Hz) to the one that conserved the most signal (0.5 Hz). Both branches were handled in the same way up to the step that involved re-referencing to the common average. As a first step, artefact rejection was performed manually, based on visual inspection, to remove epochs containing extreme ocular artefacts, muscle and/or cardiac contamination and bad signals due to random movements. Biased extended Infomax ICA was then performed for the removal of residual eye movements, eye blinks, muscle activity and non-biological components contaminated with high gamma frequencies of 50 Hz and above, as measured by Fast Fourier Transform (FFT) of the ICA components (resolution of 1 Hz, power in µV², Hanning window length of 10%). After applying the ICA-corrected matrix of the data filtered at 1 Hz to the data filtered at 0.5 Hz, we re-referenced the data to the common average. Finally, another manual rejection based on visual inspection was performed to exclude remaining artefacts as mentioned earlier.

CSD analyses
EEGs were transformed into reference-free CSD estimates using the Laplacian Weighted Minimum Norm algorithm (Pascual-Marqui 2007). Compared to conventional EEGs based on voltage, accumulated evidence has indicated that the use of CSD as a measure of brain activity allows reliable spatial analysis (Michel et al. 2004) by disentangling the EEG signals from various biological and non-biological artefacts, thus yielding measures that more closely represent the neuronal current generators (Tenke et al. 2011).
The electrode montage in the present study has been shown to provide acceptable EEG spatial sampling for the estimation of cortical sources of eyes-closed resting-state EEG rhythms, as these oscillatory rhythms are widely distributed across the human cerebral cortex when compared to the demarcated functional topography of event-related EEG changes. Consequently, the oscillatory rhythms acquired during eyes-closed resting-state EEG can properly be sampled with a relatively low number of electrodes, as opposed to the higher-density electrode montage required for observing the functional topography of stimulus-related EEG activity (Babiloni et al. 2013). This relatively low spatial sampling of EEG oscillatory rhythms is robust, as LORETA solutions are intrinsically maximally smoothed in source space owing to the regularisation procedure (Pascual-Marqui et al. 1994; Babiloni et al. 2013).
To compute the intracortical CSD of neural oscillations, we used eLORETA (Pascual-Marqui et al. 2011) on EEG data segmented into 2-s epochs (671 epochs on average; groups did not significantly differ in the number of segments). eLORETA is a neurophysiological imaging technique based on weighted minimum norm inverse solution procedures, allowing for 3D modelling of the EEG CSD with exact localisation performance, albeit with a high correlation between neural sources that are in close proximity. Numerous studies based on neuroimaging tools, such as functional (Mulert et al. 2004) and structural MRI (Worrell et al. 2000), positron emission tomography (PET; Zumsteg et al. 2005) and intracranial EEG recordings (Zumsteg et al. 2006), have validated LORETA as an efficient and reliable tool to study brain activity. Compared to the first version of LORETA (Pascual-Marqui et al. 1994), the most recent version, eLORETA, has no localisation bias in the presence of structured noise (Pascual-Marqui 2007).
In eLORETA, a three-shell spherical head model (brain, scalp and skull compartments) is assumed and the solution space is restricted to the cortical grey matter and the hippocampus. In total, the solution space comprises 6239 voxels of 5 × 5 × 5 mm each. The head model for computing the lead field is based on the Montreal Neurological Institute (MNI) brain MRI average (Pascual-Marqui et al. 1994). CSD was calculated for the high-frequency bands: beta1 (13-21 Hz), beta2 (21-30 Hz) and gamma (30-50 Hz).
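The epoching and frequency bands described in this section can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`segment_into_epochs`, `band_of`) are hypothetical, the band edges are treated as lower-inclusive and upper-exclusive (the text does not say how shared edges such as 21 Hz were assigned), and artefact rejection is ignored.

```python
# Values taken from the text: 250 Hz sampling rate, 2-s epochs, three bands.
SAMPLING_RATE_HZ = 250
EPOCH_SECONDS = 2
BANDS_HZ = {"beta1": (13, 21), "beta2": (21, 30), "gamma": (30, 50)}


def segment_into_epochs(samples, sampling_rate_hz=SAMPLING_RATE_HZ,
                        epoch_seconds=EPOCH_SECONDS):
    """Cut a single-channel recording into consecutive, non-overlapping epochs.

    Trailing samples that do not fill a complete epoch are dropped.
    """
    epoch_len = sampling_rate_hz * epoch_seconds  # 500 samples per epoch
    n_epochs = len(samples) // epoch_len
    return [samples[i * epoch_len:(i + 1) * epoch_len] for i in range(n_epochs)]


def band_of(frequency_hz):
    """Return the band containing a frequency, or None if it is outside all bands.

    Assumed convention: lower edge inclusive, upper edge exclusive.
    """
    for name, (low, high) in BANDS_HZ.items():
        if low <= frequency_hz < high:
            return name
    return None


# Toy recording of 5 s (1250 samples): yields two full 2-s epochs.
toy_signal = [0.0] * 1250
epochs = segment_into_epochs(toy_signal)
```

With 2-s epochs, the Fourier transform of each epoch has a frequency resolution of 0.5 Hz, so each band spans several frequency bins.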

LPS analyses
For spatially unbiased LPS analysis we defined ROIs based on the MNI coordinates of the cortical voxel underlying the 19 electrode sites (ROI coordinates are given in Supplemental Table S1, available online). We used a single voxel for each ROI because eLORETA's spatial resolution is relatively low, and expanding the ROI to neighbouring voxels could potentially bias the analysis due to the high correlation among them (Canuet et al. 2012). Next, we computed the LPS between all 19 ROIs, resulting in a relatively high number (i.e., 171) of pairwise combinations. LPS quantifies the non-linear relationship between two ROIs after the instantaneous zero-lag contribution has been removed. Removing this instantaneous zero-lag contribution has been shown to eliminate non-physiological artefacts, such as volume conduction, which bias relationship measurements such as instantaneous connectivity. The Euclidean distance between ROI1 $(x_1, y_1, z_1)$ and ROI2 $(x_2, y_2, z_2)$ was calculated using the Pythagorean theorem, $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$, and the distances were subsequently standardised into z-scores.
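As a minimal sketch of the distance computation described above, the following uses only the Python standard library. The coordinates are hypothetical placeholders (the real ones are in the supplement), and the use of the sample standard deviation for the z-scores is an assumption, since the text does not specify it.

```python
import itertools
import math
import statistics


def pairwise_distances(rois):
    """Return {(i, j): Euclidean distance} for every unordered pair of ROIs."""
    return {
        (i, j): math.dist(rois[i], rois[j])
        for i, j in itertools.combinations(range(len(rois)), 2)
    }


def z_scores(values):
    """Standardise values to zero mean and unit sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - mean) / sd for v in values]


# Hypothetical MNI coordinates in mm, for illustration only.
example_rois = [(-25, 65, 0), (25, 65, 0), (-50, 40, 0), (0, 30, 45)]

distances = pairwise_distances(example_rois)
standardised = z_scores(list(distances.values()))

# 19 ROIs give 19 * 18 / 2 = 171 unordered pairs, as stated in the text.
n_pairs_19 = math.comb(19, 2)
```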
LPS computes the corrected phase synchrony value between signals in the frequency domain based on normalised Fourier transforms. It is therefore a measure of nonlinear functional connectivity. To reduce volume conduction and related artefacts, the instantaneous zero-lag contribution is excluded from the total phase synchronisation, yielding only lagged synchronisation. The classical total "squared" phase synchronisation, which is highly contaminated by the instantaneous artefactual component, is defined as:
$$\varphi^2_{x,y}(\omega) = |f_{x,y}(\omega)|^2 = \{\mathrm{Re}[f_{x,y}(\omega)]\}^2 + \{\mathrm{Im}[f_{x,y}(\omega)]\}^2$$
with:
$$f_{x,y}(\omega) = \frac{1}{N_R} \sum_{k=1}^{N_R} \frac{x_k(\omega)}{|x_k(\omega)|} \, \frac{y_k^*(\omega)}{|y_k(\omega)|}$$
where $x_k(\omega)$ and $y_k(\omega)$ correspond to the discrete Fourier transforms of the two signals of interest x and y at frequency $\omega$ for the kth EEG epoch; $N_R$ is the number of epochs; Re[C] and Im[C] denote the real and imaginary parts of a complex number C; the latter explains the cycle of C; and the superscript "*" denotes a complex conjugate. The instantaneous (zero-lag) connectivity component is closely related to the real part of the phase synchronisation. LPS, which statistically partials out the instantaneous component of the total connectivity, is defined as:
$$\varphi^2_{x \rightleftharpoons y}(\omega) = \frac{\{\mathrm{Im}[f_{x,y}(\omega)]\}^2}{1 - \{\mathrm{Re}[f_{x,y}(\omega)]\}^2}$$
In order to calculate the slope between LPS and distances, a linear model was created for all 171 pairs, which included the LPS values as the dependent variable and the distance between each pair of the 19 ROIs as the independent variable. Therefore, for each individual, three LPS values (beta1, beta2 and gamma) were extracted. These values correspond to the slope of the linear model, which summarises the relationship between LPS and increasing distance between the 19 ROIs. These LPS values were then standardised before feeding them into the LASSO.
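The following sketch illustrates the two steps described in this paragraph. It assumes the standard lagged phase synchronisation definition of Pascual-Marqui (2007), in which the squared imaginary part of the mean normalised cross-spectrum is divided by one minus its squared real part, and it assumes an ordinary least-squares fit for the slope. The function names and the numerical guard for perfect zero-lag coupling are illustrative choices, not the eLORETA implementation.

```python
import cmath
import math


def phase_synchronisation(x_coeffs, y_coeffs):
    """Return (total, lagged) squared phase synchronisation at one frequency.

    x_coeffs and y_coeffs hold one complex Fourier coefficient per epoch.
    """
    n_epochs = len(x_coeffs)
    # Mean of the normalised cross-spectrum: only phase differences survive.
    f = sum(
        (x / abs(x)) * (y / abs(y)).conjugate()
        for x, y in zip(x_coeffs, y_coeffs)
    ) / n_epochs
    total = abs(f) ** 2
    denominator = 1.0 - f.real ** 2
    # Guard (an assumption of this sketch): with perfect zero-lag coupling the
    # ratio is undefined, so the lagged component is reported as zero.
    lagged = 0.0 if denominator <= 1e-12 else f.imag ** 2 / denominator
    return total, lagged


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys (e.g. LPS) on xs (e.g. distance)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Toy data: y lags x by a quarter cycle in every epoch, so the coupling is
# entirely lagged; coupling y with itself is entirely instantaneous.
phases = [0.3, 1.1, 2.0, -0.7]
x_toy = [cmath.rect(2.0, p) for p in phases]
y_toy = [cmath.rect(0.5, p - math.pi / 2) for p in phases]

total_lagged_case, lps_lagged_case = phase_synchronisation(x_toy, y_toy)
total_zero_lag_case, lps_zero_lag_case = phase_synchronisation(x_toy, x_toy)

# Toy slope: LPS falling by 0.2 per unit of standardised distance.
toy_slope = ols_slope([0.0, 1.0, 2.0, 3.0], [1.0, 0.8, 0.6, 0.4])
```

In the toy example the quarter-cycle lag yields a lagged value close to 1, whereas the zero-lag pair yields 0 even though its total synchronisation is close to 1, which is the behaviour the measure is designed to have.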

Defining the ROIs
For all analyses, we defined ROIs based on the MNI coordinates of the cortical voxel at 19 sites (Canuet et al. 2012) (Supplemental Table 1, available online). For each ROI, we calculated activity at the centroid voxel. We did so because expanding to neighbouring voxels could potentially bias the analysis due to the potential correlation amongst them.

Prediction of transition to psychosis
All multivariate classification analyses were conducted using the R statistical environment (R Core Team 2014). As the classification algorithm, we used the L1-regularised version of the logistic regression model, that is, the so-called LASSO, as implemented in the R add-on package LiblineaR (Helleputte 2013). We chose the LASSO because it performs variable selection and regression coefficient estimation simultaneously and thereby gives rise to models that are sparse and easy to interpret while still having very good predictive performance. The LASSO selects the most important variables by shrinking the regression coefficients of unimportant variables to zero. It has been demonstrated that the LASSO is more stable and accurate than traditional variable selection methods, such as backward elimination and best subset selection (Tibshirani 1996). Thus, it is highly suitable for high-dimensional data problems (i.e., a small event-per-variable ratio). Another advantage of the LASSO model is that it can easily be summarised by a regression function, whereas most other machine-learning methods, such as support vector machines, are less intuitive and thus much more difficult to communicate and validate.
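For orientation, a common textbook form of the L1-regularised logistic regression objective is sketched below. The notation ($\beta_0$, $\beta$, $p_i$, $n$) is introduced here for illustration and is not taken from the text; the exact parameterisation used by the software package may differ (some implementations place a cost parameter on the loss term rather than a penalty weight on the coefficients), and the text does not state how lambda maps onto it.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{ -\sum_{i=1}^{n} \Big[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \Big]
            + \lambda \lVert \beta \rVert_1 \right\},
\qquad
p_i = \frac{1}{1 + \exp\!\big(-(\beta_0 + x_i^{\top} \beta)\big)}
```

Here $y_i \in \{0, 1\}$ codes transition status and $x_i$ is the predictor vector of patient $i$. Because the penalty is the sum of absolute coefficient values, sufficiently strong regularisation sets some coefficients exactly to zero, which is the variable-selection property described above.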
To avoid optimistically biased estimates of performance and to protect against overfitting, we strictly separated the processes of training and testing the classifier. Specifically, we applied nested cross-validation with 10 folds and 10 repetitions both in the inner and the outer loop using the R add-on package mlr (Bischl et al. 2015). The inner loop was used to search for the optimal tuning parameter lambda, whereas the outer loop was used to evaluate the predictive performance of the model. For tuning the model, we performed a grid search over a sequence of 10 different values of lambda between 0.5 and 15. That is, for each value of lambda the cross-validated balanced accuracy (BAC) was estimated using 10-fold cross-validation with 10 repetitions, and the lambda value with the highest performance was picked. Since this was repeated at each iteration of the outer loop, the number of times a LASSO model was fitted amounted to 10 × 10 × 10 × 10 × 10 = 100,000. To mitigate problems of class imbalance, we gave more weight to the ARMS-T class than to the ARMS-NT class during model fitting. Specifically, ARMS-T observations were given weights of 1.94, which is the number of ARMS-NT divided by the number of ARMS-T, and ARMS-NT observations were given weights of 1.
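The arithmetic in this paragraph can be checked with a short sketch. The function names are hypothetical, the balanced accuracy is assumed to be the mean of sensitivity and specificity, and the fit count follows the text in ignoring any final refit on each outer training set.

```python
def nested_cv_fit_count(outer_folds, outer_reps, inner_folds, inner_reps, grid_size):
    """Number of model fits needed to tune inside every outer training set."""
    return outer_folds * outer_reps * inner_folds * inner_reps * grid_size


def class_weights(n_non_converters, n_converters):
    """Weight the minority (ARMS-T) class by the ratio of the class sizes."""
    return {"ARMS-T": n_non_converters / n_converters, "ARMS-NT": 1.0}


def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity and specificity (assumed definition of BAC)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Figures from the text: 10 folds and 10 repetitions in both loops, 10 lambdas.
n_fits = nested_cv_fit_count(10, 10, 10, 10, 10)

# Figures from the text: 35 ARMS-NT and 18 ARMS-T patients.
weights = class_weights(n_non_converters=35, n_converters=18)

# Toy predictions: sensitivity 1/2, specificity 3/4.
toy_bac = balanced_accuracy([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Unlike raw accuracy, BAC cannot be inflated by always predicting the majority class: with 35 of 53 patients in the ARMS-NT group, such a classifier would reach about 66% accuracy but a BAC of only 0.5.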
To investigate the contribution of each EEG modality (i.e., CSD and LPS), we trained and tested three different classifiers. The first was based on the 57 CSD measures (three frequencies at 19 ROIs), the second was based on the three LPS measures, and the third was based on CSD and LPS measures combined. For the latter, we applied a meta-learner that learned from the predictions of the CSD and LPS based learners. As classification algorithm for the meta-learner, the same method was applied as for the base learners (i.e., LASSO tuned with grid search and 10-fold cross-validation with 10 repetitions).
We restricted potential predictors to those frequencies consistently found to be altered in the resting-state psychosis literature. This procedure is in accordance with textbooks on clinical prediction modelling (Steyerberg 2009), which recommend selecting candidate predictor variables based on the literature, especially in small samples.

Sample description
From March 2000 to August 2012 a total of 134 ARMS patients were recruited into the FePsy study. Of these, 53 ARMS patients had at least 15 min of EEG data, were antipsychotic-naïve and had sufficient follow-up data to be included in the present study. Eighteen of the included ARMS patients made a transition to psychosis (ARMS-T) during the follow-up and 35 did not (ARMS-NT). None of those who made a transition converted to affective psychosis. The 60 ARMS individuals that were excluded from this study did not differ from the included ARMS individuals with regard to age, gender, sex, years of education, and BPRS total and positive symptoms scores. The clinical characteristics and demographics of the ARMS-T and ARMS-NT groups are shown in Table I. The only overall difference in ARMS-T patients was a slightly higher positive symptoms score (P = 0.035).

Prediction of transition to psychosis
The predictive performances in unseen test cases of the classifiers based on CSD, LPS, and both combined (stacked learner) are summarised in Table II and their corresponding receiver operating characteristic (ROC) curves are displayed in Figure 1. The best predictive performance in terms of AUC was achieved by the stacked learner (AUC = 0.78), followed by the CSD alone (AUC = 0.77) and LPS alone (AUC = 0.56). For all classifiers, performances were much higher in the training than in the testing data sets (Supplemental Table 2 and Figure S1, available online).
The LASSO regularisation paths for the CSD and LPS classifiers, which show the size of the regression coefficients at different values of the shrinkage parameter lambda, are shown in Figure 2. The contribution of each CSD measurement in the tuned CSD classifier is displayed in Figure 3. Twenty-one out of 57 predictor variables had non-zero regression coefficients and thus contributed to the prediction of psychosis. Nine, six and six non-zero coefficients belonged to the gamma, beta1 and beta2 oscillations, respectively. In the gamma band, the three highest contributors were the left inferior parietal lobule (IPL) (β = 3.34), the precuneus (β = -3.16) and the right posterior temporal cortex (PPC) (β = -2.44). In the beta1 band, the highest contributors were the left superior temporal gyrus (STG) (β = 3.79) followed by the precuneus (β = -3.29) and the right STG (β = -2.04). In the beta2 band, the three highest contributors were the left IPL (β = 2.30), the left superior frontal gyrus (β = 2.12) and the right frontopolar cortex (β = -1.76). In the tuned LPS classifier, beta1 contributed the most to the prediction of psychosis (β = 0.62) followed by beta2 (β = -0.33) and gamma (β = 0.25).
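For readers less familiar with the AUC, the sketch below computes it as the probability that a randomly chosen converter receives a higher predicted score than a randomly chosen non-converter, with ties counted as one half. The function name and the toy scores are illustrative and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for transition (ARMS-T), 0 for no transition (ARMS-NT).
    scores: predicted probabilities or decision values, higher = more at risk.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    correctly_ranked = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return correctly_ranked / (len(positives) * len(negatives))


# Toy example: three of the four converter/non-converter pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and an AUC of 1 to perfect separation of the two groups.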

Discussion
The main purpose of this study was to investigate whether neurophysiological measurements could help to predict the clinical outcome of patients at risk for psychosis. In particular, we assessed whether the CSD distribution and LPS of neural oscillations across various brain areas could be predictive of a transition to psychosis. This was achieved by submitting CSD and LPS values to the LASSO machine-learning algorithm to identify multivariate patterns of brain activity that predict transition. The model was internally validated using nested 10-fold cross-validation with 10 repetitions to allow honest estimation of the generalisation capacity of the prediction model. In ARMS patients, transition to psychosis could be predicted with good accuracy from CSD but not from the spatial slope of LPS data. Combining both measures did not improve the predictive accuracy relative to a model that was based on CSD alone. Since ARMS-T and ARMS-NT could not be differentiated in terms of CSD using a univariate approach in our previous study, the findings of this study suggest that whole patterns of CSD have to be taken into account to successfully differentiate these groups.
The present study reveals that CSD activity in the left STG, and to a lesser extent the right STG, in all frequency bands is important for predicting transition to psychosis. This is in line with previous structural MRI studies showing that predominantly the left STG grey-matter volume is significantly decreased in schizophrenic psychoses (Kasai et al. 2003) and that a decrease in both the left and right STG at baseline, i.e., during the at-risk state, is associated with a later transition to psychosis (Fusar-Poli et al. 2011a). This decrease in grey-matter volume could be the cause of the abnormal high-frequency oscillations identified in the present study (Uhlhaas and Singer 2010).
The next important ROI identified in our model is the left IPL, whose CSD in both the beta2 and gamma frequency bands is substantially predictive of transition to psychosis. The IPL is a complex brain region involved in attention, time and space integration (Assmus et al. 2003), language, and action processing (Caspers et al. 2013). The IPL has been shown to be a prime candidate in the schizophrenia network disorder and belongs to the cortical regions most affected by disease progression (Torrey 2007). In line with this finding, while an overall decrease in grey-matter volume in the IPL has been associated with increased symptom severity (Wilke et al. 2001), a decrease in the left IPL has mostly been revealed in male patients (Frederikse et al. 2000). These results suggest that patients prone to a later transition could already have abnormal IPL volumes, causing aberrant CSD generation specifically within this cortical region.
Finally, the LASSO algorithm also identified beta1 oscillations within the precuneus as important predictors. The precuneus is a crucial part of the default-mode network (see Gusnard and Raichle 2001 for review) and has been implicated in a broad spectrum of integrative processes such as self-consciousness, visuospatial imagery and social cognition (Cavanna and Trimble 2006). Interestingly, all these processes are known to be impaired in schizophrenic psychoses (Kuhn and Gallinat 2013), which fits well with the hypoactivation and reduction of the precuneus observed in schizophrenic psychoses (Shapleske et al. 2002). Most importantly, grey-matter volumes of the precuneus have also been found to be reduced in ARMS patients with later transition to psychosis, potentially explaining the CSD abnormalities of beta1 oscillations revealed here in converters.
The three predictors identified in our model could be cortical areas belonging to a particular network already impaired in the risk state. Interestingly, converging evidence has revealed that the STG (Salisbury et al. 1998), the IPL (Fusar-Poli et al. 2011b) and the precuneus (Mulert et al. 2004) are all important areas for the generation of the P300, which is an event-related potential component elicited during stimulus evaluation and/or categorisation (van Tricht et al. 2010). Therefore, an alteration of this network could potentially explain why ARMS patients have been shown to have an altered P300 component, a promising biomarker in predicting the progression to full-blown psychosis (van Tricht et al. 2010; van Tricht et al. 2014).
Our study also highlights the importance of internal validation to prevent overoptimistic estimates of predictive performance. If we had not cross-validated our model, we would have reported a near-perfect classification with an AUC of 0.99, which, after rigorous repeated cross-validation, decreased to 0.77 (Supplemental Figure S1, training and testing for the CSD analyses, respectively, available online). Unfortunately, in the field of prediction of psychosis, most studies have not applied internal or external cross-validation and are therefore subject to over-optimism (Shah et al. 2013). Furthermore, many of those who did internally cross-validate their results did not do so in line with current recommendations (Steyerberg 2009). That is, they only cross-validated the final model and thus did not take into account the uncertainty introduced by the variable selection.
In many early detection centres for psychosis, resting-state clinical EEGs are now routinely used in the clinical diagnosis of patients exhibiting schizophrenia-like symptoms as a way to search for signs of organic brain disorders such as limbic encephalitis or epilepsy. Moreover, the electrodes are relatively easy to place without the need for an advanced degree, and only about 15 min of eyes-closed acquisition is needed. Automated software could be programmed to perform adequate EEG data pre-processing, the output of which would be fed into the model. A prediction score could then be obtained in less than an hour. The latter could be helpful in clarifying the differential diagnosis and in determining the prognosis.

Limitations
It is important to note that, relative to the number of considered predictors, the effective sample size is relatively low. However, it should be noted that ARMS patients are a very difficult patient population to recruit because: (1) these patients are relatively rare, (2) many of them only seek help when they have already developed frank psychosis, and (3) even if they visit our early detection clinic early enough, they often cannot be motivated to participate in scientific studies because they are often already quite suspicious due to the onset of the disease. Due to the small event-per-variable ratio, we took extra care to prevent over-fitting by conducting repeated nested cross-validation. Nevertheless, our results should be considered preliminary and should be replicated in bigger samples. Furthermore, we relied on a low-density EEG system, which is commonly used in the clinical field for practical reasons. Although several recent studies have shown that resting-state analyses could reliably be performed using such systems (Babiloni et al. 2013; Canuet et al. 2011, 2012), all analyses would have been more precise with a greater number of electrodes. Moreover, some patients across both the ARMS-T and ARMS-NT groups relied on medications other than neuroleptics, which could have influenced the recorded brain activity.

Conclusion
These findings provide preliminary evidence that CSD measurements of high frequency oscillations could be used as predictors for the early detection of psychosis.
The main ROIs identified in our model are all important cortical areas in the generation of the P300 ERP component, which has been found to be an important predictor of psychosis (van Tricht et al. 2010). To our knowledge, this is the first study to investigate the high frequencies present at numerous ROIs distributed across the brain using powerful neurophysiological techniques. All patients were neuroleptic-naïve and all data were acquired using widely available low-resolution clinical EEG equipment. Moreover, our model was validated using repeated cross-validation, which yielded good internal validation results, a step beyond previous EEG studies in the field of early detection.