Integrated Depths for Partially Observed Functional Data

Abstract
Partially observed functional data are frequently encountered in applications and are the object of increasing interest in the literature. We here address the problem of measuring the centrality of a datum in a partially observed functional sample. We propose an integrated functional depth for partially observed functional data, dealing with the very challenging case where partial observability can occur systematically on any observation of the functional dataset. In particular, differently from many techniques for partially observed functional data, we do not require that some functional datum be fully observed, nor do we require that a common domain exist where all of the functional data are recorded. Because of this, our proposal can also be used in those frequent situations where reconstruction methods and other techniques for partially observed functional data are inapplicable. By means of simulation studies, we demonstrate the very good performance of the proposed depth on finite samples. Our proposal enables the use of benchmark methods based on depths, originally introduced for fully observed data, in the case of partially observed functional data. This includes the functional boxplot, the outliergram and the depth versus depth classifiers. We illustrate our proposal on two case studies, the first concerning a problem of outlier detection in German electricity supply functions, the second regarding a classification problem with data obtained from medical imaging. Supplementary materials for this article are available online.


Introduction
In this article we propose a depth measure for partially observed functional data.
Starting from the pioneering work of Fraiman and Muniz (2001), different definitions of depths for functional data have been proposed (see, e.g., Fraiman and Muniz 2001; López-Pintado and Romo 2009, 2011; Ieva and Paganoni 2013; Claeskens et al. 2014) and their theoretical properties have been thoroughly studied (see, e.g., Nieto-Reyes and Battey 2016; Nagy et al. 2016; Gijbels and Nagy 2017). Moreover, depths have proven to be remarkable tools for the visualization of functional data (Hyndman and Shang 2010; Sun and Genton 2011, 2012; Genton et al. 2014; Dai and Genton 2018), for functional outlier detection (Arribas-Gil and Romo 2014; Nagy, Gijbels, and Hlubinka 2017; Ieva and Paganoni 2017) and for the classification of functional data (Li, Cuesta-Albertos, and Liu 2012; Hubert et al. 2017). On the other hand, the presence of partially observed data hampers the applicability of such methods.
For a random sample of functions from a compact interval [a, b] to R, partially observed data refers to the case where the records of the functions are available only on subsets of [a, b], so that only fragments of each functional datum are available. This type of data (also referred to as incomplete or fragmented functional data, as well as functional snippets) is indeed common in real applications, and has been reported in several areas of research. In medical studies, for instance, typical sources of censoring of the data are patients missing medical visits or devices failing to record (see, e.g., James, Hastie, and Sugar 2000; James and Hastie 2001; Sangalli et al. 2009; Delaigle and Hall 2013; Kraus 2015; Delaigle and Hall 2016; Lin, Wang, and Zhong 2020). In demography it is common that age-specific mortality rates for older ages are not completely observed due to the low number of survivors (see, e.g., Human Mortality Database 2019; D'Amato et al. 2011). In electricity markets, supply functions are incomplete because suppliers and buyers agree on prices and quantities depending on the market conditions (see, e.g., Kneip and Liebl 2020; Liebl and Rameseder 2019). As in the case of sparse functional data, where the datum is assumed to be recorded in a sparse way across the whole domain, in the case of partially observed functional data the missing data structure raises difficulties for the data analysis.
This has stimulated a very active literature, and sparse or partial observability in functional data has been addressed in several respects, including functional principal component analysis (see, e.g., James, Hastie, and Sugar 2000; Yao, Müller, and Wang 2005; Di, Crainiceanu, and Jank 2014; Liu, Ray, and Hooker 2017), mean and covariance estimation (see, e.g., Kraus 2015; Liebl and Rameseder 2019; Lin, Wang, and Zhong 2020), estimation of the missing parts (see, e.g., Goldberg, Ritov, and Mandelbaum 2014; Kraus 2015; Delaigle and Hall 2016; Kneip and Liebl 2020) and supervised and unsupervised classification (see, e.g., James and Hastie 2001; Delaigle and Hall 2013; Stefanucci, Sangalli, and Brutti 2018; Kraus and Stefanucci 2019).
The introduction of a suitable notion of depth measure for partially observed functional data contributes to the literature in two ways. First of all, it expands the palette of tools that are available in such a complex data setting, introducing a key concept of robust statistics. Moreover, it offers crucial support to the other techniques for partially observed functional data cited before. In fact, such inferential techniques can be highly affected by the presence of outliers. Up to this moment, in partially observed functional data settings, authors had to rely on visual inspection or expert knowledge for detecting outliers and avoiding biased estimates (see, e.g., Liebl and Rameseder 2019; Kneip and Liebl 2020); even so, only obvious magnitude outliers can be visually detected (see, e.g., Hyndman and Shang 2010; Sun and Genton 2011; Arribas-Gil and Romo 2014; Nagy, Gijbels, and Hlubinka 2017). Depth measures are cornerstones of outlier detection, and our proposal can thus provide an objective approach in this respect. In particular, the proposed depth measure enables the construction of the functional boxplot (Sun and Genton 2011) and the outliergram (Arribas-Gil and Romo 2014) in the context of partially observed functional data. This also offers crucial visualization tools for partially observed functional data. Moreover, the proposed depth permits the use of Depth versus Depth classifiers (Li, Cuesta-Albertos, and Liu 2012; Cuesta-Albertos et al. 2017) for partially observed functional data, offering an important alternative to the classification methods so far proposed in this context.
It is important to point out that we here deal with the very challenging case where partial observability can occur systematically on any observation of the functional dataset. In particular, differently from many of the above cited techniques for partially observed functional data, we neither require that some functional datum be fully observed, nor do we require that a common domain exist where all of the functional data are recorded. Because of this, our proposal can also be used in those frequent situations where reconstruction methods and other techniques for partially observed functional data are inapplicable.
The class of depths we define belongs to the family of integrated depth measures (Nagy et al. 2016), which includes the Fraiman and Muniz Depth (Fraiman and Muniz 2001) and the Modified Band Depth (López-Pintado and Romo 2009). We account for the uncertainty related to the unobserved fragments by means of an appropriate weight function, which gives more weight in the computation of the depth to regions with high observational density and less weight to regions with low observational density. We name the proposed method Partially Observed Integrated Functional Depth (POIFD).
Sguera and López-Pintado (2020) have very recently proposed a depth measure for sparse functional data. Their proposal leverages on reconstruction of the sparsely observed functional data, using the approach in Goldsmith, Greven, and Crainiceanu (2013), and is in turn used by Qu and Genton (2022) to construct a boxplot for sparsely observed functional data.
Differently from Sguera and López-Pintado (2020), our proposed POIFD does not involve reconstruction of the data by any of the reconstruction techniques for sparse or partially observed functional data. Through extensive simulation studies, we show that the proposed POIFD works better than depth computed on reconstructed data, considering state-of-the-art reconstruction techniques for sparse functional data (such as Goldsmith, Greven, and Crainiceanu 2013) as well as for partially observed functional data (such as Liebl and Rameseder 2019; Kneip and Liebl 2020). These simulation studies are carried out under different missing data scenarios with different percentages of observability. Furthermore, our proposed POIFD can also be applied in all the situations where reconstruction methods are inefficient (for instance, in the presence of outliers) or inapplicable.
In Section 2, we introduce the proposed depth for partially observed functional data and its sample version. For simplicity, we first describe the depth in the case of univariate functional data, and then extend it to multivariate functional data.
In Section 3, we report simulation studies that highlight the very good performance of the proposed POIFD. In particular, we demonstrate the high agreement between the POIFD and the depth that could be computed if the functional data were fully observed. Moreover, considering the special case where the reconstruction techniques for partially observed functional data are applicable (i.e., when at least some of the curves are fully observed), we show that POIFD performs significantly better than the standard functional depths computed on the full domain after reconstruction of the data. Furthermore, we show that, in the special case where there is a common domain where all of the curves are recorded, the proposed POIFD performs significantly better than the standard functional depths restricted to the common domain.
In Section 4, we illustrate the use of the proposed POIFD in two challenging case studies. The first concerns outlier detection in German electricity supply functions (Liebl 2019; Kneip and Liebl 2020). The data are shown in the top left panel of Figure 1; the bottom left panel of the same figure displays the proportion of observed data at each point of the domain, highlighting that the missing portions are spread across the whole domain. The second case study involves the AneuRisk65 dataset (see, e.g., Sangalli et al. 2009; Sangalli, Secchi, and Vantini 2014b) and concerns a discrimination problem on data obtained from the reconstruction of medical images. The data are shown in the top right panel of Figure 1. The bottom right panel of the same figure shows that for the AneuRisk65 data there exists a small portion of the domain, named common domain, where all the functional data are observed.
The supplementary materials carry additional evidence of the good performance of the proposed POIFD. We there also consider the challenging scenario where none of the functional data has been recorded over the whole domain, in which case the reconstruction methods for partially observed functional data are inapplicable.

Depth for Partially Observed Functional Data
Following Claeskens et al. (2014), we define a functional depth starting from a finite dimensional (univariate) depth. Let $\mathcal{P}$ denote the collection of all probability measures on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, where $\mathcal{B}(\mathbb{R})$ is the Borel $\sigma$-algebra on $\mathbb{R}$. Consider a function $D : \mathbb{R} \times \mathcal{P} \to [0, 1]$. We assume that this function satisfies properties $D_1$ to $D_7$ from Nagy et al. (2016). Without loss of generality we consider functional data defined over the interval $[0, 1]$. In particular, hereinafter, $X : [0, 1] \to \mathbb{R}$ is a stochastic process with continuous trajectories, $P$ is the law of $X$, and $P_t$ is the marginal probability distribution of $X(t)$.
We recall the definition of integrated functional depth (Claeskens et al. 2014; Nagy et al. 2016).
Definition 2.1. For a continuous function $x : [0, 1] \to \mathbb{R}$, the integrated functional depth (IFD) of $x$ with respect to $P$ is
$$\mathrm{IFD}(x, P) = \int_0^1 D(x(t), P_t)\, w(t)\, dt,$$
where $w$ is a weight function that integrates to one.

Partially Observed Integrated Functional Depth
Let $X_1, \ldots, X_n$ be $n$ independent realizations of $X$. We consider the case when the realizations $X_1, \ldots, X_n$ are only partially observed. To model such a setting, similarly to Delaigle and Hall (2013), we consider a random observational mechanism $Q$ that generates compact subsets of $[0, 1]$, over which the functional data are observed. In particular, we assume that $Q$ generates compact sets that consist of finite unions of closed intervals with strictly positive Lebesgue measure. Let $O$ be one such set generated by $Q$, and let $O_1, \ldots, O_n$ be independent copies of $O$. Then, for $1 \le i \le n$, the functional datum $X_i$ is only observed on $O_i$. We assume that $P$ and $Q$ are independent, and that $(X_1, O_1), \ldots, (X_n, O_n)$ are iid realizations from $P \times Q$. This assumption, termed Missing-Completely-at-Random, is standard in the literature of partially observed functional data (see, e.g., Kraus 2015; Kneip and Liebl 2020).
Since $X$ is only observed on a compact set $O$, we define its depth by restricting an integrated functional depth to $O$. To this aim, for $t \in [0, 1]$, let $Q(t) = \mathbb{P}(O \ni t)$, that is, the probability that the random set $O$ covers the point $t$. Without loss of generality, we assume that $Q(t) > 0$ for all $t \in [0, 1]$. Moreover, we consider a bounded and continuous function $\phi$ defined on $[0, 1]$ and such that $\int_O \phi(Q(t))\,dt > 0$ almost surely; such a function $\phi$ can for instance be the identity function on $[0, 1]$. We then define the following weighting function restricted to a compact set $O$:
$$w_\phi(t \mid O) = \frac{\phi(Q(t))}{\int_O \phi(Q(s))\,ds}, \quad t \in O. \tag{1}$$
We now define the proposed Partially Observed Integrated Functional Depth.
Definition 2.2. For any continuous function $x$ observed on $O$,
$$\mathrm{POIFD}((x, O), P \times Q) = \int_O D(x(t), P_t)\, w_\phi(t \mid O)\, dt, \tag{2}$$
where $w_\phi(t \mid O)$ is the weight function in (1).
If the data are fully observed, that is, if $\mathbb{P}(O \ni t) = 1$ for any $t \in [0, 1]$, then the proposed POIFD coincides with the IFD with constant weight function $w(t) = 1$ for all $t \in [0, 1]$.
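As a rough numerical illustration of the weight function, the sketch below discretizes it on an equispaced grid. It assumes that the weight equals $\phi(Q(t))$ on $O$, normalized by its integral over $O$, and it approximates that integral by a Riemann sum with cell width $1/T$. The helper name `poifd_weights` and its arguments are hypothetical and are not taken from the article.

```python
import numpy as np


def poifd_weights(q, observed, phi=lambda u: u):
    """Discretized weight on an equispaced grid (hypothetical helper).

    q        : values Q(t_l), the probability that grid point t_l is observed
    observed : boolean mask, True where t_l belongs to the observed set O
    phi      : bounded continuous function on [0, 1]; identity by default
    """
    q = np.asarray(q, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    dt = 1.0 / q.size  # assumption: each grid point stands for a cell of width 1/T
    numerator = np.where(observed, phi(q), 0.0)  # zero weight outside O
    integral = numerator.sum() * dt  # Riemann sum for the integral of phi(Q) over O
    if integral <= 0:
        raise ValueError("phi(Q(t)) must have a positive integral over O")
    return numerator / integral


# Fully observed case with Q(t) = 1 everywhere: the weight is constant, equal to 1.
full = poifd_weights(np.ones(5), np.ones(5, dtype=bool))

# Partially observed case: the weight vanishes off O and integrates to one over O.
q = np.array([0.2, 0.4, 0.8, 0.6])
mask = np.array([True, True, False, True])
w = poifd_weights(q, mask)
```

With the identity for `phi`, grid points that are observed more often across the sample receive proportionally more weight, which matches the stated intent of down-weighting regions with low observational density.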

Sample Version
We now introduce the empirical version of POIFD, for its computation on finite samples. Denote by $P_n$ the distribution that assigns mass $1/n$ to each sample curve $X_1, \ldots, X_n$. Similarly, let $Q_n$ be the distribution that assigns mass $1/n$ to each sample observation domain $O_1, \ldots, O_n$. We define the empirical version of POIFD using the plug-in approach, namely by replacing $P \times Q$ with $P_n \times Q_n$ in Definition 2.2. In practice, we only have access to discrete versions of the functional data $X_i$, on a discrete evaluation grid $\{t_1, \ldots, t_T\}$, with $0 = t_1 < t_2 < \cdots < t_T = 1$, which for simplicity we assume to be common across data and equispaced. Note that for many $t_\ell$, $\ell = 1, \ldots, T$, such evaluation may indeed be missing, in conformity with the partially observed nature of the data considered in this work. We can then define the sample version of the POIFD by using a standard Riemann approximation.
The next section extends the proposed depth to multivariate functional data.
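The sample version can be sketched in a few lines for the Fraiman and Muniz case. The sketch below is an illustration under stated assumptions, not the authors' implementation: curves are stored as rows of an array with `NaN` marking unobserved grid points, the pointwise depth at $t_\ell$ is $1 - |1/2 - F_{n,\ell}(x)|$ with $F_{n,\ell}$ the empirical distribution function of the values observed at $t_\ell$, and $Q(t_\ell)$ is replaced by the observed proportion. The function name `poifd_fm` is hypothetical.

```python
import numpy as np


def poifd_fm(X, phi=lambda q: q):
    """Sample POIFD with a Fraiman-Muniz pointwise depth (illustrative sketch).

    X : array of shape (n, T); NaN marks grid points where a curve is unobserved.
    Every curve is assumed to be observed at one grid point at least.
    """
    X = np.asarray(X, dtype=float)
    obs = ~np.isnan(X)                  # obs[i, l] is True if curve i is seen at t_l
    n, T = X.shape
    q_n = obs.mean(axis=0)              # proportion of curves observed at each t_l
    pointwise = np.zeros((n, T))
    for l in range(T):
        rows = np.flatnonzero(obs[:, l])
        if rows.size == 0:
            continue
        vals = X[rows, l]
        # Empirical cdf of the observed values, evaluated at each observed value.
        F = (vals[None, :] <= vals[:, None]).mean(axis=1)
        pointwise[rows, l] = 1.0 - np.abs(0.5 - F)
    # Riemann approximation: the grid spacing cancels in the normalized weights.
    w = obs * phi(q_n)[None, :]
    w = w / w.sum(axis=1, keepdims=True)
    return (w * pointwise).sum(axis=1)


# Five flat curves at levels 0..4, two of them with one missing evaluation each.
X = np.array([[0.0, 0.0, 0.0],
              [1.0, 1.0, np.nan],
              [2.0, np.nan, 2.0],
              [3.0, 3.0, 3.0],
              [4.0, 4.0, 4.0]])
depths = poifd_fm(X)
```

Because the weights are normalized per curve over its own observed grid points, no imputation of the missing evaluations is needed.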

Multivariate Partially Observed Integrated Functional Depth
The partially observed integrated functional depth in Definition 2.2 can be straightforwardly extended to multivariate functional data. In this case, we consider a $K$-dimensional stochastic process $X = (X^{(1)}, \ldots, X^{(K)})$ with law $P = (P^{(1)}, \ldots, P^{(K)})$, where each coordinate $X^{(k)}$, $1 \le k \le K$, is a continuous function on $[0, 1]$, and $P^{(k)}$ is its law. We moreover consider a multivariate observational process $Q = (Q^{(1)}, \ldots, Q^{(K)})$ that generates the compact sets, allowing for different domains of observation along the different components. We assume that $P$ and $Q$ are independent and that $(X_1, O_1), \ldots, (X_n, O_n)$ are iid realizations from $P \times Q$.
A multivariate POIFD, inspired by the proposal by Ieva and Paganoni (2013), can then be defined as a weighted average of $K$ univariate POIFDs:
$$\mathrm{POIFD}((\mathbf{x}, \mathbf{O}), P \times Q) = \sum_{k=1}^{K} \alpha_k\, \mathrm{POIFD}((x^{(k)}, O^{(k)}), P^{(k)} \times Q^{(k)}), \tag{4}$$
where $\mathbf{x} = (x^{(1)}, \ldots, x^{(K)})$, $\mathbf{O} = (O^{(1)}, \ldots, O^{(K)})$, and the nonnegative weights $\alpha_1, \ldots, \alpha_K$ sum to 1. The choice of these weights is problem driven. This definition is very flexible. As shown for instance in Section 4.2, the selection of the weights by cross-validation, in a classification problem, enables the construction of an accurate classifier, and offers a posteriori valuable information for the interpretation of the results.
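The weighted average of univariate depths is simple to express in code. The sketch below assumes that the $K$ univariate POIFDs have already been computed, one row per component; the function name `multivariate_poifd` and the example numbers are hypothetical.

```python
import numpy as np


def multivariate_poifd(component_depths, alpha):
    """Weighted average of K univariate depths (illustrative sketch).

    component_depths : array of shape (K, n), one row of depths per component
    alpha            : K nonnegative weights that sum to 1
    """
    depths = np.asarray(component_depths, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if depths.shape[0] != alpha.size:
        raise ValueError("one weight per component is required")
    if np.any(alpha < 0) or not np.isclose(alpha.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return alpha @ depths


# Two components, two curves; the second component is given three times the weight.
component_depths = [[0.2, 0.8],
                    [0.6, 0.4]]
combined = multivariate_poifd(component_depths, alpha=[0.25, 0.75])
```

In a classification setting, the weights could be scanned over a grid on the simplex and scored by cross-validated accuracy, which is the kind of selection the text refers to.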
It is also possible to consider the special case where the observational process is univariate and selects a compact subset of $[0, 1]$ where $X$ is observed, so that all $K$ components $\{X^{(1)}, \ldots, X^{(K)}\}$ of $X$ are observed over the same portion of the domain. We thus assume that $(X_1, O_1), \ldots, (X_n, O_n)$ are iid realizations from $P \times Q$, where $P$ and $Q$ are independent. In this special case, alternatively to the definition above, following the approach by Claeskens et al. (2014), we can consider a depth measure $D$ on $\mathbb{R}^K$, and define the multivariate POIFD
$$\mathrm{POIFD}((\mathbf{x}, O), P \times Q) = \int_O D(\mathbf{x}(t), P_t)\, w_\phi(t \mid O)\, dt,$$
where $P_t$ is the marginal law of $X(t)$.

Simulation Studies
In this section we illustrate the performance of the proposed POIFD, showing its superiority to the alternatives considered, whenever these are applicable. In the definition of the weight of the POIFD, we use $\phi(q) = q$. We consider both the partially observed version of the Fraiman and Muniz depth (FM) (Fraiman and Muniz 2001) and the partially observed version of the Modified Band Depth (MBD) (López-Pintado and Romo 2009).

Data Generation
We generate data as follows. We let $P$ be the law of a Gaussian process. In particular, we sample from the model $X(t) = \mu(t) + \varepsilon(t)$, where $\mu(t)$ is a centered Gaussian process with covariance $\rho_\mu(s, t) = \beta \exp\{-\gamma \sin^2(\pi |s - t|) / \zeta^2\}$, for any $s, t \in [0, 1]$, and $\varepsilon(t)$ is a centered Gaussian process with covariance $\rho_\varepsilon(s, t) = \eta e^{-\lambda |s - t|}$. In particular, when performing $N$ simulation repetitions, we generate $N$ samples of size $n$, where the $j$th sample is $\{X_{j1}, \ldots, X_{jn}\}$, for $j = 1, \ldots, N$, and $X_{ji}(t) = \mu_j(t) + \varepsilon_{ji}(t)$. With this generating process, each random sample presents a different mean pattern $\mu_j$, thus avoiding shape related biases. The set of parameters for all the results presented in the article is $\beta = 3$, $\gamma = 2$, $\zeta = 0.5$, $\eta = 0.5$, and $\lambda = 5$. We carried out simulations for other values of these parameters (not included here for the sake of space), always reaching the same conclusions.
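The generating process can be sketched on a discrete grid as follows. This is a minimal illustration assuming the covariance forms $\beta \exp\{-\gamma \sin^2(\pi|s-t|)/\zeta^2\}$ for the mean process and $\eta e^{-\lambda|s-t|}$ for the noise process; the function name `simulate_sample`, the grid size and the seed are hypothetical choices.

```python
import numpy as np


def simulate_sample(n, T, beta=3.0, gamma=2.0, zeta=0.5, eta=0.5, lam=5.0, seed=None):
    """Draw one sample of n curves X_i = mu + eps_i on an equispaced grid (sketch)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, T)
    dist = np.abs(t[:, None] - t[None, :])
    cov_mu = beta * np.exp(-gamma * np.sin(np.pi * dist) ** 2 / zeta ** 2)
    cov_eps = eta * np.exp(-lam * dist)
    zero = np.zeros(T)
    mu = rng.multivariate_normal(zero, cov_mu)            # one random mean per sample
    eps = rng.multivariate_normal(zero, cov_eps, size=n)  # one noise curve per datum
    return t, mu[None, :] + eps


# One sample of 20 curves on a grid of 50 points, with the parameters of the text.
t, X = simulate_sample(n=20, T=50, seed=0)
```

Calling the function $N$ times with different seeds gives $N$ samples, each with its own mean pattern.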
As for the observation process Q, we consider the following patterns of observability, where p is the (expected) proportion of observability over the evaluation grid.
- Sparse functional data: Each function is observed at $p \times T$ grid points randomly taken from $\{t_1, \ldots, t_T\}$.
- Partially observed functional data on random intervals: Each function is observed on $m$ disjoint intervals, spread along $[0, 1]$, with a total expected observed proportion $p$. To do so, for each function, we generate a random sample of size $(m - p)/p$ from a uniform distribution on $[0, 1]$. Then, we consider the intervals $[u_{(i-1)}, u_{(i)}]$, where $u_{(i)}$ is the $i$th order statistic of the sample. The function is then observed on $m$ of these intervals, chosen at random, but guaranteeing that they are disjoint.
- Partially observed functional data with common domain: All functions are completely observed on a subset of the domain, named common domain, that is for simplicity centered at 0.5. This common domain is $\cap_{i=1}^{n} O_i$. Specifically, each functional datum is observed on a domain that goes from a starting point to an ending point that are randomly generated. The starting point is sampled from a Uniform distribution on $[1/2 - p, 1/2)$, if $p \le 1/2$, and from a Uniform distribution on $[0, 1 - p)$, if $p > 1/2$. The ending point is sampled from a Uniform distribution on $(1/2, 1/2 + p]$, if $p \le 1/2$, and from a Uniform distribution on $(p, 1]$, if $p > 1/2$. This leads to a total (expected) proportion of observation $p$.
Figure 2 shows the same realizations of $P$, but partially observed under the two settings of $Q$, random interval and common domain, with $p = 50\%$. In both panels of the figure, one random functional datum is plotted in blue. The bottom panel displays the proportion $q_n$ of observed functions.
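The three observation patterns can be sketched as boolean masks over the evaluation grid. The code below is an illustration under stated assumptions: the end points 0 and 1 are added to the order statistics so that $(m - p)/p$ uniform draws produce $m/p$ candidate intervals, and the $m$ intervals are drawn without enforcing that they are non-adjacent, which simplifies the disjointness requirement of the text. All function names are hypothetical.

```python
import numpy as np


def sparse_mask(n, T, p, rng):
    """Each curve is observed at round(p * T) grid points chosen at random."""
    k = int(round(p * T))
    mask = np.zeros((n, T), dtype=bool)
    for i in range(n):
        mask[i, rng.choice(T, size=k, replace=False)] = True
    return mask


def random_interval_mask(n, T, p, m, rng):
    """Each curve is observed on m of the intervals cut by (m - p) / p uniform draws."""
    t = np.linspace(0.0, 1.0, T)
    k = int(round((m - p) / p))
    mask = np.zeros((n, T), dtype=bool)
    for i in range(n):
        # Assumption: 0 and 1 close the first and last interval.
        cuts = np.concatenate(([0.0], np.sort(rng.uniform(size=k)), [1.0]))
        for c in rng.choice(k + 1, size=m, replace=False):
            mask[i] |= (t >= cuts[c]) & (t <= cuts[c + 1])
    return mask


def common_domain_mask(n, T, p, rng):
    """Each curve is observed between a random start and a random end around 0.5."""
    t = np.linspace(0.0, 1.0, T)
    if p <= 0.5:
        start = rng.uniform(0.5 - p, 0.5, size=n)
        end = rng.uniform(0.5, 0.5 + p, size=n)
    else:
        start = rng.uniform(0.0, 1.0 - p, size=n)
        end = rng.uniform(p, 1.0, size=n)
    return (t[None, :] >= start[:, None]) & (t[None, :] <= end[:, None])


rng = np.random.default_rng(1)
sparse = sparse_mask(n=500, T=101, p=0.5, rng=rng)
intervals = random_interval_mask(n=500, T=101, p=0.5, m=4, rng=rng)
common = common_domain_mask(n=500, T=101, p=0.5, rng=rng)
q_n = intervals.mean(axis=0)  # observed proportion at each grid point
```

Averaging a mask over its rows gives the observed proportion at each grid point, and averaging over all entries gives a value close to $p$ in each design.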

Comparison with Competing Methods
We compare the proposed POIFD to standard depths computed on data that have been reconstructed using some of the most recent reconstruction methods for sparse and partially observed functional data, including Goldsmith, Greven, and Crainiceanu (2013), Kraus (2015), and Kneip and Liebl (2020). The method in Goldsmith, Greven, and Crainiceanu (2013) also constitutes the basis for the depth for sparse functional data very recently proposed in Sguera and López-Pintado (2020), which we do not include here because no public code is available for this method. Our proposed POIFD is for partially observed functional data; nevertheless, its empirical version can also be computed on sparse functional data. For this reason, we also include the case of sparse functional data in our simulation studies.
It is important to point out that the methods for partially observed data, such as Kraus (2015) and Kneip and Liebl (2020), can estimate the covariance consistently only when there exist functions completely observed over the total domain [0, 1]. For this reason, we here let some of the data be observed on the whole evaluation grid. In Section 7.2 of the supplementary materials, we instead show that our proposed POIFD has very good performance also in the more challenging scenario where no datum is fully observed. Specifically, we generate $N = 100$ samples, each consisting of $n = 100$ functional data $X_{j1}, \ldots, X_{jn}$, obtained from the process $P$ described above. For each sample, we let 25% of the functional data, chosen at random, be observed over the whole evaluation grid. The remaining 75% of the functional data are only partially observed, sampling the observation patterns from one of the observational processes $Q$ described above.
We then compute the following alternative depths:
- only in the Common Domain case, the depths restricted to the common domain;
- the depth computed after reconstruction of the missing portions with Goldsmith, Greven, and Crainiceanu (2013), with Kraus (2015), and with Kneip and Liebl (2020);
- the proposed POIFD.
To evaluate the goodness of these alternatives, we also compute the depths for the completely observed functional data by IFD with constant weight function $w(t) = 1$ (which coincides with the POIFD for fully observed data). We then measure the agreement between the depths on completely observed data and the depths on the partially observed data, by Spearman correlation. Figure 3 shows the boxplots of Spearman correlations across the $N = 100$ replicates, under the various observation processes, with $p = 50\%$. Under all scenarios, the proposed POIFD attains higher correlations and lower dispersion than any other option considered. Table 1 in the supplementary materials reports the mean correlations under all settings considered, with $p = 50\%, 25\%, 10\%$, and $m = 1, 4$ for the random interval case. The proposed POIFD is superior to any alternative in all cases.
Figure 3. Spearman correlations between the depths on completely observed data and the depths on the partially observed data. The alternatives shown include the depth after reconstruction with Goldsmith, Greven, and Crainiceanu (2013), with Kneip and Liebl (2020) and with Kraus (2015), and the proposed POIFD. The results are obtained considering the Fraiman and Muniz depth (FM) and the Modified Band Depth (MBD). In each setting, to enable comparison with depth computed on reconstructed data, 25% of functional data in the sample are completely observed, while for the remaining 75% there is an average observation proportion $p = 50\%$ for each functional datum.
Figure 4. Functional boxplot and outliergram for partially observed functional data. Common Domain case ($p = 50\%$), with addition of two magnitude outliers (yellow in color printing/light gray in black and white printing) and two shape outliers (green in color printing/dark gray in black and white printing). Left panel: functional boxplot for partially observed functional data; its elements are colored in a gray scale that depends on the proportion $q_n$ of observed data; the median is highlighted as a thick white curve. Right panel: outliergram for partially observed functional data.
The same conclusions are reached when the agreement is measured by Pearson correlation and by the Willmott index (Duveiller, Fasbender, and Meroni 2016), which returns the maximum value of 1 only if the relationship between the two vectors is not simply linear but is exactly the identity. The proposed POIFD is also computationally far more efficient than any of the other considered methods. For instance, for all the simulations in Figure 4, the mean computing time in seconds is: Goldsmith, Greven, and Crainiceanu (2013): 12.0068, Kraus (2015): 8.7625, and Kneip and Liebl (2020): 37.3021; POIFD: 0.1164.
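The agreement measure can be sketched as a rank correlation between two depth vectors. The code below is a plain NumPy illustration in which tied values receive average ranks; the helper names and the example depth values are hypothetical and are not results from the article.

```python
import numpy as np


def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(x.size)
    ranks[order] = np.arange(1, x.size + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]


# Hypothetical depths of four curves: on complete data and on their fragments.
depth_full = [0.9, 0.7, 0.5, 0.8]
depth_partial = [0.85, 0.6, 0.55, 0.9]
rho = spearman(depth_full, depth_partial)
```

A value of 1 means that the partially observed depth orders the curves exactly as the depth on complete data does, which is the property that matters for rank-based tools such as the functional boxplot.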

Outliers, Boxplot and Outliergram for Partially Observed Functional Data
The presence of outliers may affect any statistical analysis. This is especially the case when the data are high dimensional, as for functional data (Locantore et al. 1999). In this context, fundamental statistics like the mean and the covariance are very sensitive to outliers and become unreliable, and this in turn may negatively affect analyses such as principal component analysis (see, e.g., Hubert et al. 2005). A great number of outlier detection methods for functional data have been introduced, and many of them rely on depth measures (see, e.g., Febrero-Bande, Galeano, and González-Manteiga 2008; Hyndman and Shang 2010; Sun and Genton 2011; Arribas-Gil and Romo 2014; Narisetty and Nair 2015; Nagy, Gijbels, and Hlubinka 2017). Thanks to the proposed POIFD, these techniques can now be applied to partially observed functional data. In particular, we here consider an extension of the functional boxplot (Sun and Genton 2011) and the outliergram (Arribas-Gil and Romo 2014) to partially observed functional data; these two methods enable the detection of magnitude and shape outliers and also constitute two insightful visualization tools.
To illustrate the proposal, we contaminate the Gaussian samples in Section 3.1 with shape and magnitude outliers (see Dai et al. 2020). Following the literature, magnitude outliers are obtained from $X^{\mathrm{Mag}}_{ji}(t) = X_{ji}(t) + u_{ji}$, where $u_{ji} \sim \mathrm{Unif}[2.5, 4]$ with probability 1/2, and $u_{ji} \sim \mathrm{Unif}[-4, -2.5]$ otherwise. Shape outliers are generated from $X^{\mathrm{Shape}}_{ji}(t) = \mu_j(t) + \tilde{\mu}_j(t) + \varepsilon_{ji}(t)$, where $\tilde{\mu}$ has covariance $\rho_\mu(s, t)$ with parameters $\beta = 2$ and $\zeta = 0.05$; thus, $X^{\mathrm{Shape}}(t)$ is a function with the same mean as $X(t)$, but rougher. Figure 4 illustrates both outlier detection tools (Common Domain case, $p = 50\%$). For visualization reasons, we here include only two magnitude outliers, in yellow (light gray in black and white printing) and two shape outliers in green (dark gray in black and white printing). The left panel shows the functional boxplot for partially observed data. We here enrich the functional boxplot (Sun and Genton 2011) by coloring its elements by a gray scale that depends on the proportion $q_n(t)$ of observed functional data at each time point; the median is highlighted in white. The central region contains the 50% of the data with the highest POIFDs; the whiskers are constructed considering the functional inter-quantile range inflated by 1.5. Notice that the two magnitude outliers fall partly outside of the whiskers and, thus, they are correctly detected as magnitude outliers. Shape outliers may instead be hidden in the middle of the data, and might thus be missed by the functional boxplot.
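The functional boxplot rule can be sketched for partially observed curves as follows. This is an illustration under stated assumptions rather than the authors' code: the central region is the pointwise envelope of the deepest half of the sample, computed over the curves observed at each grid point, the fences inflate that envelope by 1.5 times its pointwise range, and a curve is flagged as soon as one of its observed values falls outside the fences. The function name and the depth values in the example are hypothetical.

```python
import numpy as np


def functional_boxplot_outliers(X, depth, factor=1.5):
    """Flag curves that leave the inflated central envelope (illustrative sketch).

    X     : array of shape (n, T); NaN marks unobserved grid points
    depth : one depth value per curve, for instance a POIFD
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    deepest = np.argsort(depth)[::-1][: int(np.ceil(n / 2))]
    central = X[deepest]
    lower = np.nanmin(central, axis=0)  # envelope over curves observed at t_l
    upper = np.nanmax(central, axis=0)
    spread = upper - lower
    low_fence = lower - factor * spread
    high_fence = upper + factor * spread
    # Comparisons involving NaN are False, so unobserved points never flag a curve.
    outside = (X < low_fence) | (X > high_fence)
    return np.flatnonzero(outside.any(axis=1))


# Six flat curves; the last one is shifted far above, the first one has a gap.
levels = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 10.0])
X = np.tile(levels[:, None], (1, 4))
X[0, 0] = np.nan
depth = np.array([0.6, 0.8, 1.0, 0.8, 0.6, 0.1])  # hypothetical depth values
flagged = functional_boxplot_outliers(X, depth)
```

In this toy example only the shifted curve is flagged; the curve with a missing evaluation is judged on its observed points alone.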
The outliergram for partially observed functional data is shown in the right panel of Figure 4. This is based on the quadratic relationship between the partially observed version of the Modified Band Depth and the partially observed version of the Modified Epigraph Index; such quadratic relationship is shown in the graph as a solid parabola. The dotted parabola delimits the shape outlier region, and is built by inflating by 1.5 the solid parabola. The shape outliers are clearly unmasked.
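The quadratic relationship behind the outliergram can be checked numerically. The sketch below makes several assumptions that are not stated in this section: the partially observed Modified Band Depth and Modified Epigraph Index are computed pointwise among the curves observed at each grid point and then averaged with weights proportional to the observed proportion, and the parabola uses the coefficients $a_0 = a_2 = -2/(n(n-1))$ and $a_1 = 2(n+1)/(n-1)$ of Arribas-Gil and Romo (2014), which were derived for fully observed data. Function names are hypothetical.

```python
import numpy as np


def partial_mbd_mei(X, phi=lambda q: q):
    """Weighted pointwise band depth and epigraph index (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    obs = ~np.isnan(X)
    n, T = X.shape
    q_n = obs.mean(axis=0)
    mbd_pt = np.zeros((n, T))
    mei_pt = np.zeros((n, T))

    def pairs(k):
        return k * (k - 1) / 2.0  # number of pairs among k curves

    for l in range(T):
        rows = np.flatnonzero(obs[:, l])
        m = rows.size
        if m < 2:
            continue  # no band can be formed at this grid point
        v = X[rows, l]
        below = (v[None, :] <= v[:, None]).sum(axis=1)  # counts the curve itself
        above = (v[None, :] >= v[:, None]).sum(axis=1)
        # Pairs whose band contains the value: all pairs minus those entirely
        # above it and those entirely below it.
        mbd_pt[rows, l] = (pairs(m) - pairs(m - above) - pairs(m - below)) / pairs(m)
        mei_pt[rows, l] = above / m
    usable = obs & (obs.sum(axis=0) >= 2)[None, :]
    w = usable * phi(q_n)[None, :]
    w = w / w.sum(axis=1, keepdims=True)
    return (w * mbd_pt).sum(axis=1), (w * mei_pt).sum(axis=1)


def outliergram_parabola(mei, n):
    """Upper parabola linking MBD and MEI for n fully observed curves."""
    a0 = a2 = -2.0 / (n * (n - 1))
    a1 = 2.0 * (n + 1) / (n - 1)
    return a0 + a1 * mei + a2 * n ** 2 * mei ** 2


# Six non-crossing, fully observed curves lie exactly on the parabola.
X_full = np.tile(np.arange(6.0)[:, None], (1, 4))
mbd, mei = partial_mbd_mei(X_full)
distance = outliergram_parabola(mei, n=6) - mbd
```

Curves that cross many others fall well below the parabola, so a large value of `distance` is what signals a potential shape outlier.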
Finally, Figure 5 is like Figure 3, but we have here perturbed the samples with outliers. In particular, each of the $N = 100$ samples is composed of a total of $n = 112$ functional data, of which 100 data are generated as $X$ in Section 3.1 while 12 data are outliers, generated as $X^{\mathrm{Mag}}$ or $X^{\mathrm{Shape}}$ above, with probability 0.5 each. As in Section 3.2, to enable comparison of the proposed POIFD with depths computed on reconstructed data, we let 25% of the data be observed over the whole evaluation grid, while the remaining 75% are only partially observed. Figure 6 displays the proportion of outliers correctly identified by functional boxplot and outliergram, when computing the depth for sparse and partially observed functional data using the proposed POIFD or using standard depth after reconstruction with Goldsmith, Greven, and Crainiceanu (2013), with Kneip and Liebl (2020) and with Kraus (2015). The proportion of correctly identified outliers is significantly higher when using POIFD rather than any alternative.
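A contaminated sample of this kind can be sketched as follows. The code is an illustration under explicit assumptions: magnitude outliers are taken to be regular curves shifted by a constant drawn from Unif[2.5, 4] or Unif[-4, -2.5], shape outliers add a rough Gaussian component with $\beta = 2$ and $\zeta = 0.05$, and the value $\gamma = 2$ is kept for that component because the text does not state it. The function name, grid size and seed are hypothetical.

```python
import numpy as np


def contaminated_sample(n_clean=100, n_out=12, T=50, seed=None):
    """Regular curves followed by magnitude or shape outliers (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, T)
    dist = np.abs(t[:, None] - t[None, :])
    zero = np.zeros(T)

    def periodic_cov(beta, gamma, zeta):
        return beta * np.exp(-gamma * np.sin(np.pi * dist) ** 2 / zeta ** 2)

    mu = rng.multivariate_normal(zero, periodic_cov(3.0, 2.0, 0.5))
    eps = rng.multivariate_normal(zero, 0.5 * np.exp(-5.0 * dist), size=n_clean + n_out)
    X = mu[None, :] + eps
    labels = np.zeros(n_clean + n_out, dtype=int)  # 0 regular, 1 magnitude, 2 shape
    for i in range(n_clean, n_clean + n_out):
        if rng.random() < 0.5:
            # Magnitude outlier: constant shift, upward or downward with equal chance.
            sign = 1.0 if rng.random() < 0.5 else -1.0
            X[i] += sign * rng.uniform(2.5, 4.0)
            labels[i] = 1
        else:
            # Shape outlier: same mean plus a much rougher component
            # (gamma = 2 is an assumption, see the lead-in).
            X[i] += rng.multivariate_normal(zero, periodic_cov(2.0, 2.0, 0.05))
            labels[i] = 2
    return t, X, labels


t, X, labels = contaminated_sample(seed=3)
```

The `labels` vector makes it possible to compute the proportion of correctly identified outliers once a detection rule has been applied.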
We can also measure the improvement in the reconstruction methods proposed in Kraus (2015) and Kneip and Liebl (2020), when the outliers detected on the basis of the proposed POIFD are removed. In particular, we consider the mean square error on each functional datum, comparing the reconstructed functional datum with the completely observed datum (excluding from the computation of the mean square error the 25% of the functional data that are fully observed). In the Random Interval case, with $m = 4$, $p = 50\%$, removal of the outliers detected by the proposed POIFD improves the estimation (on average over the $N = 100$ repetitions) by 29.68% when using the technique proposed by Goldsmith, Greven, and Crainiceanu (2013), by 48.84% when using the technique by Kraus (2015) and by 63.67% when using the technique by Kneip and Liebl (2020). Analogous results are obtained in all other cases, highlighting the importance of the proposed POIFD as a crucial support to other techniques for partially observed data.
Figure 5. Same as Figure 3, but with outliers included in the dataset.
Figure 6. Outlier detection. Proportion of correctly identified outliers by functional boxplot and outliergram, when computing the depth for sparse and partially observed functional data using the proposed POIFD or using standard depth after reconstruction with Goldsmith, Greven, and Crainiceanu (2013), with Kneip and Liebl (2020), and with Kraus (2015).

Outlier Detection in German Electricity Supply Functions
The range of observability of the electricity price and its supplied quantity changes from one day to another, so that the prices as functions of the quantity are partially observed functional data. We consider here the dataset analyzed in Kneip and Liebl (2020) and Liebl (2019), available in supplemental materials A of Liebl (2019). Kneip and Liebl (2020) and Liebl (2019) deal with the problem of reconstructing the unobserved parts and testing the difference in prices before and after Germany's nuclear phase-out. To this aim, the authors discard functions corresponding to weekends and holidays as being potential outliers. In addition, they propose to delete any function with prices or supplied quantity greater than a given threshold.
Here we consider the proposed POIFD, and we use the functional boxplot for partially observed functional data and the outliergram based on the proposed partially observed modified epigraph index, to detect magnitude and shape outliers.
The upper left panel of Figure 1 shows the rainbow plot (Hyndman and Shang 2010) of the daily electricity supply functions, from March 15th, 2010 to March 14th, 2012. Each daily function is observed at 24 discretization points, one per each hour of the day. Liebl (2019) analyzes only peak hours (9am-8pm) of the working days from March 15, 2012to March 14, 2013 It should be pointed out that, from 1998, Germany has driven the phase out of the nuclear energy. However, in September 2010 the country reached a new agreement for extending the operating lives of the German nuclear power plants. This new agreement ended up with the 11th Law Amending the Atomic Energy Act of December 8th, 2010. This period corresponds to the blue-green functions with demand values higher than 75 KMWh. In March 11th, 2011, the Fukushima Daiichi nuclear disaster made Germany react with a moratorium (March 15th, 2011) and shut down 40% of its nuclear power plants (period corresponding to green-yellow functions). This policy has affected both electricity prices and supply. The bottom left panel of Figure 1 plots the proportion q n of observed data, showing low observability along all supply values (with maximum observability 0.31).
The bottom panel of Figure 7 shows the outliers detected by the functional boxplot and outliergram, based on the proposed POIFD. The outliers are 59 out of the 729 data. Among the 59 outliers, 40 correspond to working days and 19 to nonworking days (weekends and holidays); these are plotted in red (light gray in black and white printing) and blue (dark gray in black and white printing), respectively. The top panel of Figure 7 shows the chronology of the outliers. Remarkably, more than half of the outliers are located between November 2010 and February 2011, coinciding with a shift in the Government's stance on German nuclear policy. This suggests that during the days preceding and following the 11th Atomic Energy Act Amendment (blue-green spectrum functions) the electricity market reacted with a shift of the supply curve to the right due to an expected increase in the quantity produced in the following years. Our approach also automatically identifies as outliers the functional data with peaks above the threshold of 200 Euro/MWh, which are discarded by Liebl (2019). In addition to the functional data that exceed this threshold, other functions close in time are also flagged as outliers.
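To make the magnitude-outlier rule concrete, the sketch below applies the usual functional boxplot logic (a central region formed by the deepest half of the curves, inflated by a factor of 1.5) to partially observed curves. It is a simplified illustration and not the procedure implemented in fdaPOIFD: the depth values are taken as given, the envelopes are computed pointwise over the observed values only, and the names `boxplot_outliers`, `toy_curves` and `toy_depths` are hypothetical.

```python
def boxplot_outliers(curves, depths, factor=1.5):
    """Flag curves that leave the inflated central region at any observed point.

    curves: list of equal-length lists; None marks an unobserved value.
    depths: one depth value per curve (assumed to be computed elsewhere).
    """
    n = len(curves)
    n_points = len(curves[0])
    # Central region: the deepest half of the sample.
    order = sorted(range(n), key=lambda i: depths[i], reverse=True)
    central = order[: max(1, (n + 1) // 2)]

    lower, upper = [], []
    for j in range(n_points):
        vals = [curves[i][j] for i in central if curves[i][j] is not None]
        if not vals:
            # No central curve observed here: no fence at this grid point.
            lower.append(None)
            upper.append(None)
            continue
        lo, hi = min(vals), max(vals)
        spread = hi - lo
        lower.append(lo - factor * spread)
        upper.append(hi + factor * spread)

    flagged = []
    for i, curve in enumerate(curves):
        for j, value in enumerate(curve):
            if value is None or lower[j] is None:
                continue
            if value < lower[j] or value > upper[j]:
                flagged.append(i)
                break
    return flagged


# Hypothetical toy sample: the last curve has an isolated price peak.
toy_curves = [
    [1.0, 1.0, None],
    [1.2, 1.1, 1.0],
    [None, 0.9, 1.1],
    [0.8, None, 0.9],
    [5.0, 1.0, None],
]
toy_depths = [0.5, 0.6, 0.45, 0.3, 0.1]
outlier_indices = boxplot_outliers(toy_curves, toy_depths)
```

Shape outliers, which the text detects with the outliergram, are not covered by this rule.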

Classification of AneuRisk65 Data
The AneuRisk65 dataset collects the morphology of the inner carotid arteries of 65 subjects. The data are obtained from the reconstruction of three-dimensional angiographic images (see, e.g., Sangalli et al. 2009; Sangalli, Secchi, and Vantini 2014b) and are publicly available at https://statistics.mox.polimi.it/aneurisk/. The data have been collected and analyzed with the aim of evaluating the role of vascular geometry and hemodynamics on the pathogenesis of cerebral aneurysms. One statistical challenge concerning the analysis of these data is the discrimination of subjects depending on the presence and location of the cerebral aneurysms. In particular, Sangalli et al. (2010) and subsequent papers divide the subjects into two groups: the so-called Upper group, composed of those patients having the most dangerous aneurysms (that certainly are within the skull), and the so-called Lower group, composed of subjects without any visible aneurysm during the angiography or having less dangerous aneurysms (because possibly outside of the skull). As in previous studies, we focus on two geometrical features of the inner carotid artery that highly influence the haemodynamics: its radius profile and its curvature profile (proxied by the profile of the curvature of the carotid centreline). The right panels of Figure 1 display the data. These functional data are only partially observed. In fact, only a portion of the inner carotid artery of each patient is included in the angiography, with longer portions being available for certain subjects, and shorter for others, depending on where the angiographic image has been centered. In particular, the mean proportion of observed data is 62.19%. The missing portions are located only at the extremes of the domain, similarly to the simulation setting with the Common Domain case in Section 3.
Specifically, a common domain is available, where the data have been recorded for all statistical units; this corresponds to the portion of inner carotid artery that has been acquired in the angiographic image of all the 65 subjects. This common domain is indicated in the bottom right panel of Figure 1, and constitutes about 23% of the full domain.
The first studies on the AneuRisk65 data that discriminate Upper group and Lower group subjects are based on principal component scores and are restricted to the common domain where all data are recorded; see, e.g., Sangalli et al. (2009) and Sangalli, Secchi, and Vantini (2014a). Recently, Stefanucci, Sangalli, and Brutti (2018) show that, by using a principal component method for partially observed functional data, it is possible to obtain a better discrimination than the one given in Sangalli et al. (2009) and Sangalli, Secchi, and Vantini (2014a). In particular, Stefanucci, Sangalli, and Brutti (2018) consider progressive enlargements from the common domain to the full domain, and show that the best discrimination result is obtained considering a domain that is larger than the common domain but smaller than the full domain. A similar idea is investigated in Kraus and Stefanucci (2019), where, however, the authors restrict their attention to the radius profiles only.
We here aim to discriminate between the two groups of subjects by Depth versus Depth (DD) classifiers (Cuesta-Albertos et al. 2017), based on the proposed POIFD. Since these functional data are bivariate, we consider a bivariate POIFD. In particular, we use the weighted average of the univariate POIFDs for radius and curvature, according to the definition in Equation (4). This enables us to explore different weights (α, 1 − α) given to the two components, ranging from considering only the curvature (α = 0) to considering only the radius (α = 1). Moreover, we set the weight function so that it is nonnull only where the proportion of observed curves is at least q*, with 0 ≤ q* ≤ 1. This allows us to explore different portions of the domain through different values of q*, ranging from the full domain (q* = 0) to the common domain (q* = 1). In particular, we select the optimal weight given to radius and curvature (i.e., the value of α) and the optimal domain (i.e., the value of q*) by minimizing the misclassification error. This approach is very flexible and also highly informative. In fact, on the one hand it offers insights on the relative importance of each component of the data; on the other hand, it allows us to identify the portion of the domain where the discrimination between the two groups is more relevant. Specifically, we consider the leave-one-out misclassification error rate (L1ER) for quadratic discriminant analysis on the DD-plot based on the proposed POIFD considering a modified band depth. The left panel of Figure 8 shows the L1ER corresponding to different weights on radius (α) and different portions of the domain (q*). The misclassification error increases sharply for low values of the weight on radius, pointing out that the radius is determinant in the discrimination of the two groups of subjects.
Moreover, this plot highlights that neither focusing the analysis on the full domain (q* = 0) nor restricting the analysis to the common domain (q* = 1) is optimal for discrimination purposes, supporting the findings in Stefanucci, Sangalli, and Brutti (2018). In particular, the minimum classification error is achieved for an equally weighted bivariate POIFD of radius and curvature, and for q* = 0.804, corresponding to the optimal domain displayed in the bottom right panel of Figure 1, where at least 80.4% of the functional data are observed.
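The selection of the weight on radius and of the domain threshold can be sketched as a grid search. The code below is only an illustration under stated assumptions: the bivariate depth is taken to be the convex combination of the two univariate depths with weights (α, 1 − α), the domain for a given threshold is taken to be the set of grid points where the observed proportion is at least that threshold, and the error surface `toy_error` is a made-up stand-in for the leave-one-out error of the DD-classifier. All function names are hypothetical.

```python
def restrict_to_domain(q_n, q_star):
    """Indices of grid points where at least a proportion q_star is observed."""
    return [j for j, q in enumerate(q_n) if q >= q_star]


def bivariate_depth(depth_radius, depth_curvature, alpha):
    """Weighted average of univariate depths, with weight alpha on the radius."""
    return [
        alpha * d_r + (1 - alpha) * d_c
        for d_r, d_c in zip(depth_radius, depth_curvature)
    ]


def select_parameters(error_fn, alphas, q_stars):
    """Return the (alpha, q_star) pair with the smallest error on the grid."""
    best = None
    for alpha in alphas:
        for q_star in q_stars:
            err = error_fn(alpha, q_star)
            if best is None or err < best[0]:
                best = (err, alpha, q_star)
    return best[1], best[2]


# Hypothetical error surface standing in for the leave-one-out error rate.
def toy_error(alpha, q_star):
    return (alpha - 0.5) ** 2 + (q_star - 0.8) ** 2


alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
q_stars = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
best_alpha, best_q_star = select_parameters(toy_error, alphas, q_stars)

# q_star = 0 keeps the full domain; q_star = 1 keeps only the common domain.
full_domain = restrict_to_domain([0.5, 1.0, 0.5, 0.25], 0.0)
common_domain = restrict_to_domain([0.5, 1.0, 0.5, 0.25], 1.0)
```

In an actual analysis `error_fn` would recompute the depths on the restricted domain and rerun the classifier for every grid pair, which is the costly part of the search.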
The right panel of Figure 8 shows the DD-plot at the optimal values of α and q*, together with the classification regions determined by quadratic discriminant analysis. The L1ER is 13.8%, matching the best result obtained in Stefanucci, Sangalli, and Brutti (2018) with the domain selection procedure (compared to 21.54% in Sangalli et al. 2009), with an apparent error rate (APER) of 10.76% (compared to 15.38% in Sangalli et al. 2009). Table 3 in Section 8.2 in the supplementary materials reports the absolute, relative, and conditional confusion matrices according to L1ER and to APER: the misclassification errors are all lower than those found in the previous discrimination analyses performed on the AneuRisk65 data; see Sangalli et al. (2009), as well as Stefanucci, Sangalli, and Brutti (2018) and Kraus and Stefanucci (2019), which consider domain selection.
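The difference between the two error rates reported above can be illustrated with a small sketch. To remain dependency-free, it swaps quadratic discriminant analysis for a nearest-class-mean rule on the DD-plot coordinates, and it treats those coordinates as fixed rather than recomputing the depths when a subject is left out. Both are simplifications made for this example, and the toy points and function names are hypothetical.

```python
def nearest_mean_predict(train_points, train_labels, point):
    """Assign the label of the closest class mean (stand-in for QDA)."""
    groups = {}
    for p, label in zip(train_points, train_labels):
        groups.setdefault(label, []).append(p)
    best_label, best_dist = None, float("inf")
    for label, pts in groups.items():
        mean_x = sum(p[0] for p in pts) / len(pts)
        mean_y = sum(p[1] for p in pts) / len(pts)
        dist = (point[0] - mean_x) ** 2 + (point[1] - mean_y) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def apparent_error_rate(points, labels):
    """Error rate when the rule is fitted and evaluated on the same sample."""
    wrong = sum(
        nearest_mean_predict(points, labels, p) != label
        for p, label in zip(points, labels)
    )
    return wrong / len(points)


def leave_one_out_error_rate(points, labels):
    """Error rate when each point is classified by a rule fitted without it."""
    wrong = 0
    for i in range(len(points)):
        train_points = points[:i] + points[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if nearest_mean_predict(train_points, train_labels, points[i]) != labels[i]:
            wrong += 1
    return wrong / len(points)


# Hypothetical DD-plot points: (depth w.r.t. Upper, depth w.r.t. Lower).
dd_points = [
    (0.8, 0.2), (0.7, 0.3), (0.52, 0.5),   # Upper group
    (0.2, 0.8), (0.3, 0.7), (0.5, 0.55),   # Lower group
]
dd_labels = ["U", "U", "U", "L", "L", "L"]
aper = apparent_error_rate(dd_points, dd_labels)
l1er = leave_one_out_error_rate(dd_points, dd_labels)
```

In this toy sample the apparent error rate is lower than the leave-one-out one, mirroring the ordering of the APER and L1ER values reported in the text.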
Even if we restrict the analysis to the radius only, the discrimination results are better than those found so far by any of the previous analyses. In particular, the DD-classifier based on linear discriminant analysis of univariate POIFD with the Modified Band Depth provides an L1ER of 16.92% (compared to 23.20% in Kraus and Stefanucci 2019). The optimal value of q* is here 0.853, corresponding to almost the same domain found in the multivariate case, corroborating the importance of this region for classification purposes.
It should also be pointed out that our analysis, unlike previous analyses, does not require data alignment.

Discussion
This work has introduced the first notion of depth measure for partially observed functional data. The investigation of its theoretical properties will be the object of a future dedicated work. Through extensive simulation studies, we have shown the very good performances of the proposed POIFD and its superiority to alternative approaches already available in the literature, whenever the latter are applicable (i.e., in the simplified settings where at least some of the curves are fully observed, or where there is a common domain where all data are recorded). The functional boxplot and the outliergram based on the proposed POIFD are able to unmask both shape and amplitude outliers. This offers a fundamental support to the available techniques for partially observed functional data, which can be highly affected by outliers. Moreover, the proposed POIFD can efficiently be used in classification problems, as illustrated with the application to AneuRisk65 data, where a DD-classifier based on the proposed depth leads to a misclassification error lower than any of the other discrimination techniques ever applied to these challenging data.
We are certain that the proposed depth will prove highly valuable in a number of applications where partially observed functional data are encountered.

Supplementary Materials
Additional results: Extensive simulation results (Section 7). Additional output concerning the analysis of German electricity supply curves and AneuRisk65 dataset (Section 8). (AdditionalResults.pdf)
R-package: R-package fdaPOIFD, available at CRAN (Elías et al. 2021), includes the functions to compute the depth, plot the boxplot and the outliergram for partially observed functional data, as well as a vignette to reproduce the simulations.