Projection tests for linear hypothesis in the functional response model

Abstract This article concerns the linear hypothesis testing problem in the functional response model, which is one of the regression models considered in functional data analysis. In this model, the response is a function represented as a random process, while the predictors are random variables. To test the linear hypothesis, projection tests are constructed and theoretically justified. Namely, a kind of equivalence between the functional null hypothesis and its projected version is established. Different Gaussian processes and numbers of projections are considered in the implementation of new solutions. Moreover, as there is no one test having the best power for all correlation cases, a simple combining test is also proposed. It has satisfactory power in all cases. In simulation studies, the new tests are compared with existing methods in terms of size control and power. A real data example is also provided to illustrate the results.


Introduction
Functional data analysis (FDA) concerns univariate or multivariate data measured over time or space; for example, temperatures measured at different weather stations at a single time point, or temperatures measured at a single station every 10 minutes. Classically, such data are called doubly multivariate data. However, in FDA, they are appropriately represented by functions, curves, or surfaces, which helps avoid many problems in the analysis of such data. For example, the relation between the number of observations and the number of design time points at which the variables are measured (the curse of dimensionality), missing data, and correlation between observations for one subject cease to be a problem. Many classical methods of statistics have been adapted to functional data, and these are reviewed by Ferraty and Vieu (2006), Horváth and Kokoszka (2012), Kokoszka and Reimherr (2017), Ramsay and Silverman (2005), and Zhang (2013). An implementation of these methods is available in the R program (R Core Team 2021). The two main packages are fda (Ramsay, Graves, and Hooker 2020) and fda.usc (Febrero-Bande and Oviedo de la Fuente 2012). The supplement to Górecki and Smaga (2019) contains a review of R packages for FDA. These implementations have been used in many applications.
Regression analysis is one of the most widely applied statistical methods. In functional regression analysis (FRA), at least one of the dependent or independent variables is a random function. Chiou, Müller, and Wang (2004) and Kokoszka and Reimherr (2017) review several types of FRA, namely the scalar response model (the responses are scalars, the regressors are curves), the functional response model (the responses are curves, the regressors are scalars), and the fully functional model (both responses and regressors are functional objects). These models are usually linear, but some nonlinear models are also studied; for instance, Matsui (2019) considers functional logistic regression models. In this article, the functional response model is considered. By way of motivation, we mention a few real data examples which can be studied using this model. The simulations and real data applications described here concern ergonomics data and graduates data. In the first case, the right elbow angle curves of drivers are modeled by the three coordinates of the target. The second data set contains the number of university graduates recorded for different regions in several consecutive years and the scalar indices of transport accessibility of cities located in the regions. Chiou, Müller, and Wang (2004) considered the well-known medfly data set, where age-specific reproduction in terms of daily eggs laid is dependent on dietary doses.
In the functional response model, we consider the general linear hypothesis testing problem. This problem was considered earlier by, among others, Shen and Faraway (2004), Zhang (2011, 2013), and Smaga (2021), and one of its applications is in variable selection. These publications proposed tests based on aggregations of the pointwise sums of squares due to hypothesis (SSH) and error (SSE). The aggregations are made by integration or by taking a supremum of these functions. The L2-norm-based test uses the integral of the pointwise SSH, and among different approaches to the distribution of this test statistic, the naive method seems to work best (Zhang, 2013). This test will be called the L2N test for short. The F-type test statistic is a ratio of the integrals of the pointwise SSH and SSE. However, this method is outperformed by the globalizing pointwise F (GPF) test and the Fmax test, which are based respectively on the integral and the supremum of the ratio of the pointwise SSH and SSE.
In FDA, an alternative approach is based on projections. Cuesta-Albertos and Febrero-Bande (2010) used this approach perhaps for the first time, applying it to multiway analysis of variance for functional data. Extensions of that article are proposed in Cuesta-Albertos et al. (2019), Górecki and Smaga (2017), and Meléndez, Giraldo, and Leiva (2021) for goodness-of-fit testing for the scalar response model, multivariate functional analysis of variance, and one- and two-sample problems for functional data, respectively. The idea of such tests is to project the functional data onto real data, where we can apply known test procedures. In this work, we adapt this idea to the general linear hypothesis testing problem in the functional response model. We present the theoretical background for the proposed projection test and consider different choices of its hyperparameters. The simulation studies suggest that the new projection tests may outperform the known tests in terms of power for highly correlated functional data. However, no single test has the best power in all cases considered. Therefore, we also propose a simple combining test, which has optimal or almost optimal power in all scenarios.
The remainder of this article is organized as follows. Section 2 presents the general linear hypothesis testing problem in the functional response model. In Section 3, we show how to reduce this problem to the hypothesis testing problem in the usual multiple regression model. A kind of equivalence between the null hypotheses is also established there. Section 4 presents the new testing procedures in detail. In Sections 5 and 6, the simulation studies and real data application are described. Section 7 concludes this article.

Linear hypotheses in the functional response model
In this section, we formulate the general linear hypothesis testing (GLHT) problem in the functional response model (FRM).
Since the FRM is well presented in Zhang (2013, Chapter 6), we use his notation. Let $y_i$ be the response functions defined on the interval $[a, b]$, $a, b \in \mathbb{R}$, $a < b$, while $x_i = (1, x_{i1}, \ldots, x_{ip})^\top$, $i = 1, \ldots, n$, denote the $(p+1)$-dimensional vectors of predictors, which are random variables. Of course, 1 corresponds to the intercept. In the FRM, we assume that the relation between the response functions $y_i$ and the vectors of predictors $x_i$ can be described as follows:

$$y_i(t) = x_i^\top \beta(t) + v_i(t), \quad (1)$$

where $i = 1, \ldots, n$, $t \in [a, b]$, $\beta = (\beta_0, \beta_1, \ldots, \beta_p)^\top$ is the $(p+1)$-dimensional vector of unknown coefficient functions, and $v_i$ are independent and identically distributed stochastic processes with zero mean function. The processes $v_i$ are called subject-effect functions, and it is assumed that they absorb the measurement error process when the observations are measured with error. Let $y = (y_1, \ldots, y_n)^\top$ and $X = (x_1, \ldots, x_n)^\top$ be the vector of response functions and the matrix of vectors of predictors. Assume that $n > p + 1$ and the matrix $X$ is of full column rank. Using this notation, we obtain the pointwise least squares estimator of the vector of coefficient functions $\beta$:

$$\hat{\beta}(t) = (X^\top X)^{-1} X^\top y(t). \quad (2)$$

In the FRM given in Equation (1), the GLHT problem can be formulated as follows: we are interested in testing the linear hypotheses

$$H_0: C\beta(t) = c(t) \text{ for all } t \in [a, b] \quad \text{vs.} \quad H_1: C\beta(t) \neq c(t) \text{ for some } t \in [a, b], \quad (3)$$

where $C$ is a given $q \times (p+1)$ matrix of full row rank, and $c$ is a given $q \times 1$ vector of functions, which is usually a zero vector.
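On a common evaluation grid, the pointwise estimator in Equation (2) is just one ordinary least squares solve per time point, sharing the same design matrix. A minimal sketch (the function name and discretization are our own illustration; the article's implementation is in R):

```python
import numpy as np

def pointwise_ls(X, Y):
    """Pointwise least squares estimator beta_hat(t) = (X'X)^{-1} X' y(t).

    X : (n, p+1) design matrix; first column of ones for the intercept.
    Y : (n, m) responses evaluated on a common grid of m time points,
        row i holding y_i(t_1), ..., y_i(t_m).
    Returns the (p+1, m) matrix whose j-th column is beta_hat(t_j).
    """
    # Each column of Y defines one OLS problem with the same design matrix,
    # so all grid points are solved jointly in a single linear solve.
    return np.linalg.solve(X.T @ X, X.T @ Y)
```

Because $X$ does not depend on $t$, the factorization of $X^\top X$ is shared across the whole grid, which is why the pointwise estimator is cheap to compute.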
The most common linear hypotheses being particular cases of the GLHT problem are as follows. Let $0_p$, $I_p$, and $e_{l,p+1}$ be respectively the $p \times 1$ zero vector, the identity matrix of size $p$, and the $(p+1) \times 1$ vector whose $l$th element is one and the others zero, $l = 1, \ldots, p+1$. When $C = (0_p, I_p)$ and $c = 0_p$, we test the statistical significance of the FRM given in Equation (1). Rejecting the corresponding null hypothesis allows us to test the statistical significance of each coefficient function using $C = e_{l,p+1}^\top$ for $l = 1, \ldots, p+1$, and $c = 0$. These hypotheses are used in test-based variable selection methods (see, for example, Smaga, 2021; Zhang, 2013).

Projected FRM and GLHT problem
In this section, we relate the FRM and the GLHT problem to hypothesis testing in an appropriate multiple regression model. This relationship provides a kind of justification for the projection-based test presented in the next section.
Let us assume that the functions $y_1, \ldots, y_n$, $\beta_0, \beta_1, \ldots, \beta_p$, and $v_1, \ldots, v_n$ belong to the Hilbert space $L^2([a, b])$ of square-integrable functions equipped with the inner product $\langle f, g \rangle = \int_a^b f(t) g(t)\, dt$. Multiplying Equation (1) by a fixed nonzero function $w \in L^2([a, b])$, we obtain

$$y_i(t) w(t) = \sum_{j=0}^{p} x_{ij} \beta_j(t) w(t) + v_i(t) w(t), \quad (4)$$

where $i = 1, \ldots, n$, $t \in [a, b]$, and $x_{i0} = 1$. Integrating Equation (4) over the interval $[a, b]$, we obtain

$$\langle y_i, w \rangle = \sum_{j=0}^{p} x_{ij} \langle \beta_j, w \rangle + \langle v_i, w \rangle,$$

where $i = 1, \ldots, n$. Let $y_i^w = \langle y_i, w \rangle$, $\beta_j^w = \langle \beta_j, w \rangle$, and $v_i^w = \langle v_i, w \rangle$, $i = 1, \ldots, n$, $j = 0, 1, \ldots, p$. Note that $y_i^w$ and $v_i^w$ are random variables, while $\beta_j^w$ are unknown (real) parameters. Therefore, the FRM given in Equation (1) is reduced to the following multiple linear regression model:

$$y_i^w = x_i^\top \beta^w + v_i^w, \quad (5)$$

where $i = 1, \ldots, n$ and $\beta^w = (\beta_0^w, \beta_1^w, \ldots, \beta_p^w)^\top$. Of course, this reduction is conditional on the function $w$.
Now, let us focus on the GLHT problem given in Equation (3). Let $C_l$ (resp. $c_l$) be the $l$th row of the matrix $C$ (resp. the $l$th element of the vector $c$) corresponding to the $l$th linear hypothesis, $l = 1, \ldots, q$. Then, the null hypothesis given in Equation (3) is equivalent to $C_l \beta(t) = c_l(t)$ for all $l = 1, \ldots, q$ and $t \in [a, b]$. Multiplying the above equality by a function $w$, and then integrating the result over the interval $[a, b]$, we obtain

$$\sum_{j=0}^{p} C_{lj} \beta_j^w = c_l^w, \quad (6)$$

where $l = 1, \ldots, q$, $C_{lj}$ is the $j$th element of the vector $C_l$, $j = 0, 1, \ldots, p$, and $c_l^w = \langle c_l, w \rangle$. Let $c^w = (c_1^w, \ldots, c_q^w)^\top$. Observe that the hypotheses given in Equation (6) are equivalent to the null hypothesis

$$H_0^w: C\beta^w = c^w, \quad (7)$$

which is the general linear hypothesis in the regression model given in Equation (5).
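On a discretized grid, the inner products $\langle y_i, w \rangle$ and $\langle c_l, w \rangle$ that carry the FRM into the scalar regression model (5) and the projected hypothesis (7) can be approximated by quadrature. A small sketch using the trapezoidal rule (all names are illustrative):

```python
import numpy as np

def project_curves(F, w, t):
    """Approximate inner products <f_i, w> = int_a^b f_i(t) w(t) dt.

    F : (n, m) matrix of functions evaluated on the grid t of m points.
    w : (m,) values of the projection function on the same grid.
    Returns the n projected scalars, e.g. y_i^w or c_l^w.
    """
    g = F * w                     # integrand values f_i(t_k) w(t_k)
    dt = np.diff(t)
    # trapezoidal rule applied row by row
    return ((g[:, :-1] + g[:, 1:]) * 0.5 * dt).sum(axis=1)
```

With `Yw = project_curves(Y, w, t)`, the projected null hypothesis $H_0^w$ can then be tested by any standard linear-model procedure applied to the pairs $(y_i^w, x_i)$.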
To sum up the above considerations, the FRM and the GLHT problem formulated in Equations (1) and (3), respectively, imply the multiple regression model and null hypothesis given in Equations (5) and (7), respectively. This relationship is conditional on a function $w$, and the converse implication does not hold in general. However, we can establish an almost sure equivalence for the null hypotheses $H_0$ and $H_0^w$. To do this, we will follow the argument of Cuesta-Albertos and Febrero-Bande (2010). First, we recall the following result.
Theorem 3.1 (Theorem 4.1 of Cuesta-Albertos, Fraiman, and Ransford 2007). Let $H$ be a separable Hilbert space, and let $\mu$ be a Gaussian distribution on that space such that each of its one-dimensional projections is nondegenerate. Assume that $P$ and $Q$ are two different probability distributions on $H$. If $P$ is determined by its moments, then

$$\mu\big(\{h \in H : P \circ \pi_h^{-1} = Q \circ \pi_h^{-1}\}\big) = 0,$$

where $\pi_h$ is the orthogonal projection onto the one-dimensional subspace generated by $h$, and $(P \circ \pi_h^{-1})(B) = P(\pi_h^{-1}(B))$ for a Borel set $B$ of the one-dimensional subspace spanned by $h$.
By Laha and Rohatgi (1979, Sections 7.5 and 7.6), a Borel probability measure $\mu$ on $H$ is called Gaussian if each of its one-dimensional projections is Gaussian. It is nondegenerate if, in addition, each of its one-dimensional projections is nondegenerate. The characteristic function of the Gaussian distribution $\mu$ has the form

$$\varphi_\mu(h) = \exp\left(i \langle m, h \rangle - \tfrac{1}{2} \langle Sh, h \rangle\right),$$

where $h \in H$, $m \in H$ is the mean of $\mu$, and $S$ is the covariance operator of $\mu$, which is a positive, trace-class operator on $H$. An example of a projection $\pi_h : H \to H$ is given by $\pi_h(x) = (\langle x, h \rangle / \langle h, h \rangle)\, h$. Naturally, in our case, we will take $H = L^2([a, b])$. There are known conditions which guarantee that a distribution is determined by its moments. One of them, which we shall use, is the Carleman condition, namely that the absolute moments $m_r := \int \|x\|^r \, dP(x)$ are finite and satisfy $\sum_{r=1}^{\infty} m_r^{-1/r} = \infty$. Using Theorem 3.1, we prove the following result.
Theorem 3.2. Under the above assumptions about the FRM and the GLHT problem, if $\mu$ is a Gaussian distribution on the space $L^2([a, b])$ such that each of its one-dimensional projections is nondegenerate, then $C\beta \neq c$ implies that

$$\mu\big(\{w \in L^2([a, b]) : C\beta^w = c^w\}\big) = 0.$$

Proof. Since $C\beta \neq c$, for at least one $l$ we have $C_l \beta \neq c_l$, $l = 1, \ldots, q$. Let $l^*$ be such that $C_{l^*} \beta \neq c_{l^*}$, and let $P$ (resp. $Q$) be the probability distribution on $L^2([a, b])$ concentrated on $C_{l^*} \beta$ (resp. $c_{l^*}$). It is well known that the Carleman condition is satisfied by a distribution concentrated in a single point. Since $C_{l^*} \beta^w = \langle C_{l^*} \beta, w \rangle$ and $c_{l^*}^w = \langle c_{l^*}, w \rangle$, the claim follows from Theorem 3.1, and the proof is completed.

By Theorem 3.2, for $w \in L^2([a, b])$ chosen at random employing the Gaussian distribution $\mu$, if the null hypothesis $H_0$ given in Equation (3) does not hold, then for $\mu$-almost every such $w$, the null hypothesis $H_0^w$ given in Equation (7) also fails. All of this indicates a kind of equivalence between the null hypotheses $H_0$ and $H_0^w$, which justifies the projection test for the GLHT problem, to be presented in the following section.

Tests
In this section, we propose new test procedures for the GLHT problem in the FRM. Namely, we consider projection tests and a combining test.

Projection test
The relationship presented in the previous section indicates that the GLHT problem in the FRM can be reduced to solving this problem for the multiple regression model given in Equation (5). This reduction is made by projecting the functional responses $y_i$ onto corresponding real observations $y_i^w$, $i = 1, \ldots, n$. The main idea of the projection test is to project the functional responses onto the real line, and then apply an appropriate test procedure for the null hypothesis given in Equation (7). Nevertheless, there are two main drawbacks to this procedure. First, when we apply this procedure twice with different functions $w$, we can obtain different decisions. The second drawback is a potential loss of power, resulting from the fact that the whole function is replaced with just one real number, losing some information. To solve these problems, Cuesta-Albertos and Febrero-Bande (2010) suggest repeating the projection test $k$ times using $k$ random projections, and then aggregating the results according to multiple testing. The correction of the p-values can be obtained using the procedure of Benjamini and Hochberg (1995), which controls the false discovery rate (FDR; see Remark 4.1 for further explanation). Therefore, the whole procedure of the projection test is as follows:

1. Select, with Gaussian distribution, functions $w_r \in L^2([a, b])$, $r = 1, \ldots, k$.
2. Compute the inner products $y_i^{w_r} = \langle y_i, w_r \rangle$ and $c_l^{w_r} = \langle c_l, w_r \rangle$ for $i = 1, \ldots, n$, $r = 1, \ldots, k$, $l = 1, \ldots, q$.
3. For each $r = 1, \ldots, k$, apply the chosen test for testing the null hypothesis $H_0^{w_r}: C\beta^{w_r} = c^{w_r}$ using the observations $(y_i^{w_r}, x_i)$, $i = 1, \ldots, n$, obtaining the p-values $p_1, \ldots, p_k$.
4. Compute the corrected p-value of the projection test as $\min_{r=1,\ldots,k} (k/r)\, p_{(r)}$, where $p_{(1)} \leq p_{(2)} \leq \cdots \leq p_{(k)}$ are the ordered p-values $p_1, \ldots, p_k$.

We observe that the above procedure requires some further explanation. First of all, which Gaussian distribution should we use in Step 1?
Of course, there are many possibilities. Cuesta-Albertos and Febrero-Bande (2010) used standard Brownian motion, while Febrero-Bande and Oviedo de la Fuente (2012) and Górecki and Smaga (2019) considered Gaussian white noise. These two possibilities are independent of the data. On the other hand, Cuesta-Albertos et al. (2019) proposed a data-driven method, which we adapt to the FRM. First, we compute the eigenpairs $\{(\hat{\lambda}_m, \hat{e}_m)\}$ of the functional principal components (FPCs) of $y_1, \ldots, y_n$. Second, we choose

$$m_n := \min\left\{ s = 1, \ldots, n-1 : \sum_{m=1}^{s} \hat{\lambda}_m \Big/ \sum_{m=1}^{n-1} \hat{\lambda}_m \geq q \right\}$$

for a variance threshold $q$ ($q = 0.95$ by default). Finally, we generate the Gaussian process $w_{m_n} := \sum_{m=1}^{m_n} g_m \hat{e}_m$, where $g_m \sim N(0, s_m)$ and $s_m$ is the sample standard deviation of the scores in the $m$th FPC. In the simulations and real data example, we will use these three methods and call them the BM, GWN, and FPC projection tests, respectively. For more discussion of the choice of Gaussian distribution, we refer the reader to Cuesta-Albertos et al. (2019, Section 4).
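The three Gaussian processes can be sketched on a grid as follows. This is a discretized illustration only: the FPC variant uses an eigendecomposition of the sample covariance matrix as a stand-in for the functional principal components, and all function names are ours.

```python
import numpy as np

def brownian_motion(t, rng):
    """Standard Brownian motion on the grid t: cumulative Gaussian increments."""
    dt = np.diff(t, prepend=t[0])            # first increment is zero, so W(t_1) = 0
    return np.cumsum(rng.normal(0.0, 1.0, t.size) * np.sqrt(dt))

def gaussian_white_noise(t, rng):
    """Gaussian white noise: independent N(0, 1) values at the grid points."""
    return rng.normal(size=t.size)

def fpc_process(Y, rng, q=0.95):
    """Data-driven process: sum_m g_m e_m over the leading components of Y
    explaining a fraction q of the variance, with g_m ~ N(0, s_m) and s_m
    the sample standard deviation of the m-th scores."""
    Yc = Y - Y.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Yc, rowvar=False))
    order = np.argsort(vals)[::-1]           # eigenvalues in decreasing order
    vals, vecs = np.clip(vals[order], 0.0, None), vecs[:, order]
    m_n = int(np.searchsorted(np.cumsum(vals) / vals.sum(), q)) + 1
    scores = Yc @ vecs[:, :m_n]              # component scores
    g = rng.normal(0.0, scores.std(axis=0, ddof=1))
    return vecs[:, :m_n] @ g
```

The BM and GWN generators ignore the data entirely, while `fpc_process` adapts the projection directions to the observed curves, mirroring the data-driven recipe above.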
The second question is: how many projections are appropriate? This is not an easy question, and the number $k$ depends on the problem, the data, the Gaussian process chosen, etc. In the functional analysis of variance (Cuesta-Albertos and Febrero-Bande 2010; Górecki and Smaga 2017), a value of $k$ close to 30 is suggested to obtain satisfactory power of the projection test. On the other hand, for the scalar response functional regression model, Cuesta-Albertos et al. (2019) indicate that it is sufficient to take only a few random projections. In the experimental part of this article, we will study this problem by considering $k = 10, 20, \ldots, 100$. Finally, what test is to be used in Step 3? Here again, there are many possible test procedures. The choice may be preceded by an inspection of the projected data, for example, to check normality. For simplicity, we will use the classical test for the GLHT problem in the linear regression model implemented in the function linearHypothesis() of the car package (Fox and Weisberg 2019).
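Putting the pieces together, Steps 1-4 of the GWN projection test might be sketched as follows. Here `glht_pvalue` is the textbook F-test of $H_0: C\beta = c$ in a linear model, standing in for `car::linearHypothesis()`, and all names are our own illustration:

```python
import numpy as np
from scipy import stats

def glht_pvalue(X, y, C, c):
    """F-test of H0: C beta = c in the linear model y = X beta + error."""
    n, k = X.shape
    q = C.shape[0]
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)                 # error variance estimate
    d = C @ beta - c
    M = C @ np.linalg.inv(X.T @ X) @ C.T
    F = d @ np.linalg.solve(M, d) / (q * sigma2)
    return stats.f.sf(F, q, n - k)

def projection_test(Y, X, C, c_curves, t, k=30, rng=None):
    """GWN projection test: k white-noise projections, each tested with the
    F-test, aggregated via the BH-corrected p-value min_r (k/r) p_(r)."""
    rng = np.random.default_rng() if rng is None else rng
    dt = np.diff(t)
    pvals = np.empty(k)
    for r in range(k):
        w = rng.normal(size=t.size)                  # Step 1: Gaussian white noise
        gy, gc = Y * w, c_curves * w                 # Step 2: inner products
        yw = ((gy[:, :-1] + gy[:, 1:]) * 0.5 * dt).sum(axis=1)
        cw = ((gc[:, :-1] + gc[:, 1:]) * 0.5 * dt).sum(axis=1)
        pvals[r] = glht_pvalue(X, yw, C, cw)         # Step 3
    p = np.sort(pvals)                               # Step 4: BH correction
    return float(np.min(k / np.arange(1, k + 1) * p))
```

Swapping the white-noise line for a Brownian motion or FPC-based generator yields the BM and FPC variants of the test.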
Remark 4.1. Benjamini and Hochberg (1995) showed that the p-value correction used in Step 4 controls the FDR for independent tests. On the other hand, Benjamini and Yekutieli (2001) extended this result to certain dependent tests, namely those satisfying positive regression dependency (PRD). This implies that under PRD, the projection test and the combining test proposed below are at most at the desired significance level. For more detail, we refer to Cuesta-Albertos and Febrero-Bande (2010, p. 546).

Combining test
As we will see in the remainder of this article, no single test has the best power for all cases considered. The compared tests are the three projection tests and the three best tests from the previous papers, i.e., the L2N, GPF, and Fmax tests (see Section 1). In our experiments, it is shown that the power of the tests strongly depends on the correlation between functional observations at different time points. For functional data, there is usually a high correlation, but this is not always the case (see the real data example in Section 6). Therefore, in practice, the best test should be chosen after some inspection of the data, which may not be easy to perform. As a possible way of avoiding the inspection part of data analysis, we propose the combining test, which is intended to be independent of the correlation case.
Let us now construct this combining test. For highly (resp. less) correlated functional data, the GWN projection (resp. L2N) test is usually the most powerful. To construct a test with good power performance in all cases, we combine these two tests into one, called the combining test. In this test, we first perform the GWN projection test, and then the L2N test. The p-value of the combining test is $\min(2 p_{(1)}, p_{(2)})$, where $p_{(1)} \leq p_{(2)}$ are the ordered p-values of these two tests; that is, we once again apply the correction of Benjamini and Hochberg (1995). This test will not be the best in general, but it will be at least close to the best one in all scenarios.
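The combining rule is simply the Benjamini-Hochberg correction applied to two p-values; a one-line sketch (the function name is ours):

```python
def combining_pvalue(p_gwn, p_l2n):
    """Combining test p-value: BH correction of the GWN projection test
    p-value and the L2N test p-value, i.e. min(2 * p_(1), p_(2))."""
    p1, p2 = sorted((p_gwn, p_l2n))      # ordered p-values p_(1) <= p_(2)
    return min(2.0 * p1, p2)
```

For example, p-values 0.01 and 0.5 combine to 0.02, so a strong rejection by either component survives the correction.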

Simulations based on ergonomics data
As mentioned above, the new tests will be compared numerically with three existing tests in terms of size control and power. The competitors are the L2N test of Zhang (2013) and the GPF and Fmax tests proposed in Smaga (2021). These tests control the type I error quite well and have good power behavior, as was shown in Smaga (2021). Thus, they are interesting competitors, although they are tests of completely different types than the projection tests (see Section 1).
For a fair comparison, we first consider simulation studies based on the ergonomics data set, as were also conducted in Shen and Faraway (2004), Zhang (2011), and Smaga (2021) for the GLHT problem in the FRM. The data were collected by researchers at the Center for Ergonomics at the University of Michigan to investigate the motion of automobile drivers. For this purpose, the angle formed at the right elbow between the upper and lower arms of a single subject was measured three times for each of 20 locations within a test car, on an equally spaced grid of points over a period of time. For convenience, this period was rescaled to the interval [0, 1]. The dependence on time implies that these observations can be treated as functional data. Since the number of design time points varied from observation to observation, the data were reconstructed by cubic smoothing splines and then evaluated at m = 100 equally spaced design time points over the interval [0, 1]. Moreover, Zhang (2013) noted that among the 60 functional observations there was one outlier; this was removed before analysis. Figure 1 presents the ergonomics data after reconstruction.
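The reconstruction step can be illustrated by fitting a cubic smoothing spline to irregular, noisy measurements and evaluating it on a common grid. This Python sketch with `scipy.interpolate.UnivariateSpline` and synthetic data only mimics the idea; the original analysis is in R, and all choices below (the test curve, the noise level, the smoothing factor `s`) are our own assumptions:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)

# hypothetical raw measurements: irregular design time points with noise
t_raw = np.sort(rng.uniform(0.0, 1.0, 60))
y_raw = np.sin(2 * np.pi * t_raw) + rng.normal(0.0, 0.1, t_raw.size)

# cubic smoothing spline (k=3); s balances fidelity against smoothness
spline = UnivariateSpline(t_raw, y_raw, k=3, s=t_raw.size * 0.1 ** 2)

# evaluate the reconstructed curve on the common grid of m = 100
# equally spaced design time points over [0, 1]
t_grid = np.linspace(0.0, 1.0, 100)
y_grid = spline(t_grid)
```

Applying the same grid to every curve yields the common-resolution matrix of functional observations used by all the tests.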
Remark 5.1. The simulation results based on the ergonomics data set presented in this article are obtained for a level of resolution m = 100. To examine the effect of different levels of resolution on the performance of the tests, we conducted additional simulations, whose results are presented and discussed in the Supplementary Materials. The main conclusion is that the level of resolution does not seem to affect this performance.

Experimental setup
According to the above description, in the ergonomics data set we have four variables. Namely, the right elbow angle curves $y_i$ are the functional observations, while the three coordinates $(a_i, b_i, c_i)$ of the target, where $a_i$, $b_i$, and $c_i$ represent the "left to right", "close to far", and "down to up" directions, respectively, are the scalar predictors, $i = 1, \ldots, 59$. To describe the relationship between them, Shen and Faraway (2004) proposed the following quadratic FRM:

$$y_i(t) = \beta_0(t) + \beta_1(t) a_i + \beta_2(t) b_i + \beta_3(t) c_i + \beta_4(t) a_i^2 + \beta_5(t) b_i^2 + \beta_6(t) c_i^2 + \beta_7(t) a_i b_i + \beta_8(t) a_i c_i + \beta_9(t) b_i c_i + v_i(t),$$

where $i = 1, \ldots, 59$, $t \in [0, 1]$. We use this model to generate artificial functional observations according to the model given in Equation (1), with the predictors being the coordinates $a_i$, $b_i$, and $c_i$ from the ergonomics data, while the subject-effect functions and the coefficients are as specified below.
The true parameter vector $\beta$ is set to the value of the estimator $\hat{\beta}$ given in Equation (2) obtained for the ergonomics data set, with the following modification. Since we use the test procedures for testing the null hypothesis $H_0: \beta_6 = \beta_8 = \beta_9 = 0$, we set $\beta_j = \delta \hat{\beta}_j$ for $j = 6, 8, 9$. For the size study, $\delta = 0$, while for the power comparison, $\delta = 0.1, 0.2, 0.3, 0.4, 0.5$ for $\rho = 0.1, 0.3, 0.5, 0.7, 0.9$, respectively. This choice guarantees that the powers of the tests are easy to compare. The above null hypothesis is a particular case of the null hypothesis given in Equation (3), with $C = (e_{7,10}, e_{9,10}, e_{10,10})^\top$ and $c(t) = 0_3$ for $t \in [0, 1]$. The empirical size and power of the tests are computed as proportions of rejections of the null hypothesis among 1,000 simulation samples. For the bootstrap GPF and Fmax tests, 1,000 bootstrap samples were used to estimate the p-values. For simplicity, the significance level $\alpha$ is set to 5%. The procedure was implemented in the R program (R Core Team 2021), and the code is available from the author.
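The empirical size computation, i.e., the proportion of rejections among simulated samples generated under the null hypothesis, can be sketched for any test that returns a p-value. Purely as an illustration of the Monte Carlo loop (here a plain t-test of a single slope in a scalar regression, not the article's functional tests):

```python
import numpy as np
from scipy import stats

def empirical_size(n_sim=1000, n=30, alpha=0.05, seed=0):
    """Proportion of rejections of H0: beta_1 = 0 when H0 is true."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n)
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    rejections = 0
    for _ in range(n_sim):
        y = rng.normal(size=n)                # beta_1 = 0, so H0 holds
        beta = XtX_inv @ X.T @ y
        resid = y - X @ beta
        se = np.sqrt(resid @ resid / (n - 2) * XtX_inv[1, 1])
        p = 2.0 * stats.t.sf(abs(beta[1] / se), n - 2)
        rejections += p <= alpha
    return rejections / n_sim
```

A well-calibrated test should produce a value near the nominal level $\alpha = 0.05$; the power study replaces the null samples with data generated under $\delta > 0$.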

Simulation results
The simulation results are presented in Figures 2-3 (Setting 1) and Figures 1-4 (Settings 2-3) in the Supplementary Materials. First of all, we can observe that they are very similar for all three settings, which suggests a certain robustness of the tests to the type of distribution. One noticeable exception is that the L2N test is slightly too liberal in Setting 2, i.e., for a heavy-tailed distribution.
Size control (Figure 2, Figures 1-2 in the Supplementary Materials). In general, the known tests control the type I error quite well. However, under high correlation, the L2N test may tend to over-reject the null hypothesis. On the other hand, the L2N and GPF tests may be very conservative in the case of low correlation. The three projection tests never exceed the upper limit on the empirical size, i.e., 6.4%. Their sizes usually decrease as the number of projections k increases. However, after about k = 30 or 40, they stabilize and exhibit conservative behavior. The exception is the GWN projection test, which has higher empirical sizes for low correlation (ρ = 0.9) than for the other cases, but still controls the type I error level. The combining test usually has empirical sizes between those of the L2N and GWN projection tests, since it is a mixture of these. It reduces the highly liberal or conservative behavior of its components.

Power (Figure 3, Figures 3-4 in the Supplementary Materials). First of all, the empirical power of each test decreases as the correlation of the functional data decreases (i.e., as ρ increases). Next, we can observe that the most powerful tests depend on this correlation. For highly correlated functional data (ρ = 0.1, 0.3), the GWN projection test and the combining test have the highest power. On the other hand, the L2N, GPF, and combining test procedures usually outperform the remaining ones for moderate (ρ = 0.5) and low correlation (ρ = 0.7, 0.9), where the GWN projection test has lower power. In the case of very low correlation (ρ = 0.9), the BM and FPC tests also have very high power, but this does not hold in the other cases. The power of the best projection test is usually stable as the number of projections k changes. A slight exception can be observed for highly correlated functional data, where the power stabilizes after about k = 30 or 40. The same is true for the combining test. For the worst projection test, the power may rapidly increase (e.g., the GWN test for ρ = 0.9) or slowly decrease (e.g., the FPC test for ρ = 0.3).
Recommendation. To sum up, since functional data are usually highly correlated, the GWN projection test and the combining test can be recommended for practical analysis. However, the combining test has an advantage over the GWN test in that its power is close to that of the best test under lower correlation. The number of projections may not matter, but if it does, it is suggested to take k = 30 or greater.
Remark 5.2 (Variable selection). In Smaga (2019), variable selection methods using the test procedures were investigated for the FRM. Their performance was strongly dependent on the power of the tests; the higher the power, the better the ability to detect the significant predictors. Including the new tests in such an investigation does not change this conclusion, and the results of variable selection experiments are similar to the power results given above. Thus, we omit them. However, we mention that backward elimination with the GWN projection test and the combining test (resp. the BM and FPC tests) gives the best (resp. worst) choice of predictors for the ergonomics data set, in the sense of the conditional prediction error defined in Chiou, Müller, and Wang (2004, Section 4). This is expected, since the ergonomics data are quite highly correlated, in which case the GWN projection test and the combining test are the most powerful.

Illustrative real data example
The results of the GWN projection test and the combining test (resp. the BM and FPC tests) applied to the ergonomics data set are similar to those of the Fmax (resp. L2N) test obtained in Smaga (2021), and therefore they are not shown (see also Remark 5.2). For this reason, we present here another real data example, which reveals some practical aspects of the new tests. Namely, we use the data investigated by Krzyśko et al. (2018). They considered several variables describing higher education in the 16 Polish regions (voivodeships) in the period from 2002 to 2016. These data were taken from the Local Data Bank (https://bdl.stat.gov.pl), which is a large Polish database containing data on demographics, the economy, the environment, public finance, society, etc.

For extremely low correlation (ρ = 0.9), this test strongly outperforms the remaining test procedures. Nevertheless, in general, the recommended tests are the GWN projection test for high correlation and the combining test for all cases except extremely low correlation. Now, we are ready to interpret the p-values given in Figure 6. As we have already mentioned, the graduates data are less correlated than the ergonomics data, which implies that the former have moderate or low correlation. This is confirmed by the test results. Namely, the L2N, combining, and GPF tests reject the null hypothesis at significance level 5%. Moreover, the BM and FPC projection tests have p-values close to the significance level, so they are on the boundary between rejection and non-rejection. The p-values for different numbers of projections are stable for these two test procedures. On the other hand, the p-values of the GWN projection test increase with the number of projections, and do not suggest rejection of the null hypothesis. The same result is obtained by the Fmax test, which has an even greater p-value than the GWN test. Such a distribution of p-values and decisions resulting from the tests was expected, in view of the simulation results. In particular, the L2N, combining, and GPF (resp. GWN and Fmax) tests are confirmed to have better (resp. worse) power in the cases of moderate and low correlation. Finally, we can say that the number of university graduates per 1000 inhabitants depends on the index.

Conclusions
In this article, we have studied the projection and combining tests for the general linear hypothesis testing problem in the functional response model. Different choices of the hyperparameters of the projection test were investigated, namely the Gaussian process used to generate the projected data and the number of projections. Among the different Gaussian processes used, Gaussian white noise resulted in the best projection test when measurements of the functional data at any two different time points were highly correlated. This is an important finding, since functional data are usually highly correlated in this way. Nevertheless, there are examples of moderate or low correlation in functional data, as described in this article. In such cases, the Gaussian white noise-based projection test and the other projection tests were far from having optimal power, and the L2N and GPF tests seemed to perform better. However, the combining test overcame the dependence on the correlation. Roughly speaking, it is the Benjamini-Hochberg correction of the results of the L2N test and the Gaussian white noise-based projection test. Thus, the combining test uses the best of these two tests and has power at least close to that of the best test for each correlation case. The number of projections did not usually have an important effect on the power of the projection and combining tests; the power of the projection tests for different numbers of projections was stable. However, there were some exceptions, based on which we can indicate k = 30 as a potentially good choice, similarly as in the literature. There are other possible ways to modify the proposed tests. For particular experiments, we can consider different Gaussian processes, combinations of tests, and tests applied to the projected data. These may give rise to interesting problems for future research.

Figure 1. The ergonomics data set after reconstruction by cubic smoothing splines and removal of the outlier.

Figure 4. The graduates data set: numbers of university graduates per 1000 inhabitants.

Figure 6. P-values (as percentages) for all tests applied to the graduates data set. The tests are coded as follows: 1 - L2N, 2 - GPF, 3 - Fmax, 4 - GWN, 5 - BM, 6 - FPC, 7 - combining test. k denotes the number of projections, and the thick horizontal line represents the 5% significance level.