Discovering spatiotemporal flow patterns: where the origin–destination map meets empirical orthogonal function decomposition

ABSTRACT Flows are usually represented as vector lines from origins to destinations and can reflect the movements of individuals or groups in space and time. Revealing and analyzing the spatiotemporal flow patterns are conducive to understanding information underlying movements. This paper proposes a new method called the OD – EOF (Origin – Destination – Empirical Orthogonal Function) to discover important spatiotemporal flow patterns on the premise of maintaining the pairwise connections between origins and destinations. We first construct a spatiotemporal flow matrix that contains connection information between origins and destinations and temporal flow information by adding a temporal dimension to the OD map. Then, we decompose the spatiotemporal flow matrix into spatial modes and corresponding time coefficients by EOF decomposition. The decomposition results depict the prominent spatial distribution of and temporal variation in flows, with most of the spatiotemporal characteristics highly concentrated into the first few spatial modes. The method is evaluated by five synthetic datasets and a user study and subsequently applied to analyze the impact of the COVID-19 pandemic on the spatiotemporal patterns of human mobility in China during the Spring Festival travel rush in 2020 and 2021. The results show the prominent spatiotemporal patterns of human mobility during these periods under the influence of the COVID-19 pandemic outbreak and the normalization of pandemic prevention and control.


Introduction
Animal and human movements are fundamental to the environment and society (Miller et al., 2019;Schlapfer et al., 2021). The analysis of animal and human movement data is of great importance to understanding spatiotemporal dynamics of ecosystems and cities . Revealing the spatiotemporal patterns of animal and human movement is key to studying the details of animal and human behaviors (Dodge et al., 2016;Gkoulalas-Divanis & Bettini, 2018). Movement data can be represented as flows that concern OD (origin -destination) movement pairs but ignore their trajectories  and have been applied in various domains. For example, population flows are studied to predict pandemic diffusion and improve pandemic control and interventions to respond to outbreaks (Bengtsson et al., 2011;Tizzoni et al., 2014), intraurban goods flows are analyzed to understand urban dynamics and structures (Zhao et al., 2018), vehicle theft-recovery flows are investigated to develop effective police responses to criminal activities (Tao & Thill, 2016), and network-constrained taxi flows are used to detect taxi patterns on a road network at multiple scales (Kan et al., 2021).
For flow visualization, flow maps, location-based methods, and matrix-based methods are mainly used (Yang et al., 2017;Zhu et al., 2019). Flow maps connect OD flow pairs directly by lines or arrows, but spatial occlusion easily occurs as the number of flows increases (Jenny et al., 2016;Phan et al., 2005). To avoid occlusion, location-based visualizations map flow location measures but ignore the connections between OD pairs (Guo et al., 2012;Koylu & Guo, 2013). Matrixbased visualizations construct an OD matrix to visualize flows (Mohammad et al., 2004;Voorhees, 2013). Each matrix cell represents a flow from an origin to a destination, retaining the pairwise connections between flow origins and destinations. The limitation of this method type is that the geographic background of flows is lost. To overcome this limitation, an OD map, a geographically embedded matrix visualization method, is proposed that uses a two-dimensional nested structure to preserve the approximate spatial layout (Wood et al., 2013). Flows, whose origins and destinations have both spatial and temporal attributes, usually span various spatial and temporal dimensions simultaneously, making spatiotemporal flow patterns difficult to discover (Long et al., 2018). The OD map is suitable only for analyzing spatial flow patterns, not spatiotemporal ones (Boyandin et al., 2011). Although the temporal dimension can be nested in the OD map, doing so leads to lower readability and a higher cognitive burden, so spatiotemporal flow patterns are difficult to see. The empirical orthogonal function (EOF) can effectively extract the spatiotemporal flow patterns with a few spatial modes and their corresponding time coefficients due to its rapid expansion and convergence speed (Hannachi et al., 2007;Lorenz, 1956). Although the EOF has been successfully used to reveal spatiotemporal flow patterns, existing studies have usually decomposed the origin and destination matrices of flow trips separately, severing the OD pair connections, which are the most important and distinctive information set for flows. To address this problem, we integrate the OD map with the EOF to reveal the spatiotemporal flow patterns.
This paper proposes an OD -EOF method. It first organizes spatiotemporal flows into a spatiotemporal matrix by the OD map with an added temporal dimension. It then decomposes the matrix into spatial modes and their corresponding time coefficients by EOF decomposition. The decomposition results reveal several important spatiotemporal flow patterns. We evaluate the effectiveness of the OD -EOF method with five synthetic datasets and a user study and apply it to a realworld dataset. The rest of this paper is organized into the following sections. Section 2 describes the visualization and spatiotemporal analysis of flows. Section 3 provides a detailed description of the OD -EOF method. Sections 4 and 5 test the effectiveness of the method with synthetic flow data and a user study. Section6 presents a case study on population migration in China during the Spring Festival in 2020 and 2021, and Section 7 constitutes our discussion and conclusion.

Flow visualization
Flow maps, location-based methods and matrix-based methods are generally used to visualize flows. Traditional flow maps connect origins and destinations with straight or curved lines whose width or color represent the flow volume Schöttler et al., 2021). Such maps are effective for visualizing small flow datasets, but even in relatively small datasets, flow maps have the problems of spatial occlusion and cluttered displays, making them difficult to read (Guo, 2009). Appropriate visual variable design can help to improve the readability and aesthetics of flow maps based on line symbols, such as line styles (e.g. straight or curved, with or without arrowheads, etc.), line shapes (e.g. thick, thin or varying) or line color gradients (Dong et al., 2018a;Koylu & Guo, 2016). However, these options still cannot completely solve the occlusion problem, especially for large datasets. With the widespread use of large flow datasets, flow map generalization methods have been proposed to reduce the number of flows substantially with a high degree of abstraction by means of filtering and aggregation (Dodge & Noi, 2021). Nevertheless, filtering selected flows causes difficulty in providing visual overviews (Phan et al., 2005), while flow aggregation leads detail loss and long flows covering shorter flows (Cui et al., 2008;Danny & Jarke, 2009).
Location-and matrix-based methods do not map flows directly to bypass the problem of spatial occlusion (Zhu et al., 2019). Location-based flow visualizations display movement characteristics at different places and times by calculated summary statistics such as the net flow ratio in different time periods (Guo et al., 2012). This method provides observations based on flow location features (Koylu & Guo, 2013). Its limitation is the loss of connections between origins and destinations.
Matrix-based flow visualizations depict and organize flows through the construction of an OD matrix (Mohammad et al., 2004) whose rows and columns represent origins and destinations, respectively, and where the number of flows is used to color the matrix cells (Wilkinson & Friendly, 2009). In addition, ordering and encoding techniques enhance the practicality of the OD matrix in dealing with large datasets (Guo & Gahegan, 2006). The OD matrix is limited by the lack of geographic context. To overcome this limitation, an OD map is proposed to preserve (as much as possible) the spatial layout of flows in the geographical background and to address the spatial occlusion problem while retaining the connections between origins and destinations (Wood et al., 2013). In addition, MapTrix is proposed to provide geographic embedding by adding an origin and a destination map, whose locations are connected to the corresponding row and column in an OD matrix by leader lines (Yang et al., 2017). However, both the OD map and MapTrix are suitable for analyzing only spatial flow patterns, not spatiotemporal patterns. Although temporal dimensions can be added by nesting them directly into the OD map or the OD matrix of MapTrix, a cognitive burden easily results.
To support the flow analysis in the temporal dimension, time-series maps, multiple maps, space -time cubes and clustering methods have been applied. Timeseries maps can be displayed through heatmaps to visualize the temporal variation in the flow magnitudes (Boyandin et al., 2011), or they can be generated directly to uncover dynamic patterns over large spatiotemporal extents (Koylu & Kasakoff, 2022). Multiple maps can represent time-variant flows either temporally by map animation or spatially by multiple small maps (Andrienko et al., 2011;Boyandin et al., 2012). Spacetime cubes use different operations to analyze spatiotemporal flows (Bach et al., 2014). Clustering methods can cluster temporal flow snapshots based on their similarity in different periods (Von Landesberger et al., 2016). Although these methods can be used to explore the temporal flow dimension, they all have difficulty revealing important temporal patterns. In addition, as study areas and periods increase, the spatiotemporal patterns become difficult to analyze.

Spatiotemporal flow analysis
Spatiotemporal flow clustering, factor analysis, and matrix and tensor decomposition can be used for spatiotemporal flow analyses. Appropriate methods should be selected depending on the research purpose. Spatiotemporal flow clustering can be used to group flows that are spatially and temporally close in a cluster based on a spatiotemporal similarity measurement Xiang et al., 2019) or a stepwise strategy (Yao et al., 2018). This method can reveal significant spatiotemporal trends in flows from large flow datasets, but it cannot capture the continuous time changes in flows or reveal overall temporal flow patterns. Factor analysis has been applied to link each flow destination group to its strongly connected set of common flow origins (Goddard, 1970;, but such analysis require some assumptions about the factors (Zhou et al., 2021). Matrix and tensor decompositions can be adopted to extract the major spatiotemporal characteristics from the flow data (Miettinen, 2009;C. Xu et al., 2020), such as principal component analysis (PCA) (Bueso et al., 2020;Demšar et al., 2013), nonnegative matrix factorization (NMF) (Dong et al., 2018b;Gao et al., 2019;Pang et al., 2018), singular value decomposition (SVD) (Cao et al., 2019), eigen decomposition (Y. Xu et al., 2019;Zhou et al., 2021), and tensor decomposition (Cao et al., 2020;Tang et al., 2019). These decomposition methods have been successfully employed to uncover the spatiotemporal patterns of taxi trips and cycling flows by decomposing the matrices and tensors of origins and destinations, respectively. However, the methods do not retain the pairwise connection information between origins and destinations in flows, so important flow information is lost.

Method
We present an OD -EOF method to discover spatiotemporal flow patterns. Figure 1 illustrates the basic process of our method. Figure 1 shows how the spatiotemporal flows are first organized into the OD maps with different time slices (t 1 ; t 2 ; . . . ; t n ) according to the flow origin time. Element value N i represents the flow value of the i th grid in each OD map. Then, these OD maps are transformed into a spatiotemporal flow matrix by numbering the grids. N ij represents the flow value of the i th grid at time t j in the spatiotemporal matrix. Finally, by EOF decomposition, the spatiotemporal flow matrix is decomposed into spatial modes and their associated time coefficients to discover important spatiotemporal flow patterns.

Part 1 flow organization
We construct a spatiotemporal flow matrix by extending the OD map to the temporal dimension to specify the flows from the spatial and temporal dimensions. An OD map is a two-dimensional nested structure in which the destination grids are nested within the origin grids. Therefore, the first-dimensional grids represent the flow origins, and the second-dimensional grids represent flow destinations. Then, each flow can be paired with the corresponding, unique, two-dimensional grid in the OD map (Wood et al., 2013). The OD map can be transformed into a DO map by swapping the origin and destination dimensions. The OD and DO maps allow for two-way comparisons (Yang et al., 2017), the OD map is easier to visually observe origin areas and the DO map is easier to visually observe destination areas.
In the OD map, as the first-dimensional grids represent the origins, all flows from the same origin area are in the same first-dimensional grid. In the DO map, the firstdimensional grids represent the destinations, so all flows to the same destination area are in the same firstdimensional grid. Figure 2 demonstrates two methods for constructing a grid in the OD or DO map. Red represents the flow from Area A to Area B, and blue represents the converse. The columns and rows of the constructed grid are numbered to easily represent each grid cell. For example, grid cell (a, 1) indicates that the grid is located in Column "a" and Row "1." The first construction method constructs the grid by dividing the geographic space directly into a regular grid with the same numbers of rows and columns ( Figure 2a). In geographic space, the red flow moves from grid cell (b, 1) to cell (b, 2), and the blue flow moves from cell (b, 2) to cell (a, 1). In the OD map, the red flow is represented by cell (bb, 12), and the blue flow is represented by cell (ba, 21). In the DO map, the red flow is represented by cell (bb, 21), and the blue flow is represented by cell (ab, 12). The second construction method uses a spatially ordered treemap, in which each cell represents an area, and the relative spatial arrangement of cells is preserved according to the geographic area locations. The blank cells are deleted so the meaningful cells can be observed more easily (Wood & Dykes, 2008) (Figure 2b). In the OD map, the red flow is represented by cell (ab, 12), and the blue flow is represented by cell (ba, 21). In the DO map, the red flow is represented by cell (ba, 21), and the blue flow is represented by cell (ab, 12). The first construction method is applicable to discrete flows between locations (see Appendix B for an example), while the second is more suitable for aggregated flows between areas (e.g. administrative areas; see Section 6 for an example).
After the temporal dimension is added to an OD map, a three-dimensional OD map is constructed, as shown in Figure 3a) Each flow can be regarded as f row ; f col ; t j À � , referring to the flow of row f row and column f col in the OD map at time t j . Then, the threedimensional OD map is transformed to a twodimensional matrix, namely, the spatiotemporal flow matrix (ST Flow m�n ), as shown in Figure 3b) In ST Flow m�n , the spatial part of the three-dimensional OD map is transformed to be one-dimensional. m is the number of grid cells in the OD map, and n is the number of time slices. Element value N ij of FID i ; t j À � has the same meaning as previously, determining the color and color depth of the grid cell.

Part 2 spatiotemporal decomposition
EOF decomposition (Lorenz, 1956) has been widely employed in meteorology and climatology (Fang et al., 2015), geology (Chang & Chao, 2011), oceanology (Hou et al., 2014), and epidemiology . The EOF addresses the problem of data sparsity by a dimensionality reduction, which decomposes the spatiotemporal field into its spatial modes and associated time coefficients, revealing the main low-dimensional characteristics (Y. Chen et al., 2011). It also provides a robust numerical framework for extracting dataset patterns (Neha & Pasari, 2021). Through EOF decomposition, a two-dimensional spatiotemporal flow matrix (ST Flow m�n ) is decomposed into spatial modes S m�p and time coefficients T p�n in Formula (1). Spatial modes reflect the spatial distribution of flow values, and time coefficients reflect the temporal variation in flow values. p is the number of typical spatial modes. Thus, the main spatiotemporal characteristics of ST Flow m�n can be fitted through these p spatial modes and associated time coefficients (Hannachi et al., 2007).
The general steps for an EOF decomposition are as follows: Step 1 Calculate the covariance matrix (C m�m ) of ST Flow m�n .
Step 2 Calculate the eigenvalues and spatial modes by the Jacobi method (Rutishauser, 1966).
where Λ m�m is a diagonal matrix with eigenvalues (λ 1;2;...;m ) as diagonal elements, which are normally sorted in descending order (λ 1 > λ 2 > . . . > λ m � 0). S m�m constitutes all of the spatial modes, and each column represents a spatial mode. Each nonzero eigenvalue λ corresponds to a spatial mode in S m�m . The values of the eigenvalues represent the weights of their corresponding spatial modes, denoting the variance explained.
Step 3 Calculate the variance contribution rate of the k th spatial mode (v k ) and the cumulative variance contribution rate of the anterior k spatial modes v k reflects how much important information in the dataset can be explained by the k th spatial mode. The spatial mode with the highest v k is the first mode. cv k reflects the cumulative information that can be explained by the anterior k spatial modes. Then, choose anterior p spatial modes to explain at least the amount of fixed variance, e.g. 85% (p < m). Thus, S m�p with p typical spatial modes can be obtained.
Step 4 Project S m�p onto ST Flow m�n . Time coefficients T p�n can be calculated according to transformed Formula (6) of Formula (1). Each row consists of the time coefficients corresponding to their spatial modes.
Each spatial mode is visualized by an OD map whose cells reflect the values of the spatial mode component. The positive (negative) time coefficients suggest a positive (negative) correlation between the time coefficients and the spatial modes. When the time coefficient and the cell are both either positive or negative, the absolute value of the time coefficient (or the cell) is larger, and the flow value reflected by the cell is higher.
In particular, because EOF decomposition is orthogonal, the spatial modes are independent of each other, and so are the time coefficients. Therefore, different spatiotemporal flow patterns can be found by analyzing the spatial modes and the corresponding time coefficients after EOF decomposition.

Synthetic data
We conduct experiments with five sets of synthetic flow data to verify the ability and effectiveness of the OD -EOF method in extracting the spatiotemporal characteristics. To simplify the flow organization, we use OD maps to organize the five sets of synthetic flow data. The first OD map construction method is adopted to construct the grid. For the synthetic flow datasets, the flow values of the first time slice are the initial flow values, and the flow values vary over 30 time slices. The initial values are displayed in the grid cells in the OD map. We consider the random and the aggregate patterns of initial values in space (Wood et al., 2013) as well as the same and the different temporal variation patterns. The spatial patterns are shown in Figure 4. To generate the random pattern of the initial values in space, we generate 10,000 flows with uniformly distributed random origins and destinations and represent them using the OD map (Figure 4a). The initial values of the cells in the OD map are the flow numbers. To generate the aggregate pattern of the initial values in space, we generate 10,000 flows with uniformly distributed random destinations and Gaussian distributed origins around the center point and represent them using the OD map ( Figure 4b). The initial values of the cells in the OD map are the flow numbers. To generate the same temporal variation pattern in time, we design a temporal variation curve and assign it to all cells. To generate the different temporal variation patterns, we design two temporal variation curves and assign one to some of the cells and another to the other cells. Combining the spatial patterns of the initial values and the temporal variation patterns, we generate five synthetic datasets.
The five synthetic flow datasets are shown in Figure 5. The initial values of the grid cells are displayed in the OD maps, and the temporal variations of cells are shown as temporal variation curves.
Dataset 1: The random pattern of the initial values in space is combined with the same temporal variation pattern (Figure 5a).
Dataset 2: The aggregate pattern of the initial values in space is combined with the same temporal variation pattern (Figure 5b).
Dataset 3: The random pattern of the initial values in space is combined with the different temporal variation patterns. The two temporal variation curves are shown in Figure 5c). The blue curve is assigned to the cells marked with blue star, and the black curve is assigned to the other cells.
Dataset 4: The aggregate pattern of the initial values in space is combined with the different temporal variation patterns. The blue curve is assigned to the cells marked with blue star, and the black curve is assigned to the other cells (Figure 5d).
Dataset 5: In this mixed case, the flow directions marked with blue star are the converse of those marked with green star. These cells are assigned a high initial value of 200, and the initial values of the other cells are random. There are two temporal variation curves. The blue curve is assigned to the cells marked with blue star, and the black curve is assigned to the other cells ( Figure 5e).
The spatiotemporal patterns of the five synthetic flow datasets are obtained by the OD -EOF method. The variance contribution rates of the top two spatial modes are shown in Table 1, and the spatial modes and their corresponding time coefficients are shown in Figures 6-9. Since all of the grid cells in Datasets 1 and 2 have the same temporal variations, the dominant spatiotemporal characteristics in these two datasets are almost all included in their first modes, whose variance contribution rates are as high as 100%. Datasets 3, 4 and 5 are all explained by two spatial modes because some of the grid cells have different temporal variations, but the variance contribution rates of their first modes are all   large (greater than 95%), indicating that most of the grid cells have the same temporal variation. For Datasets 1 and 2, the OD maps of the first modes are essentially identical to the spatial distribution of their initial values (Figure 6), that is, the maps show the random (Figure 6a) and aggregate patterns (Figure 6b) of the initial values, respectively. The variation in their time coefficients is completely consistent with the temporal variation in the original data. This result indicates that under the same temporal variation, our method can extract the typical spatial distribution and temporal variation.
The first modes of Datasets 3 and 4, both represent the same temporal variation of most grid cells, as most cells in the first modes have small differences in the positive values in the OD maps, as shown in Figures 7  and 8. The second mode of Dataset 3 highlights eight randomly distributed red cells, and the second mode of Dataset 4 highlights eight aggregated red cells whose absolute value is relatively higher than that of the blue cells. The temporal variations in the highlighted cells in Datasets 3 and 4 show a downward trend. This is consistent with the preset spatiotemporal characteristics of Datasets 3 and 4 that have the random and aggregate patterns of initial values in space, respectively. The highlighted cells have different temporal variations compared to the other cells. This result suggests that when the temporal variations are different our method not only extracts the same spatiotemporal characteristics from most of the cells but also identifies other important spatiotemporal characteristics in some of the cells.
For Dataset 5, in the first mode, most of the grid cells are red in the OD map. This result reflects the same temporal variation of most of the cells, and the cells marked with green star are highlighted with dark red. In particular, the cells marked with blue star, whose flow direction is opposite to that of the cells marked with green star, are blue, which indicates the different temporal variations of former from that of the other cells. In the second mode, the cells marked with blue star are    highlighted in dark red (Figure 9). The temporal variation of these cells is discovered in the second mode and is different from that of most of the cells in the first mode, consistent with the preset spatiotemporal characteristics in Dataset 5. This result suggests that our method can distinguish the different temporal variations in flows with opposite flow directions. Thus, by applying our method to these typical synthetic flow datasets, the extracted spatiotemporal patterns match the preset spatiotemporal characteristics of the original data. This result indicates that our method is useful and effective in revealing hidden spatiotemporal patterns in flow datasets.

User study
We conduct a user study to evaluate the effectiveness and efficiency of the OD -EOF method in revealing spatiotemporal patterns. The OD -EOF method is compared with the traditional OD map and EOF decomposition. For spatiotemporal flows, the traditional OD map nests the temporal dimension directly in the OD map, while EOF decomposition decomposes the flow matrices of origins and destinations separately. The user study is conducted via a questionnaire on native HTML5 webpages.

Questionnaire design
In the questionnaire, synthetic flow datasets are visualized using the three methods. The participants are asked to answer the three questions for each method to test whether they could accurately identify the spatiotemporal patterns. In real applications, the spatiotemporal flow patterns are usually complex. Thus, we use a synthetic flow dataset of mixed cases in the questionnaire. To avoid interference from the former method on the latter method when completing the questionnaire, three similar sets of synthetic flow data are preset and randomly assigned to the three methods.
In the questionnaire, the same single-choice questions (Q1, Q2 and Q3) are set for the three methods to compare the results of answering these questions based on the three methods. Q1 and Q3 are designed for the temporal pattern tasks, while Q2 is for the spatial pattern task. In addition, Q1 is for the common temporal flow pattern, while Q2 and Q3 are for special flows and their special temporal pattern (see Appendix A for questionnaire details).

Participants
We have 30 valid responses. The participants are found at universities, all have an academic background of undergraduate or above, and each participant receives a gift after completing the questionnaire. While the participant homogeneity can affect the universality of results due to the limitation of participants' sources, it also improves the statistical effectiveness (Dong et al., 2018a). The user study has a within-subject design in which each participant is asked to answer the questions on all methods to avoid the influence of individual differences. As most participants are unfamiliar with these methods, they are introduced in detail to each method before answering the questions. Since the OD -EOF method is proposed based on the principles of the OD map and EOF decomposition, there is a certain order for the participants in learning these methods, which requires learning the OD map and EOF decomposition first and the OD -EOF method second. Thus, we adopt the order in which the participants learn these methods (OD map, EOF decomposition, and OD -EOF method) as the order of methods in the questionnaire to ensure that the participants' thinking is clear and not confused when answering the questions. When the participants answer the questions, we find that they spend much more time reading the visualization results of these methods than reading the questions. Therefore, when answering the questions, the order of methods has little effect on the results of the participants' response time. In addition, the OD -EOF method requires the longest learning time even though it has the shortest response time, and the OD map has a longer learning time than EOF decomposition.

Results
Response accuracy and time are the evaluation indicators. We use all response times, as they did not differ significantly between the correct and incorrect answers to each question. The results show that the differences in response times among the methods for all questions are statistically significant (Kruskal-Wallis test, all p < 0:001). The pairwise comparisons of the response times among the methods all have significant differences (Dunn-Bonferroni test, all p < 0:050), except for that between EOF decomposition and the OD -EOF method on Q2 (Dunn-Bonferroni test, p ¼ 0:520). Figure 10 shows the response accuracy and time of the three methods for each question. For the temporal pattern task, on Q1, all methods perform well, and the response time of the OD -EOF method is the shortest. On Q3, the response accuracy of the OD -EOF is high, and its response time is the shortest. In particular, the response accuracy of EOF decomposition and the OD -EOF method is the same on both Q1 and Q3, but the response time of EOF decomposition is slightly longer than that of the OD -EOF method. This result indicates that EOF decomposition requires a longer time for visualizing temporal patterns than does the OD -EOF method. The reason could be that the participants are confused about the separate results of the flow origins and destinations obtained by EOF decomposition, so the participants require more time to answer the question. These results reveal that the OD -EOF method is the most effective and efficient for visualizing the common and special temporal flow patterns. EOF decomposition is the second most effective, and the OD map is the worst. For the spatial pattern task, the response accuracy of the OD -EOF method is the highest, and its response time is short. These features suggest that the OD -EOF method is more effective and efficient for visualizing special flows. The OD map and EOF decomposition perform poorly on this task, demonstrated by a low response accuracy.
In summary, the participants can use the OD -EOF method to visualize spatiotemporal flow patterns more effectively and efficiently than they can using the other two methods. The OD map is the least efficient method in all tasks, and visualizing special flows and their special temporal pattern is difficult. These problems are probably due to the low readability and cognitive burden caused by adding the temporal dimension to the OD map. EOF decomposition performs well in the temporal pattern tasks, but it performs the worst in the spatial pattern task. This result could be attributed to the fact that EOF decomposition decomposes the origin and destination flow matrices separately rather than decomposing the entire flow matrix. These three visualization methods must be studied beforehand to understand the results, especially for participants unfamiliar with the methods.

Human mobility case study during the Spring Festival travel rush in 2020 and 2021
The Spring Festival is the most solemn and important traditional festival in China. The Spring Festival travel rush, which occurs around the Spring Festival, is the largest periodic human mobility event in China, with college students and migrant workers being the main migrant population (Lai & Pan, 2020). Due to the outbreak and development of the COVID-19 pandemic, COVID-19 prevention and control policies in the initial outbreak and normalization stages may have had a certain impact on the spatiotemporal patterns of human mobility during the Spring Festival travel rush in 2020 and 2021. By comparing the spatiotemporal patterns of human mobility during these periods, this section aims to analyze the impact of the COVID-19 pandemic on the spatiotemporal patterns of human mobility at a provincial scale to support the development of future COVID-19 pandemic prevention and control policies.
The dataset is obtained from Baidu Migration Big Data (https://qianxi.baidu.com/). It includes the migration scale index data and the migration proportion data for 31 provinces in China from January 10 to March 15 in both 2020 and 2021. As Taiwan, Hong Kong and Macau have no data, these areas are not analyzed. The migration scale index data record the scale of the immigrant or emigrant populations in each province. The migration proportion data of a certain province record the proportion of the immigrant (emigrant) populations from (to) different provinces. We obtained the migration intensity between any two provinces by multiplying the migration scale index data and the migration proportion data. Our method adopts the second OD map-construction method to construct the grid (Ai et al., 2016), as shown in Figure 11. We organize the migration intensity data into a spatiotemporal flow matrix. Using the OD -EOF method, several considerable spatial modes and the corresponding time coefficients are obtained.
The variance and cumulative variance contribution rates of the top two spatial modes in 2020 and 2021 are shown in Table 2. The cumulative variance contribution rates of the top two spatial modes are as high as approximately 90% and 86% for 2020 and 2021, respectively. Thus, the most typical characteristics of human mobility in the dataset are contained in these two spatial modes. Their first modes alone account for a very large proportion of the total variance (approximately 86% and 71% for 2020 and 2021, respectively), representing the dominant spatiotemporal patterns. Their second modes represent the important spatiotemporal patterns in addition to the first modes. In particular, the variance contribution rate of the first mode for 2020 is much higher than is that for 2021. The higher variance of the first mode for 2020 suggests that this mode can explain more spatiotemporal information, indicating that the diversity of human mobility patterns in 2020 was lower than that in 2021. This difference may be related to the measures issued by the government after the outbreak of the COVID-19 pandemic in 2020, which greatly reduced people's mobility. The human mobility pattern in 2021 is relatively scattered and random because of the gradual recovery of travel activities after the COVID-19 pandemic. Figures 12-15 show the OD and DO maps of the first and second modes and their corresponding time coefficients for 2020 and 2021. In the spatial modes, the grid cells with positive (negative) values are shown in red (blue). Darker colors indicate greater absolute cell values. The spatiotemporal flow matrix is decomposed into the form of multiplying the spatial matrix by the time coefficient matrix. Therefore, when the time coefficients are positive (negative), the red (blue) cells, especially the dark red (blue) cells, represent strong human mobility. Figures 12 and 13 illustrate the first modes and the corresponding time coefficients for 2020 and 2021. Both time coefficients demonstrate a remarkable conversion from positive to negative around the time of the Spring Festival in 2020 and 2021. However, a large difference remains between the time coefficients for 2020 and 2021. In 2020, as shown in Figure 12, the time coefficients drop sharply to negative values after reaching a peak on January 21 and gradually become stable at a lower level of negative value after January 25 (Spring Festival in 2020). In addition, most of the grid cells are red on the OD and DO maps for 2020, indicating a sharp decline in human mobility after January 21 and a stable low human mobility level after the Spring Festival. This result is because Chinese health authorities adopted the strictest prevention and control measures for the COVID-19 pandemic on January 22. Wuhan City, Hubei, implemented the city closure measure on January 23, and by January 26, 17 cities in Hubei      had successively been put under lockdown . The outbreak of the COVID-19 pandemic also pushed people to cancel their travel plans. These factors all resulted in a large reduction in human mobility. In addition, to prevent further spread of the COVID-19 pandemic caused by returns to work or school after the Spring Festival, policies were adopted. These included delaying school opening times or launching online teaching and requiring companies and factories to postpone the resumption of work (Z. Chen et al., 2020). Thus, human mobility after the Spring Festival remained at a low level. Therefore, the first mode of 2020 suggests that human mobility patterns were deeply affected by the COVID-19 pandemic.
For 2021, as shown in Figure 13, the dark red cells in the OD map are mostly distributed in the central areas of the southeast coastal provinces (such as Guangdong, Jiangsu, Zhejiang and Shanghai). In addition, the time coefficients for the period before the Spring Festival remain at a high positive value and are relatively stable. These results indicate the strong human mobility pattern of returning home from the southeast coastal provinces to the central inland provinces and the temporal dispersion of returns home before the Spring Festival. This result is because people were required to adhere to the principle of staggered travel and batched holidays. The dark blue cells in the DO map are mainly concentrated in the central areas of the southeast coastal provinces, and the time coefficients reach their peak negative value on February 18, indicating a strong human mobility pattern from the central inland provinces to the southeast coastal provinces after the Spring Festival. Most people started to resume work at a similar time when the Spring Festival holiday ended on February 17. In particular, the absolute values of the time coefficients drop sharply for February 26, indicating the sharp decline of human mobility on this day. This result is because February 26 is the Lantern Festival, a day when the whole family gathers at home to celebrate instead of going out. Then, the time coefficients reach a small negative peak for the period after February 26, indicating that human mobility was strong after the Lantern Festival. This result is because some people choose to resume work and return to school after the Lantern Festival. The rush of returns home before the Spring Festival and returns to work or school after the Spring Festival indicates a reverse human mobility pattern around the Spring Festival, which is affected by the migrant composition. This population consists mainly of college students and migrant workers: their large-scale mobility determines the dominant human mobility patterns of the Spring Festival travel rush (Lai & Pan, 2020). Thus, the first modes of 2021 reflect the gradual recovery of human mobility patterns. Figures 14 and 15 illustrate the second modes and corresponding time coefficients for 2020 and 2021. In the OD and DO maps for 2020, almost all the grid cells in Hubei and Beijing are red, and the time coefficients of 2020 are mostly positive for the period before February 20, which reveals the special human mobility patterns of Hubei and Beijing. Hubei, the center of the COVID-19 pandemic outbreak in China, received aid in the form of medical care and supplies from various provinces. Beijing is the capital of China and serves as its political headquarters and cultural center. In particular, the time coefficients for dates before February 20 showed a certain fluctuation and several small peaks on January 25, February 4 and February 11. This result indicates strong human mobility at these three times because China repeatedly sent medical workers to Hubei for medical treatment and allocated multiple batches of central disaster relief materials for the mass transfer, resettlement and isolation observation of people. The time coefficients slowly begin to show higher negative values after February 20, and most of the cells are blue in the OD and DO maps for 2020, which indicates the diversity of human mobility. This result is because China reported a progress milestone made in controlling the COVID-19 pandemic on February 20, namely, that the number of new cases was decreasing steadily, which encouraged some resumption of human mobility as people returned to work or school in some areas. Thus, the second mode of 2020 represents human mobility patterns under the influence of the COVID-19 pandemic before February 20 and the recovery of various human mobility patterns after February 20.
As shown in Figure 15, the time coefficients have a staggered positive and negative distribution. Therefore, the second mode of 2021 demonstrates diverse and atypical human mobility patterns in space and time. These patterns include the human mobility patterns of not only returning home, to work or school, but also of sightseeing and tourism.

Discussion and conclusion
Flows can represent individual or group movements that concern the origins and destinations of flows. Flows span various spatial and temporal dimensions simultaneously, but the traditional OD map is suitable for analyzing only spatial flow patterns, not spatiotemporal patterns. While the temporal dimension can be added to the OD map directly, low readability and a cognitive burden result, and the spatiotemporal flow patterns are difficult to discover. To address these problems, we introduce EOF decomposition into the OD map with an added temporal dimension and propose an OD -EOF method of revealing the spatiotemporal flow patterns. We organize the flow data into a spatiotemporal flow matrix by adding the temporal dimension to the OD map. Then, the spatial modes and corresponding time coefficients are obtained by EOF decomposition to interpret the spatial distribution of and temporal variation in flows. We evaluate the OD -EOF method with five synthetic flow datasets and a user study. We then apply this method to discover the spatiotemporal patterns of human mobility in China during the Spring Festival travel rush in 2020 and 2021. The decomposition results demonstrate important spatiotemporal patterns of human mobility under the impact of the COVID-19 pandemic outbreak and the normalization of COVID-19 pandemic prevention and control. Thus, the results are helpful as a reference and support for the future promulgation and implementation of COVID-19 pandemic prevention and control policies. For example, we find that human mobility in 2020 declined sharply and remained at a stable low level after the Spring Festival. This result suggests that in severe cases, lockdowns can be implemented in the worst affected provinces, and policies such as extending the Spring Festival and advocating home quarantining can also be implemented to reduce mobility. Human mobility in 2021 remained at a stable high level before the Spring Festival. This result suggests that in stable cases, policies such as staggered travel and batched holidays are still needed to control the COVID-19 pandemic.
To preserve the connections between flow origins and destinations, this paper constructs an OD -EOF method to analyze spatiotemporal flow patterns. The OD -EOF method could be suitable for long periods but not for a large number of areas (Coşkun et al., 2020). As the number of areas increase, the visual analysis of the decomposition results will become more difficult. This problem might be solved by grouping small areas that are closely related (Goddard, 1970), using interactive methods to display some areas selectively (Yang et al., 2017), or setting a threshold for the spatial mode values to filter some areas . These methods should be studied in future work.
The OD -EOF method can also be extended to other decomposition methods based on the type of the flow data and the research perspective. For example, when the hot spot origin and destination areas of flows need to be extracted at the same time, SVD can be considered. The SVD results require multiple singular vectors to explain, and the interpretability of a single vector is not strong (Cao et al., 2019). When nonnegative constraints are required to decompose the flow matrix into different comparable patterns, NMF can be adopted. The NMF parameter needs to be determined by the residual sum of squares, multiple experiments or other methods (Dong et al., 2018b;Gao et al., 2019). For handling multidimensional flow data, tensor decomposition can discover meaningful flow patterns in multiple dimensions and can reflect the intensity of the dimension connections, but the analysis of the results can be complicated (Cao et al., 2020;Tang et al., 2019).

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
The work was supported by the National Natural

Data availability statement
The data that support the findings of this study are openly available in Baidu Migration Big Data at https://qianxi.baidu. com and Didi Chuxing GAIA Initiative at https://gaia.didi chuxing.com.