Classification of circulation types over the Eastern Mediterranean using a self-organizing map approach

In this study an Artificial Neural Network called a Self-Organizing Map (SOM) is used to classify the synoptic circulation over Europe, and especially over the Eastern Mediterranean. The classification of circulation types is an effective way of summarizing and describing the atmospheric circulation, and it is useful in climatology because it provides a better understanding of the climatic variability over an area. Here, the SOM methodology is applied to winter daily geopotential height anomalies of the 500 hPa level for the period 1971–2000. Twelve unique circulation patterns are identified. Eight of these types are characterized as cyclonic, representing 61% of the total days examined, and four types are characterized as anticyclonic, representing 39% of the study period. The results of this classification are comparable to those of other objective classifications applied to the same study region and give a similar picture. Therefore, the SOM methodology is found to be applicable and useful in the classification of circulation types.


Introduction
This study presents a new approach to circulation-type weather classification. A weather classification is a useful tool in climatology, since it provides a better understanding of the climatic variability over a certain area. Both manual and automatic techniques have been used. In manual procedures the classification is performed exclusively by one researcher, depending only on their judgment. The main disadvantages of these methods are that they are not easily replicated and that they require a lot of time; at the same time, they allow physical interpretation of the climate variability, since the researcher has gained familiarity with the synoptic patterns of the weather maps. Manual methods are therefore still used in contemporary research and circulation-type identification (Hess & Brezowsky, 1969; Kassomenos, Flocas, Lykoudis, & Petrakis, 1998; Lamb, 1972; Maheras, 1988, 1989). In objective techniques, computer programs determine the circulation types and divide the data into categories, avoiding the negative aspects associated with subjective approaches (Anagnostopoulou, Tolika, & Maheras, 2009; Maheras, Patrikas, Karakostas, & Anagnostopoulou, 2000). Automatic classifications are generally based on statistical techniques such as Principal Component Analysis and Cluster Analysis (Luterbacher, Xoplaki, & Maheras, 1998; Maheras, 1988), and more recently on fuzzy logic techniques and neural networks (Bárdossy, Duckstein, & Bogàrdi, 1995; Michaelides, Liassidou, & Schizas, 2007). Automatic methods are easily applicable, but they also have limitations and restrictions related to the data used and to the physical interpretation of the results.
In this study an Artificial Neural Network called a Self-Organizing Map (SOM) is used to classify the synoptic circulation over Europe, and especially over the Eastern Mediterranean, during winter. This method has recently been gaining popularity because it has several advantages: it improves the visualization of results, it updates the neighbors of the best-matching node together with that node, and it effectively represents a continuum of synoptic situations, in contrast to the discrete ones produced by most traditional methods (Hewitson & Crane, 2002).

Methods
The SOM methodology is an unsupervised learning process that codifies large, multivariate datasets onto a 2-dimensional array, or map, called the SOM, with the aim of discovering patterns in the data (Kohonen, 2001). The initial step in the SOM routine is to define a random distribution of nodes within the data space. Each node is defined by a reference vector of weighting coefficients, where each coefficient is associated with a particular input variable. Thus, each node has an associated reference vector equal in dimension to the input data. As each data record is presented to the SOM, the similarity between the data record and each of the node reference vectors is calculated, usually as a Euclidean distance. The node whose reference vector is most similar to the data record is the Best Matching Unit (BMU). The reference vector of the BMU is then modified so as to reduce the difference from the input vector by some user-defined factor, or learning rate (Hewitson & Crane, 2002). In the final product the nodes have become ordered, meaning that nodes that are close together in the map represent similar patterns, because each input data sample trains both the BMU and the nodes surrounding it.
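The training loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation used in the study (which relied on the R Kohonen package). The function names, the Gaussian neighbourhood, the linear decay of the learning rate and radius, and all default parameter values are assumptions made for the example.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Return the grid position of the node whose reference vector is closest to x."""
    return min(nodes, key=lambda pos: math.dist(nodes[pos], x))


def train_step(nodes, x, learning_rate, radius):
    """Present one data record: pull the BMU and its map neighbours toward x."""
    bmu = best_matching_unit(nodes, x)
    for pos, weights in list(nodes.items()):
        grid_dist = math.dist(pos, bmu)  # distance on the 2-D map, not in data space
        if grid_dist <= radius:
            # Gaussian neighbourhood (an assumption): the BMU gets the full update.
            influence = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
            nodes[pos] = [w + learning_rate * influence * (xi - w)
                          for w, xi in zip(weights, x)]
    return bmu


def train_som(data, rows=3, cols=4, epochs=20, lr0=0.5, radius0=1.5, seed=0):
    """Train a rectangular SOM from randomly initialised reference vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = {(r, c): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        shrink = 1.0 - epoch / epochs          # linear decay (an assumption)
        lr = lr0 * shrink
        radius = max(radius0 * shrink, 0.5)    # never below "BMU only"
        for x in rng.sample(data, len(data)):  # shuffled pass over the records
            train_step(nodes, x, lr, radius)
    return nodes
```

In this sketch each data record would be one day's anomaly field flattened into a list, and `train_som(data)` returns a dictionary that maps each (row, column) position of the 3 × 4 map to its reference vector.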
There are many decisions that have to be made by the researcher, such as the SOM grid size and the topology (rectangular or hexagonal). In this study the selection of those parameters was based on both subjective and objective criteria. An objective criterion is the computation and evaluation of two types of error, namely the quantization error and the topographic error. The first measures the average distance between each data vector and its BMU (the SOM node to which it is assigned), evaluating how well the neural map fits the original data. A small quantization error therefore indicates good clustering. Topology preservation is more difficult to evaluate; it is usually assessed through the topographic error, which is the proportion of all data vectors for which the first and second best-matching units are not adjacent on the map. The lower the topographic error, the better the SOM preserves the topology (Uriarte & Martín, 2005).
In the present study, the aim of which is to try a new classification method and compare it to existing ones, the 3 × 4 size was chosen so that the result is readily comparable to the 12 circulation types defined by Anagnostopoulou et al. (2009) for the same study region. The topology of the SOM was set to rectangular, since this is the one most frequently used in the literature (Liu, Weisberg, & Mooers, 2006). Finally, the weights were randomly generated in the initialization process, and many runs were performed so that the one with the minimum errors could be chosen.
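The two error measures can be illustrated with a short Python sketch. It assumes a rectangular map stored as a dictionary from (row, column) positions to reference vectors, and it treats two units as adjacent when they share an edge (a 4-neighbour rule). Both the data layout and the adjacency rule are assumptions of this example; some implementations also count diagonal neighbours.

```python
import math


def two_best_units(nodes, x):
    """Return the grid positions of the best and second-best matching units for x."""
    ranked = sorted(nodes, key=lambda pos: math.dist(nodes[pos], x))
    return ranked[0], ranked[1]


def quantization_error(nodes, data):
    """Average Euclidean distance between each data vector and its BMU."""
    total = 0.0
    for x in data:
        bmu, _ = two_best_units(nodes, x)
        total += math.dist(nodes[bmu], x)
    return total / len(data)


def topographic_error(nodes, data):
    """Share of data vectors whose two best units are not adjacent on the map."""
    def adjacent(a, b):
        # 4-neighbour rule on a rectangular grid (an assumption of this sketch).
        return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

    misplaced = sum(1 for x in data if not adjacent(*two_best_units(nodes, x)))
    return misplaced / len(data)
```

Both functions return values that are smaller for better maps, so they can be compared across several training runs of the same grid size.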
In this paper, the winter geopotential height anomalies $(x_i - \bar{x})/\mathrm{stdev}$ were used for the 500 hPa isobaric level. The 500 hPa level was chosen because it presents a strong relationship with surface variables (Tolika, Maheras, Vafiadis, Flocas, & Arseni-Papadimitriou, 2007) and, at the same time, it strongly affects the weather over an area. Furthermore, anomalies were chosen over absolute geopotential values in order to eliminate the seasonality of the data and to get a clearer image of the atmospheric circulation over the study area (Anagnostopoulou et al., 2009). Therefore, the positive centers correspond to higher geopotential heights compared to the seasonal mean values and the negative centers correspond to lower geopotential heights. The 500 hPa level is commonly used in the classification of circulation types (Anagnostopoulou et al., 2009; Huth, 1996; Kostopoulou, 2003; Kyselý & Huth, 2006; Maheras et al., 2000; Michaelides et al., 2007; Wallace & Gutzler, 1981), although other isobaric levels, as well as different thickness levels, could be used, depending on the aims of the study.
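Reading the anomaly as the daily value minus the mean, divided by the standard deviation, the calculation for a single grid point could look like the following Python sketch. The function name is hypothetical, and the use of the population standard deviation and of a single mean over all supplied days are assumptions; the text does not specify either choice.

```python
import statistics


def standardized_anomalies(values):
    """Return (x_i - mean) / stdev for a series of geopotential heights."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation (an assumption)
    return [(v - mean) / stdev for v in values]


# Toy heights in geopotential metres, not data from the study.
toy_heights = [5400.0, 5500.0, 5600.0]
toy_anomalies = standardized_anomalies(toy_heights)
```

Positive values then mark days with geopotential heights above the mean and negative values mark days below it, which matches the interpretation of positive and negative centers given in the text.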

Data
The SOM methodology is applied to winter daily anomalies of geopotential height at the 500 hPa isobaric level of the atmosphere, provided by the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) Reanalysis project on a uniform 2.5° × 2.5° grid spacing for the broader Europe area (Kalnay et al., 1996). The exact study area is 40° W to 70° E and 15° N to 80° N (Figure 1). The time period examined is 1971–2000.
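As a rough check on the size of each input vector, the number of grid points can be computed directly. This sketch assumes that the study area spans 40° W to 70° E and 15° N to 80° N on a 2.5° grid and that both boundaries are included; the helper name is hypothetical.

```python
def grid_axis(start, stop, step):
    """Return evenly spaced coordinates from start to stop, both inclusive."""
    n = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n)]


lons = grid_axis(-40.0, 70.0, 2.5)   # 40° W to 70° E, west negative
lats = grid_axis(15.0, 80.0, 2.5)    # 15° N to 80° N
n_points = len(lons) * len(lats)     # length of one daily input vector
```

Under these assumptions there are 45 longitudes and 27 latitudes, so each daily field presented to the SOM would contain 1215 values.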

Discussion and conclusions
The SOMs presented in the Main Map products show the 12 circulation patterns that dominate during winter over Europe for the period 1971–2000. Each of the SOMs represents a unique pattern based on the 500 hPa geopotential height anomalies. A description of each of the 12 circulation patterns provided by the SOM methodology can be found in Table 1. The patterns can be divided into two groups, characterizing cyclonic and anticyclonic circulation over the study region of the Eastern Mediterranean. In more detail, among the 12 SOM nodes, the one with the highest frequency is an anticyclonic node (SOM node 2), accounting for 14% of the total number of days examined. The cyclonic SOMs have almost the same frequencies, ranging from 5% (SOM node 5) to 9% (SOM node 3). The total percentage of cyclonic patterns is higher than that of the anticyclonic ones during winter. In particular, 61% of the days examined belong to the eight cyclonic SOM patterns obtained here (SOM nodes 1, 3, 5, 7, 8, 10, 11 and 12), while 39% of the days are classified into the four anticyclonic SOMs (SOM nodes 2, 4, 6 and 9).
Figure 1. Reference map of the study area.
The results of this study are consistent with other studies using different classification methods for the Eastern Mediterranean. Maheras and Anagnostopoulou (2003) applied an automated circulation classification scheme to Greece for the period October to March, using the 500 hPa geopotential height anomalies. The percentages of cyclonic (60.8%) and anticyclonic types (39.2%) found are almost identical to the ones presented in this study. Furthermore, Anagnostopoulou et al. (2009) used a flexible automated approach which is well established in any region of the Mediterranean, resulting in frequencies of winter cyclonic and anticyclonic types very similar to the ones found in the present study. The average anticyclonic percentage for two different regions of the Eastern Mediterranean, Greece and Cyprus, is 31.6% (29.8% and 33.5%, respectively). The corresponding cyclonic percentage is 68.4%. According to the results of this study, winter days in the Eastern Mediterranean during the 1971–2000 period can be classified into 12 unique circulation patterns. Each pattern is characterized by homogeneous weather conditions, while the differences among the patterns are considered to be relatively marked.
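Frequencies of this kind follow from counting how many days are assigned to each node. The Python sketch below shows the bookkeeping, using the cyclonic and anticyclonic node groups listed in the text. The function names are hypothetical and the day-to-node assignments are toy values chosen for illustration, not results of the study.

```python
from collections import Counter

CYCLONIC = {1, 3, 5, 7, 8, 10, 11, 12}   # node groups as listed in the text
ANTICYCLONIC = {2, 4, 6, 9}


def node_frequencies(day_nodes):
    """Return the percentage of days assigned to each of the 12 SOM nodes."""
    counts = Counter(day_nodes)
    total = len(day_nodes)
    return {node: 100.0 * counts.get(node, 0) / total for node in range(1, 13)}


def group_share(freqs, group):
    """Sum the percentages of all nodes that belong to one group."""
    return sum(freqs[node] for node in group)


# Toy assignments (one node number per day), for illustration only.
toy_days = [2, 2, 3, 5, 9, 12, 1, 2, 4, 7]
freqs = node_frequencies(toy_days)
cyclonic_share = group_share(freqs, CYCLONIC)
anticyclonic_share = group_share(freqs, ANTICYCLONIC)
```

Because every node belongs to exactly one of the two groups, the two shares always add up to 100%.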
Finally, the SOM approach could be a promising method in Environmental Sciences, providing a descriptive tool for summarizing atmospheric circulation, extremes and risk assessment. The SOM could serve as the link between large-scale atmospheric circulation and environmental risk management, using extreme climatic values. In particular, the output maps of this study could be further associated with climatic parameters such as temperature, precipitation, humidity and cloudiness, and used in environmental risk assessment to predict the occurrence of extreme weather events, such as heavy rainfall that could result in floods, absence of rainfall causing drought, and high temperatures and humidity resulting in heat waves.
Table 1. Description of SOM nodes' patterns (letters C and A in parentheses correspond to cyclonic and anticyclonic patterns, respectively).
SOM node 1 (C): An extended cyclonic center is found over Europe, surrounded by high geopotential height anomalies. The atmospheric circulation over the eastern Mediterranean is of a westerly-southwesterly component.
SOM node 2 (A): Extended anticyclonic center over the eastern Mediterranean, in contrast to an extended and deep cyclonic center located over northern Europe and the north Atlantic. The flow at the 500 hPa level is mainly from the east.
SOM node 3 (C): Cyclonic center over the study region combined with a strong anticyclonic center over western Europe and the central Atlantic. Another anticyclonic zone is found to the east.
SOM node 4 (A): An anticyclonic zone extended over south Europe and the Mediterranean. The 500 hPa flow is mainly zonal.
SOM node 5 (C): Cyclonic center over the eastern Mediterranean and eastern Europe combined with a typical North Atlantic Oscillation (NAO) pattern to the west (negative NAO phase: negative anomalies over the Azores and positive over Iceland).
SOM node 6 (A): Extended anticyclone over central and south Europe. Cyclonic center over northern Europe (Scandinavia–Iceland). The prevailing flow at the 500 hPa level for the Eastern Mediterranean is of the northeasterly sector.
SOM node 7 (C): Cyclonic zone over the study region with an anticyclonic ridge to its north.
SOM node 8 (C): A cyclonic center is located over eastern Europe and the Mediterranean, surrounded by anticyclonic centers. All centers are very pronounced. The flow over the study region is of a northern direction.
SOM node 9 (A): Anticyclonic centers prevail above the greatest part of Europe, including the study region. A cyclonic center is found over the Iberian Peninsula and another one over the Middle East. Similar to SOM node 6 but weaker.
SOM node 10 (C): Cyclonic centers over the eastern Mediterranean, central and northern Europe combined with anticyclonic centers over the Iberian Peninsula and Asia. Northwestern flow above the study area.
SOM node 11 (C): Cyclonic conditions above the whole of Europe except Scandinavia. Zonal flow above the Mediterranean.
SOM node 12 (C): A pronounced cyclonic center over the Mediterranean–South Europe–North Africa, surrounded by anticyclonic centers.

Software
The SOM methodology was applied to the data using the R software and the Kohonen package (Wehrens & Buydens, 2007). All the mapping was performed in ESRI ArcGIS 10. Finally, the graph presented was created in MS Excel.