Learning-based dynamic ticket pricing for passenger railway service providers

This article proposes a data-driven dynamic ticket pricing methodology for passenger railway service providers. There is a finite purchasing horizon, and the ticket prices should be set under varying conditions to influence customer booking behaviour. A three-step process including machine learning and optimization tools is employed to maximize the revenue under a constrained train capacity. First, a multi-layer perceptron artificial neural network (MLP-ANN) model is proposed to predict the demand intensity due to seasonal situations using the ticket reservation data. Then, regression models serving as price elasticity functions are used to quantify the effects of price, seasonal conditions and competition on the company’s sales. Finally, a nonlinear integer programming model is proposed to maximize the total revenue in the purchasing horizon. The results of the numerical studies on the Fadak Five-Star Trains’ reservation data indicate that the proposed methodology has strong potential to improve the service provider’s revenue.


Introduction
Revenue management (RM) applies optimization tools to maximize revenue under a constrained capacity, based on perceiving, predicting and influencing consumer behaviour. In some services, such as airline and passenger railway seats, significant revenue may be lost if the capacity is not sold appropriately and on time. Talluri and Van Ryzin (2004a) performed a general review of RM methods and applications. They stated that RM addresses three main classes of demand-based decision, namely, structural decisions (e.g. negotiations, auctions, segmentation mechanisms, volume discounts and cancellation), price decisions (e.g. setting prices, discounts and bundling) (Eghbali-Zarch, Taleizadeh, and Tavakkoli-Moghaddam 2019) and quantity decisions (e.g. capacity allocation, channels and release time).
One of the main branches of RM is pricing, which plays a substantial role for a business in conducting its competitive strategies (Keyvanshokooh et al. 2013). Dynamic pricing is a specific price strategy, in which prices frequently change owing to new conditions in supply and demand. It is also called real-time pricing (RTP) because it should respond to real-time demand changes (Tao et al. 2020). A suitable pricing approach may be taken according to customers' behaviour, competitors and costs. When any of these market parameters changes, the best price may change (Westermann 2006).
Transportation is believed to play a vital role in improving a country's economic situation, and rail transportation has notable advantages over other transportation modes in safety and long-distance transportability (Alikhani-Kooshkak et al. 2018; Alikhani-Kooshkak et al. 2019). RM methods were primarily studied and used in air transportation. Accordingly, a significant share of current studies in RM tools and techniques is devoted to the airlines.
RM in rail transportation is categorized into railway passenger RM and railway freight RM. Since the problem described in this article lies in the passenger rail transportation field, the literature review (Section 2) focuses on studies of railway passenger revenue management (RPRM) only. Although dynamic pricing has been investigated in different applications, especially airlines, it has been neglected in passenger rail transportation. RM was first used in railway transportation in the 1990s, while, in prior decades, pricing in railways followed a traditional method based on the travelled distance.
Setting prices dynamically by human judgement alone may be inefficient because the sales expert should consider seasonality, the number of empty seats, competitors' prices and many other parameters at any moment. The problem can be more complex when the problem dimensions, such as the number of trains, coaches, service classes and purchasing horizons, increase. Thus, in such situations, the application of artificial intelligence and machine learning can be advisable. This article proposes a dynamic ticket pricing expert system for a railway passenger service provider. The study consists of statistical analyses and machine learning tools for quantifying the demand intensity and the price-demand relationships. A simple mathematical model is proposed for the single-train dynamic pricing problem under competition among the trains of the passenger rail service providers. Several numerical cases from an Iranian five-star train are presented to analyse the performance of the proposed method. Therefore, the contributions of this study are as follows:
• A machine learning tool is proposed to forecast the demand intensity from sales data.
• Classification approaches are used for partitioning the whole sales data to study price-sales relationships.
• Regression models predict sales amounts under different market conditions and in the presence of competition.
• A mathematical model for dynamic pricing is proposed to optimize train revenue under operational constraints.
• Several real cases from a long-distance train are investigated to evaluate the performance of the methodology.
The rest of this article is arranged as follows. In Section 2, the RM literature in RPRM is reviewed. Section 3 characterizes the main problem and its specifications. Section 4 presents the proposed data-driven methodology for railway ticket dynamic pricing. Section 5 illustrates the results obtained by using the proposed approach for the case study. Finally, conclusions and ideas for further research are presented in Section 6.

Literature review
RM problems in rail passenger transportation are mainly associated with finding optimal prices and seat capacity allocation. Rail transportation service providers and the owners invest in trains and coaches to provide the capacity required to satisfy demand. Capacity allocation in RPRM refers to assigning available seats or coaches to the suitable route, train, service class and fare class. Setting the optimal number of coaches for a rail system needs a trade-off between the cost of owning coaches and the potential costs or lost opportunity cost related to not assigning suitable coaches. Since allocating the seats to different fare classes (i.e. multi-fare) can be interpreted as a pricing problem, they are not distinguished as two completely independent problems. In multi-fare RPRM, You (2008) proposed a two-fare multi-leg model to extend the single-fare model introduced by Ciancimino et al. (1999) and used linear programming (LP) and particle swarm optimization algorithms. Terabe and Ongprasert (2006) applied an LP model for seat allocation in various origin-destinations of high-speed trains. They offered three objective functions that maximize the revenue and the number of passengers and minimize the number of lost customers. Dutta and Ghosh (2012) proposed an RPRM system including forecaster, optimizer and simulator tools. They used an LP model to allocate seat capacity in a rail network owing to the expected marginal seat revenue (EMSR).
Although ticket pricing is a critical concern for railway service providers, studies on setting optimal prices for railway passenger transportation are rare, and RPRM problems are mainly limited to some types of capacity allocation. After analysing the price elasticity of demand, Bharill and Rangaraj (2008) proposed a pricing model, which considers overbooking and cancellation. Cirillo, Hetrakul, and Toobaie (2011) used a multinomial logit model (MNL) for Amtrak Acela Express to prepare the passenger choice model of booking time. They determined the passenger demand in response to prices by linear regression and used nonlinear models to maximize the predicted revenue. Xiaoqiang, Lang, and Jin (2017) proposed a dynamic pricing model for traveller groups on Chinese trains. They assumed that booking requests follow a Gaussian distribution, and they described the size of each group by a Poisson distribution. Then, the authors considered that purchasing probability affected by the ticket price follows the logit model.
One of the shortcomings of the available research on train ticket pricing is that it rarely takes into account competition effects, which are considered in this article. Xu et al. (2012) investigated the effect of price variations in travel among air and high-speed rail transportation. They used a Stackelberg game model to study the changes in passenger flow and profits. Johnson and Nash (2012) predicted the effects of fare and service variations on travel demand and market share in the presence of competition. Although they used a simulation procedure to study the market behaviour of a long-distance international route, no pricing tool is presented. Wang, Wang, and Zhang (2016) focused on the seat allocation problem in RPRM, when there are several competing services, such as regular-speed and high-speed rail, air and car transportation. They did not consider pricing but adjusted the optimal number of seats for each cabin class in each train. They employed stochastic models, including passengers' choice behaviour and Monte Carlo simulation, for modelling and evaluating seat assignment scenarios. Zhan, Wong, and Lo (2020) proposed a mixed-integer linear programming (MILP) model for ticket pricing on high-speed trains. Their mathematical model considers corporate revenue optimization and customer social benefits simultaneously. They applied utility functions for modelling the demand of each segment of passengers and prepared a quantitative analysis for the social equity issue in the train timetabling problem.
The use of historical sales data can enhance the understanding of customer behaviour, which is vital for preparing a suitable revenue optimization model. Accordingly, a data-driven methodology is proposed in this article to optimize revenue under variable market conditions. There are a few studies on data-driven RPRM models in the literature. Jiang et al. (2015) used historical data, applying ensemble empirical mode decomposition and a grey support vector machine to predict travel demand. They recommended a dynamic model to allocate the seats on high-speed trains to several origin-destinations. A genetic algorithm (GA) was proposed to solve the mathematical model. Kaushik (2016) applied the expectation-maximization (EM) theory, which was first introduced by Talluri and Van Ryzin (2004b), to the sales data to improve the ticket pricing in RPRM, and concluded that this theory was inefficient in terms of the significant data gathering and preprocessing required. Sun et al. (2018) used purchase data to examine the high-speed train selections of passengers, and used two machine learning approaches to show the customers' choices on Chinese trains. Some authors have integrated pricing with other types of tactical decision, for example, capacity or inventory handling (Salehi et al. 2020). Yin et al. (2021) proposed a scientific model to minimize the crowdedness of stations to create the optimal passenger-oriented coordinated timetables in rail networks with time-dependent demand. They applied adaptive large neighbourhood search algorithms to solve the mathematical model and used historical data from the Beijing rail network to evaluate their method. Another similar passenger flow data-driven analysis was conducted by Mo et al. (2019). Hetrakul and Cirillo (2014) used MNL choice and latent class models to predict the purchasing time of rail passengers and employed linear functions to forecast demand. They proposed an integrated pricing and capacity allocation methodology.
Pratikto (2020) developed a general model, including demand forecasting, ticket pricing and seat allocation, for an RPRM problem. He estimated travel demand by applying hierarchical Bayes estimation, randomized first choice simulation and cubic spline interpolation methods, and solved the model by enumeration rules. He used the EMSR heuristic method to solve the model considering two passenger segments with four price classes. Yan et al. (2020) proposed a nonlinear programming model for a high-speed rail passenger network with probabilistic demand. They considered seat inventory control decisions for revenue optimization. Kankanit and Moryadee (2021) introduced a dynamic pricing scheme for Thailand's high-speed trains concerning the changes in service specifications and the time of purchasing. They applied a linear demand function to find optimal prices based on the historical data. Alamdari, Anjos, and Savard (2021) introduced machine learning methods in RPRM, presenting new heuristic feature engineering methods. They examined their studied machine learning techniques in the context of a major European railway service provider. Kamandanipour et al. (2020) introduced a data-driven RPRM methodology for combined dynamic ticket pricing and capacity allocation in single-train multi-service mode. They proposed a nonlinear stochastic model to maximize total profit considering fixed and variable costs. A simulation approach embedded in a simulated annealing metaheuristic solves the uncertain model. That methodology does not recommend any procedure for predicting demand intensity in the time horizon ahead. Furthermore, the study is not directly concerned with competition or the behaviour of competitors. Therefore, the authors were motivated to improve the earlier study to cope with demand prediction and competition.

Problem description
Revenue optimization for passenger transportation is vital because there is a constrained seat capacity to sell. The service provider's revenue may be lost if it cannot sell the seats in a limited purchasing horizon at a suitable price. The main concern of this article is revenue maximization for long-distance passenger rail transportation. Because of the high price sensitivity of rail passengers, dynamic pricing can play a vital role in stimulating travel demand, given the intensity of competition between the passenger rail service providers.
The problem refers to a rail service provider that wants to maximize its revenue on a specific route by changing the ticket prices. There are enough historical sales data to help the company to recognize the customers' behaviour. There is a limited purchasing horizon for each departure day, from the first time that customers can buy the tickets to the departure date. Setting the train ticket prices at any time of the purchasing horizon under changing market conditions is the vital concern of the service provider.
Each departure day of a train has its specific demand intensity, mainly caused by two calendar-related types of seasonal situation and the day of the week. The demand intensity for any departure day is unknown and should be forecast by the historical sales data. Also, there are several passenger rail service providers, which have competing trains on the same route. The customers' behaviour in the face of the company's ticket prices, the competitors' ticket prices and the remaining time to departure are unknown and can be quantified by the historical data.
At any point in the purchasing horizon, a certain number of seats is available for reservation, which the company tries to sell efficiently. The company has pricing preferences, including lower and upper bound prices and a maximum allowed price change between two successive purchasing days. The last item prevents excessive price fluctuation in the purchasing horizon.
A case study on a long-distance train is presented to illustrate the proposed methodology in practice. Fadak Five-Star Trains, which provides high-quality rail transportation services in Iran, operates overnight departures on long-distance routes. Each train coach consists of 10 four-berth sleeper compartments.
In this article, the Tehran-Mashhad route, between the country's capital and the most important destination for religious tourism in Iran, is studied.
Thus, the problem assumptions are as follows:
• The problem is single route and one way.
• The problem is single class.
• There are some rail transportation competitors in the market (about 20 trains for an itinerary).
• The final objective is revenue maximization for a passenger rail service provider.
• The model sets ticket prices for each day in the purchasing horizon.

Demand intensity classification model
Demand forecasting plays a vital role in a suitable pricing tool. Proper understanding of customers' behaviour in the face of different market conditions and the company's pricing policies can strengthen the RM approach. In the first step of the methodology, the demand intensity due to seasonal situations is predicted and presented by a seasonality label. Such a demand intensity index depends only on calendar-related events for the departure day, not on the company's pricing and sales policies. In the investigated case, the Persian (or Iranian) calendar is the country's official calendar, which has its special events. In addition, the studied route is a religious (Islamic) tourism corridor. Demand is also affected by the Hijri (Islamic, Muslim or Arabic) calendar owing to the Muslims' special events and holidays. Furthermore, the day of the week of a departure significantly influences the travel demand. As a result, these three date-related factors have combined effects on the travel demand for a specific departure day. The three date-related factors (Persian calendar, Hijri calendar and day of the week) shift relative to one another and do not have fixed relative positions. Therefore, these combined effects are modelled by a classification tool to predict the seasonality of any future departure date. It is worth noting that the seasonal situations and calendar events in the Persian and Hijri calendars have different effects on the travel demand. Therefore, special Hijri calendar events are classified into some classes (e.g. types 1, 2 and 3), and special Persian calendar events are divided into some other classes (e.g. types 1 and 2). It is assumed that the demand intensity for a type 1 departure date is stronger than for a type 2 date. In the same way, demand for type 2 departures is greater than for type 3 days.
It should be noted that if a departure day is a holiday or has a specific calendar event, travel demand for a few days (usually 1-2 days) before and after that special day will also be affected. Usually, the demand intensity for these near preceding days is stronger than for that specific day. On the other hand, demand for the near succeeding days can be weakened, and even lower than a regular day, while the demand for the return journey on these succeeding days may be enhanced. Consequently, the identified factors (departure day attributes) that influence seasonal and date-related demand are as follows:
1. The day of the week (Week_Day).
2. The remaining days to the nearest 'Hijri calendar event type 1' (Rem_H1).
3. The remaining days to the nearest 'Hijri calendar event type 2' (Rem_H2).
4. The remaining days to the nearest 'Hijri calendar event type 3' (Rem_H3).
5. The remaining days to the nearest 'Persian calendar event type 1' (Rem_P1).
6. The remaining days to the nearest 'Persian calendar event type 2' (Rem_P2).
7. The days passed from the nearest 'Hijri calendar event type 1' (Pas_H1).
8. The days passed from the nearest 'Hijri calendar event type 2' (Pas_H2).
9. The days passed from the nearest 'Hijri calendar event type 3' (Pas_H3).
10. The days passed from the nearest 'Persian calendar event type 1' (Pas_P1).
11. The days passed from the nearest 'Persian calendar event type 2' (Pas_P2).
For example, 2021-05-13, which is equivalent to 1442-10-01 in the Hijri calendar, is Eid al-Fitr (an Islamic holiday labelled as type 1), and travel traffic to the religious destinations (e.g. Mashhad) increases for about 1-3 days before that event. However, the travel demand for the exact day of Eid al-Fitr is not very significant, and in the next few days, the demand becomes much weaker.
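The 11 departure-day attributes listed above can be computed mechanically once the special events are available as dates. The following is a minimal sketch, not the authors' implementation. It assumes that the events have already been converted to Gregorian dates and grouped under hypothetical keys (H1, H2, H3, P1, P2), that weekdays are encoded with Monday as 0, and that a missing event is marked by None; none of these conventions is specified in the article.

```python
from datetime import date

def seasonality_features(departure, events):
    """Return the 11 departure-day attributes for one departure date.

    `events` maps a hypothetical event-type key ("H1", "H2", "H3", "P1", "P2")
    to a list of Gregorian dates on which an event of that type falls.
    """
    features = {"Week_Day": departure.weekday()}  # assumed encoding: Monday = 0
    for key, dates in events.items():
        upcoming = [(d - departure).days for d in dates if d >= departure]
        elapsed = [(departure - d).days for d in dates if d <= departure]
        # None marks "no such event in the supplied list" (an assumption)
        features["Rem_" + key] = min(upcoming) if upcoming else None
        features["Pas_" + key] = min(elapsed) if elapsed else None
    return features

# Only the Eid al-Fitr date comes from the text; the other lists are left empty.
events = {
    "H1": [date(2021, 5, 13)],
    "H2": [], "H3": [], "P1": [], "P2": [],
}
f = seasonality_features(date(2021, 5, 11), events)
print(f["Week_Day"], f["Rem_H1"], f["Pas_H1"])
```

For a departure two days before Eid al-Fitr, Rem_H1 is 2, which corresponds to the pre-event period in which the text reports increased traffic.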
In addition, the day of the week and events in the Persian calendar may reinforce or weaken the travel demand intensity. For that reason, the combined effects of these factors should be considered. A classification method for forecasting the future demand conditions is proposed using the identified factors as inputs. In addition, the passenger load factor (i.e. the percentage of available seats filled with passengers) in each origin-destination is used as the output for a demand prediction model. It is recommended that the entire market's daily travel statistics on that route are used as the target parameter of the prediction model to reduce dependence on the company's pricing, marketing or managerial policies. However, if there is a lack of sufficient data, the company's historical sales data may be helpful. In this study, passenger traffic data are gathered from governmental institutions, and the daily load factor is calculated to illustrate the seasonality conditions. In the next step, each range of load factors is mapped to a seasonality class, according to Table 1. Note that revenue and price units in this article are represented in Iranian rials (IRR).
A machine learning tool is used for creating a demand intensity model for seasonality factors. Specifically, a multi-layer perceptron (MLP) artificial neural network (ANN) model, referred to as MLP-ANN, is prepared. An ANN is a black-box tool for building complex models of nonlinear relationships (Fan et al. 2016). The MLP is a class of feedforward ANNs (Fan, Chang, and Lin 2021) that includes a set of connected nodes that map the inputs to the outputs, where the output of each node is a weighted summation of its inputs adjusted by a simple activation function (Gardner and Dorling 1998). The influential seasonality factors for each day in the learning data set (i.e. market database) are the input features. The total load factor of the market for that day is the response, which is then mapped to the corresponding seasonality class. Figure 1 shows a schematic view of the MLP-ANN model proposed for seasonality forecasting.

Price elasticity
For setting prices in different situations, the effect of the ticket price on sales should be evaluated by a mathematical function. The following four factors mainly affect price sensitivity for rail passengers in any situation in the purchasing horizon:
1. Demand seasonality factor for the time of departure. Demand seasonality situations, which are defined in Section 4.1, affect price sensitivity. Market analysis shows that, in low season conditions, the customers are more price sensitive than in travel peaks. Seasonal situations are classified into four classes [i.e. low (L), medium (M), high (H) and very high (VH)], as shown in Table 1, to investigate the impacts of seasonality.
2. Time from reservation to departure. The price sensitivity of the passengers who reserve seats in the earlier days before the departure differs from that of those who reserve later. To analyse the influence of remaining time to departure, the purchasing horizon is categorized into four divisions (1-4), as shown in Table 2.
3. Price set by the company at the time of purchase. Passengers tend to purchase a particular ticket with a cheaper fare. Therefore, measuring the price sensitivity according to the other mentioned influencing factors is imperative.
4. Competitors' prices shown at the time of purchase. Owing to widespread sales systems (e.g. online platforms), the passenger rail service providers' prices are visible to the customers. The passengers may purchase tickets from the other rail companies when they reduce their prices. Thus, higher prices cannot lead to higher sales in competitive conditions.
The price elasticity is estimated from the company's historical sales data. First, purchasing conditions are grouped into classes formed by pairing each demand seasonality class (i.e. L, M, H and VH) with each remaining time to departure class (i.e. 1-4). Then, a price elasticity function is estimated for each of the 16 pairs (segments). In this regard, a simple polynomial linear regression model is used for each segment, which quantifies the effect of the two other factors on the company's sales, namely the price set by the company and the competitors' average price. Table 3 shows a sample price elasticity quantification by polynomial linear regression models. In this table, the demand functions are in the form $d = \alpha p + \beta\, CP + \gamma$, where d refers to the forecast demand when the company selects price p, and the competitors set price CP on average.
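To make the per-segment estimation concrete, the sketch below fits a demand function that is linear in the company's price and the competitors' average price by ordinary least squares for a single segment. It is an illustration under assumptions rather than the article's procedure: the estimation method is assumed to be plain least squares, the function name fit_demand_function is hypothetical, and the five records are synthetic values in arbitrary units that do not come from Table 3.

```python
def fit_demand_function(records):
    """Least-squares fit of d = alpha*p + beta*cp + gamma for one segment.

    `records` is a list of (p, cp, d) tuples. The 3x3 normal equations are
    solved by Gauss-Jordan elimination with partial pivoting.
    """
    rows = [(p, cp, 1.0) for p, cp, _ in records]
    targets = [d for _, _, d in records]
    # Augmented normal-equation matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    alpha, beta, gamma = (a[i][3] / a[i][i] for i in range(3))
    return alpha, beta, gamma

# Synthetic records generated from d = -3*p + 2*cp + 40 (hypothetical values)
records = [(8, 10, 36), (9, 9, 31), (10, 11, 32), (11, 10, 27), (12, 12, 28)]
alpha, beta, gamma = fit_demand_function(records)
```

In the methodology described above, one such fit would be repeated for each of the 16 segments, giving one coefficient triple per combination of seasonality class and time-to-departure class.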

Revenue optimization
After estimating the price elasticity of demand, a revenue optimization model can be structured. First, the demand intensity class of the departure day is determined (see Section 4.1), and then, the four corresponding price elasticity functions are specified (see Section 4.2). For each departure day T, a revenue optimization model must be solved separately. The indices, parameters and variables are defined as follows.
Indices:
• T is the index for departure days and also refers to a purchasing time horizon
• t is the index for the number of days remaining to departure, where $t \in \{0, 1, 2, \ldots, T-1\}$
Parameters:
• CA(T) is the total seat capacity available for departure day T
• $\alpha_{tc(t),sc(T)}$, $\beta_{tc(t),sc(T)}$ and $\gamma_{tc(t),sc(T)}$ are the coefficients of the polynomial linear regression model related to each seasonality class sc(T) in each purchasing period tc(t)
• sc(T) is the seasonality situation of departure day T presented by some classes (see Table 1)
• tc(t) is the remaining time from reservation to departure presented by some classes (see Table 2)
• $P_{lb}$ and $P_{ub}$ are the price lower and upper bounds specified by the company
• MC is the maximum price change allowed between two successive purchasing days
• CP(t) is the average price set by the leading competitors (the other rail service providers' trains) on purchasing day t for departure day T
Variables:
• revenue(T) is the total revenue for departure day T
• p(t) is the price set for departure day T to be proposed to the customers on purchasing day t
• d(t) is the predicted demand for departure day T on purchasing day t under proposed price p(t) and average competitors' price CP(t)
• s(t) is the estimated sales for departure day T on purchasing day t
A nonlinear integer programming (NLIP) model is proposed to optimize the train revenue in a finite purchasing horizon:

$$d(t) = \left\lfloor \alpha_{tc(t),sc(T)}\, p(t) + \beta_{tc(t),sc(T)}\, CP(t) + \gamma_{tc(t),sc(T)} \right\rfloor \tag{2}$$
The objective function (1) refers to revenue maximization for departure day T, which is simply the sum over the purchasing days of the product of the proposed price and predicted sales. Constraint (2) predicts the demand on purchasing day t by a polynomial linear regression function, which is rounded down to the nearest integer number. Constraint (3) calculates the sales on purchasing day t from the predicted demand and the remaining capacity. Constraint (4) guarantees that the price fluctuates within its allowed range, specified by the sales manager. Constraint (5) protects the model from extreme price changes, ensuring that two successive prices do not differ by more than MC (a parameter specified by the company). Constraint (6) ensures that the decision variables are integer numbers.
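For reference, one way to write the complete model that is consistent with the verbal description of (1)-(6) is sketched below. This is a reconstruction from the definitions in this section and should be read as an assumption, not as the authors' exact formulation. In particular, the form of the remaining capacity in (3), which relies on earlier purchasing days having larger t, and the use of an absolute value in (5) are inferred from the text.

```latex
\begin{align}
\max\ & \mathit{revenue}(T) = \sum_{t=0}^{T-1} p(t)\, s(t) \tag{1}\\
\text{s.t. } & d(t) = \left\lfloor \alpha_{tc(t),sc(T)}\, p(t) + \beta_{tc(t),sc(T)}\, CP(t) + \gamma_{tc(t),sc(T)} \right\rfloor, && \forall t \tag{2}\\
& s(t) = \min\Bigl\{ d(t),\; CA(T) - \sum_{k=t+1}^{T-1} s(k) \Bigr\}, && \forall t \tag{3}\\
& P_{lb} \le p(t) \le P_{ub}, && \forall t \tag{4}\\
& \lvert p(t) - p(t+1) \rvert \le MC, && t = 0, \ldots, T-2 \tag{5}\\
& p(t),\ d(t),\ s(t) \in \mathbb{Z}, && \forall t \tag{6}
\end{align}
```

The nonlinearity comes from the product p(t) s(t) in (1), together with the floor in (2) and the minimum in (3).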
The proposed model belongs to the NLIP class and can be solved by many optimization software programs in a reasonable time.
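The logic of the model can be illustrated on a toy instance with an exhaustive search over a coarse price grid. The sketch below is not the solver used in the study. It assumes that the list index 0 is the earliest purchasing day, that negative predicted demand is clamped to zero, and that prices are restricted to a small hypothetical grid; all numbers are synthetic and in arbitrary units. Exhaustive enumeration is only practical for very short horizons.

```python
import math
from itertools import product

def simulate_revenue(prices, comp_prices, coeffs, capacity):
    """Revenue of one price path; index 0 is the earliest purchasing day.

    coeffs[i] = (alpha, beta, gamma) of the demand function active on day i.
    """
    remaining, revenue = capacity, 0
    for p, cp, (alpha, beta, gamma) in zip(prices, comp_prices, coeffs):
        # Demand is rounded down; clamping at zero is an added assumption
        demand = max(0, math.floor(alpha * p + beta * cp + gamma))
        sales = min(demand, remaining)  # cannot sell more than the seats left
        revenue += p * sales
        remaining -= sales
    return revenue

def best_price_path(grid, comp_prices, coeffs, capacity, max_change):
    """Enumerate all price paths on `grid` whose successive prices differ
    by at most max_change, and return (best revenue, best path)."""
    best = (-1, None)
    for path in product(grid, repeat=len(comp_prices)):
        if all(abs(a - b) <= max_change for a, b in zip(path, path[1:])):
            revenue = simulate_revenue(path, comp_prices, coeffs, capacity)
            best = max(best, (revenue, path))
    return best

# Hypothetical three-day horizon with one demand function for all days
grid = [8, 10, 12]                 # allowed prices (lower and upper bound 8 and 12)
comp_prices = [10, 10, 9]          # competitors' average price per day
coeffs = [(-3, 2, 40)] * 3         # synthetic (alpha, beta, gamma)
best_revenue, best_path = best_price_path(grid, comp_prices, coeffs,
                                          capacity=80, max_change=2)
```

In this toy instance the search lowers the price on the last day, when the competitors' price drops, which mirrors the kind of trade-off between price and sold seats that the model is designed to resolve.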

Results
In this section, a real case is studied to illustrate and evaluate the proposed methodology. Fadak Five-Star Trains provides Iranian rail passenger transportation services. The Tehran-Mashhad route, which has the most passenger traffic in Iran, is selected for the study. The economy class, which accounts for about half of the company's capacity, is chosen because of its price-sensitive market. The company allocates about eight coaches with 320 seats daily to economy class for this origin-destination. Subject to the constraints, the aim is to maximize the train's revenue for a specific service class and origin-destination pair. Note that some problem parameters have been changed proportionally to preserve the confidentiality of the company's information.

Demand intensity classification
To forecast the demand intensity concerning seasonal and calendar events, an MLP-ANN is proposed. This is a standard machine learning tool. The 11 influencing factors (listed in Section 4.1) are related to three departure date attributes: day of the week, Persian calendar events (the official calendar of Iran) and Hijri calendar events (the Islamic calendar).
A learning data set containing about 1860 records of departure days, covering about 5 years, is used to analyse the mixed effect of the 11 factors. The MLP-ANN structure for this problem has one hidden layer between the input and output layers (see Figure 1). The training function in the hidden layer is Bayesian regularization backpropagation, the layer initialization function is Nguyen-Widrow, and the transfer functions for the hidden and output layers are hyperbolic tangent sigmoid (tansig) and linear (purelin), respectively. The number of neurons in the network is 30. The architecture parameters are determined by trial and error. Fifteen per cent of the total data is randomly selected as the test data. The ANN is modelled in MATLAB® 2016 software.
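The forward computation of a network with this shape can be sketched as follows. The sketch assumes that all 30 neurons sit in the single hidden layer and that there is one linear output neuron for the load factor; the text does not state either explicitly. The weights are random placeholders, so the code shows only the structure of the mapping, not the trained model, and it does not implement Bayesian regularization or Nguyen-Widrow initialization.

```python
import math
import random

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden tanh layer followed by a single linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

random.seed(0)
n_inputs, n_hidden = 11, 30  # 11 calendar features; 30 hidden neurons (assumed)
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_inputs)]
            for _ in range(n_hidden)]
b_hidden = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.0

x = [0.0] * n_inputs  # placeholder feature vector for one departure day
load_factor_estimate = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

In the workflow described in the text, the predicted load factor would then be mapped to a seasonality class through the ranges in Table 1.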
The results of the first step in the demand intensity classification model (i.e. forecasting the total market load factor) are as follows. Figure 2 demonstrates the normalized mean squared error (MSE) for training and testing data. The plot shows that the best training performance is about 0.0026, and this value is about 0.0080 for the test data. Figure 3 depicts the neural network training and testing regression plot. The results show that the correlation coefficients between the outputs predicted by the MLP model and targets (i.e. actual responses) for training data, test data and all data are 94.76, 89.69 and 93.90%, respectively.
The other results show that the mean absolute percentage error (MAPE) for the training and test data is about 0.04 and 0.01, respectively. Therefore, the numerical tests show that the MLP-ANN performs appropriately in predicting the total load factor. However, precise estimation of the load factor is not needed; determining the demand intensity class corresponding to Table 1 (L to VH) is adequate.
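For readers who wish to reproduce this kind of evaluation, the three measures reported above (MSE, correlation coefficient and MAPE) can be computed as in the sketch below. It uses the textbook definitions and returns MAPE as a fraction; whether the article applies exactly these conventions, including its normalization of the MSE, is an assumption. The sample values are invented for illustration.

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.04 means 4%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def pearson_r(actual, predicted):
    """Pearson correlation coefficient between targets and predictions."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Invented load factors (targets) and predictions
actual = [1.0, 2.0]
predicted = [1.1, 1.8]
print(mse(actual, predicted), mape(actual, predicted), pearson_r(actual, predicted))
```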

Price elasticity functions
According to Section 4.2, mathematical functions are needed to evaluate the effect of price changes on travel demand. The functions are obtained using 2 years' sales data for the company. Each record has four attributes: demand intensity class for the departure date (according to Table 1), time from purchasing to departure class (according to Table 2), the company's ticket price, and the average price set by the competitors at that purchasing time for the corresponding departure date. The competitors' prices are gathered continuously from online ticket sales platforms. A simple polynomial linear regression model is set for each combination of demand intensity class (L, M, H and VH) and remaining time to departure class (1-4). Each regression model is a function of the company's price and the average of the competitors' prices. An alternative would be to calculate a weighted average from competitors' ticket prices based on the similarity of services and competitive forces. The regression models set for price elasticity are shown in Table 3, and the statistical measures for the levels of goodness of fit are presented in Table 4. D, P and CP in Table 3 are the predicted demand, company's price and the competitors' price set for a specific departure date in a specified purchasing time, respectively. As shown in Table 3, the price set by the company has a direct negative effect on the company's demand; however, the competitors' price has the reverse effect. The intensity of these impacts is different in various situations. For example, when the purchasing time is near the departure (column 1), the price sensitivity generally decreases if the travel seasonality demand increases (from top to bottom in column 1). However, in the other columns, the difference between rows is not marked. Note that the number of records from low-season days is low, and it is difficult to interpret the results.
The very low coefficient of CP in the box related to the VH demand intensity and time to departure class 1 shows that the demand in this situation is almost independent of the competitors' price because of the shortage of free capacity in the market.
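The fitting of one such function can be sketched as follows. The snippet assumes a first-order form D = b0 + b1 P + b2 CP for a single combination of demand class and time-to-departure class; the actual functional forms are those of Table 3, and the records below are synthetic, generated from known coefficients only to show that least squares recovers them.

```python
import numpy as np

def fit_elasticity(price, comp_price, demand):
    """Least-squares fit of D = b0 + b1*P + b2*CP for one (class, time) cell."""
    design = np.column_stack([np.ones_like(price), price, comp_price])
    coef, *_ = np.linalg.lstsq(design, demand, rcond=None)
    return coef  # intercept, own-price coefficient, competitor-price coefficient

# Synthetic records (hypothetical, not the company's sales data).
price = np.array([90.0, 100.0, 110.0, 120.0, 95.0, 105.0])
comp_price = np.array([100.0, 95.0, 105.0, 110.0, 115.0, 90.0])
demand = 40.0 - 0.30 * price + 0.10 * comp_price  # assumed "true" relationship

b0, b_p, b_cp = fit_elasticity(price, comp_price, demand)
```

In this sketch the own-price coefficient `b_p` is negative and the competitor-price coefficient `b_cp` is positive, mirroring the signs described for Table 3. In practice one such fit would be repeated for each of the 16 combinations of demand class and time-to-departure class.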

Revenue optimization
To evaluate the proposed RM methodology, the optimization module is run for a numerical example under special conditions, and then a sensitivity analysis is carried out. Lastly, some special departure dates are selected to evaluate the results of the proposed method. The optimization model is solved with LINGO 14, an optimization software program from Lindo Systems.

Numerical example
The optimization module must run periodically (e.g. daily) to handle new dynamic conditions. This numerical example assumes that one is at the start of a purchasing horizon, and there are 31 days left to the departure day (T = 31). Therefore, t = 30 refers to the current day and t = 0 refers to the departure day.
As presented in Figure 4, to compete with the decreasing prices of the competitors, the company must reduce its price over the time horizon, although this may not be desirable from the marketing point of view. In total, 300 seats are expected to be sold, earning a revenue of 3.02 × 10^8 IRR. In the last days before departure, sales grow because of customers' purchasing behaviour, which is clearly visible in the figure. Indeed, to increase the potential revenue, the model adjusts the prices in the last days before departure to sell more tickets.
The optimization model is applied in the very high season and the low season conditions to analyse its behaviour when faced with various situations. The results are presented in Section 1 of the Supplementary material.
In addition, as a practical evaluation, the outputs of the optimization model are compared with the actual company's sales for several cases in Section 2 of the Supplementary material. After statistical analysis, the results imply that the proposed RM methodology has excellent potential to improve the company's revenue.
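The structure of such a run can be illustrated with a small stand-in for the optimization module. The sketch below is not the NLIP model solved in LINGO: it replaces it with an exhaustive dynamic programme over a discrete price grid, and the demand function, price grid, capacity and competitors' prices are all hypothetical. Days are indexed forward from 0, unlike the countdown index t used in the text.

```python
from functools import lru_cache

def linear_demand(price, comp_price):
    """Hypothetical elasticity function: integer seats requested in one day."""
    return max(0, round(20 - 0.2 * price + 0.1 * comp_price))

def replay(plan, capacity, comp_prices, demand_fn):
    """Revenue earned when the given daily prices are applied in order."""
    seats_left, revenue = capacity, 0.0
    for price, comp_price in zip(plan, comp_prices):
        sold = min(seats_left, demand_fn(price, comp_price))
        seats_left -= sold
        revenue += price * sold
    return revenue

def optimize_prices(capacity, comp_prices, price_grid, demand_fn):
    """Return (max revenue, daily price plan) under the seat capacity limit."""
    horizon = len(comp_prices)

    @lru_cache(maxsize=None)
    def best(day, seats_left):
        if day == horizon:
            return 0.0, ()
        best_revenue, best_plan = -1.0, ()
        for price in price_grid:
            # Sales on a day cannot exceed the seats still available.
            sold = min(seats_left, demand_fn(price, comp_prices[day]))
            future_revenue, future_plan = best(day + 1, seats_left - sold)
            revenue = price * sold + future_revenue
            if revenue > best_revenue:
                best_revenue, best_plan = revenue, (price,) + future_plan
        return best_revenue, best_plan

    return best(0, capacity)

# Hypothetical three-day horizon with falling competitor prices.
capacity = 30
comp_prices = (100, 90, 80)
price_grid = (60, 70, 80, 90, 100)
revenue, plan = optimize_prices(capacity, comp_prices, price_grid, linear_demand)
```

Rerunning `optimize_prices` each day with the updated remaining capacity and the latest competitors' prices corresponds to the periodic execution described at the start of this example.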

Sensitivity analysis
Some examples for the sensitivity analysis are prepared to investigate how some parameters, such as remaining days to departure and free capacity, affect prices. Different time horizons and available seats are compared to conduct such an analysis. Table 5 illustrates the sensitivity analysis for three scenarios, and shows that in a similar purchasing time horizon (i.e. Cases 1 and 2), when the remaining capacity is high (i.e. 120 seats in Case 1 vs 50 seats in Case 2), the model adjusts the prices to lower values. Such pricing is consistent with the company's sales strategies to avoid excess unsold capacity. With a similar remaining capacity (i.e. Cases 2 and 3), the model tends to lower prices when there is less time to departure, which is compatible with the last-minute sales strategy.
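The capacity effect can be reproduced in a stylized single-period setting. The sketch assumes a linear demand D = a - bP with hypothetical parameters a and b and a continuous price, which is a simplification of the multi-period model; only the seat counts (120 and 50) echo Cases 1 and 2. Revenue is P times min(capacity, D), so the best price is the larger of the unconstrained optimum a/(2b) and the price (a - capacity)/b at which demand just fills the remaining seats.

```python
def capacity_constrained_price(a, b, capacity):
    """Revenue-maximizing price for demand a - b*P with a seat limit.

    Assumes a > 0, b > 0 and capacity <= a (stylized, hypothetical model).
    """
    unconstrained = a / (2 * b)      # maximizes P * (a - b*P)
    sell_out = (a - capacity) / b    # price at which demand equals capacity
    return max(unconstrained, sell_out)

# Hypothetical demand parameters; seat counts follow Cases 1 and 2.
price_120 = capacity_constrained_price(200.0, 1.0, 120)
price_50 = capacity_constrained_price(200.0, 1.0, 50)
```

With these assumed parameters the price for 120 remaining seats is lower than for 50 remaining seats, in line with the direction reported in Table 5. The time-to-departure effect of Case 3 is not captured by this single-period simplification.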

Conclusion
Because of the high price sensitivity of rail passengers, dynamic pricing can play a vital role in stimulating travel demand amid competition among passenger rail service providers. Hence, setting the prices at any time of the purchasing horizon under varying market conditions was the main objective of the presented RM problem. A case study was presented to illustrate the proposed methodology more clearly. Several real cases were investigated from Fadak Five-Star Trains, a high-quality train company in Iran that serves passengers on long-distance routes.
The proposed RM methodology consists of three main steps, which use the ticket reservation data as machine learning data sets. First, an MLP-ANN model predicts the demand intensity due to seasonal situations. This model uses some influencing factors to predict the future travel demand for a given departure date. Then, for each pair of demand condition class and time to departure period, simple polynomial linear regression models are used to analyse the effect of the price set by the company and the prices set by the competitors on the company's sales (i.e. price elasticity functions). In the last step, a nonlinear integer programming (NLIP) model is proposed to maximize the total revenue in the purchasing horizon.
Running the model on data from Fadak Five-Star Trains showed that the proposed MLP-ANN model accurately predicts future travel demand intensity. Other statistical analyses indicated that the regression models fitted for price elasticity had acceptable levels of goodness of fit. Another study was carried out to assess the performance of the revenue optimization model compared to the actual sales (see Supplementary material). The results revealed that the proposed optimization model has great potential to improve the service provider's revenue if implemented well.
Based on the proposed methodology, various research directions can be suggested. Aligning the proposed revenue management method with capacity allocation for multi-class trains could be a fascinating practical study. Another study could analyse the effects of different pricing or sales strategies on the total revenue of a company. Moreover, the proposed dynamic ticket pricing methodology could be extended to a multi-train mode. In this study, the effects of the coronavirus disease 2019 (COVID-19) pandemic on travel demand are not reflected because ticket purchasing does not follow a steady behavioural pattern owing to lockdowns or restrictions, and therefore pre-pandemic sales data have been used.

Author contributions
All authors contributed to this research, including conceptualization, data curation, formal analysis, writing (original draft), methodology, resources, software, validation, and writing (review and editing). Professor Reza Tavakkoli-Moghaddam also had the role of project administrator.

Disclosure statement
No potential conflict of interest was reported by the authors.