Optimal Control Inputs for Fuel Economy and Emissions of a Series Hybrid Electric Vehicle

Hybrid electric vehicles offer significant fuel economy benefits, because battery and fuel can be used as complementary energy sources. This paper presents the use of dynamic programming to find the optimal blend of power sources, leading to the lowest fuel consumption and the lowest level of harmful emissions. It is found that the optimal engine behavior differs substantially from that of an on-line adaptive control system previously designed for the Lotus Evora 414E. When analyzing the trade-off between emissions and fuel consumption, CO and HC emissions show a traditional Pareto curve, whereas NOx emissions show a near-linear relationship with a high penalty. These global optimization results are not directly applicable to online control, but they can guide the design of a more efficient hybrid control system.


INTRODUCTION
Hybrid vehicles use more than one type of powertrain, in order to combine their advantages. Typically, this is an internal combustion engine paired with an electric motor and battery which provide higher efficiency and the ability to recuperate energy during braking. The drivetrain elements can be arranged in various ways to suit the application and the preferences of manufacturer and customer.
The Lotus Evora's range extender hybrid architecture consists of a battery-powered electric powertrain in which the battery can be recharged from the second power source, the internal combustion engine (ICE) generator (series hybrid). This engine can be much smaller than a typical traction engine and it is decoupled from the drivetrain, which means that it can be operated in regions of maximum efficiency. Charging the batteries through plug-in capabilities enables further fuel consumption reductions.
The control strategy of the ICE can make a significant difference to the fuel economy of the hybrid vehicle. Finding the most efficient operating point is not a trivial problem because it also depends on the state of the other system components, especially the battery and the electric motor. The optimal input also depends on the future demand for power. This paper looks at the power control strategy from the point of view of global optimization over a given driving cycle. This eliminates the challenge of predicting future demand, because by definition of the problem the full driving cycle is known in advance, which enables efficient management of the energy sources. The solution can be calculated efficiently using Dynamic Programming (DP) techniques as proposed in [1]. The globally optimal solution is not directly applicable as an online controller, because it relies on knowledge of future demand. However, it could be turned into an online algorithm using a receding horizon approach with a limited prediction, or it could provide further insights into how to design a fuel-efficient power controller using only available measurements.
The paper is structured as follows: Section II provides the background on the Lotus Series Hybrid; Section III introduces the vehicle model; Sections IV and V define the optimization problem and the solution strategy; Section VI contains the results and conclusions.

BACKGROUND
The Lotus Evora 414E is a hybrid sports car and a low carbon concept vehicle designed by Lotus Engineering. The architecture of the vehicle is shown in Figure 1. The series hybrid driveline comprises the 35kW normally aspirated Lotus Range Extender Engine [2] coupled to a permanent magnet generator. The range extender engine, generator and generator inverter form the Auxiliary Power Unit (APU). The battery pack comprises 1,792 Lithium Iron Phosphate (LiFePO4) cells configured as a 112 Series-16 Parallel pack. Propulsion is provided by two independently driven rear wheel motors. For plug-in functionality, the vehicle is equipped with a 3kW onboard charger. The on-board electronic systems are highlighted in Figure 2. There are two states in which the battery energy may be replenished. The condition where energy is returned to the battery via the range extender engine or via the on-board charger is termed "recharging". The state where energy is returned to the battery through regenerative braking is termed "recovering". Both conditions may exist simultaneously, in which case the power returning to the battery is the algebraic sum of the powers from APU charging and kinetic energy recovery.
The battery assumes the operation of a bidirectional electrical power system while the APU assumes operation of a unidirectional power delivery system. Power flow control is achieved by regulating the voltage and currents of the generator inverter as well as engine speed and torque. The power flow convention is illustrated in Figure 3. Bidirectional arrows indicate bidirectional power and current flow.
The standard vehicle employs an adaptive energy management technique to control the power delivery from the battery and the Lotus Range Extender (LRE) to the electric motor. This approach has proved beneficial in reducing fuel consumption and emissions compared to less adaptive methods [3]. The approach operates online by solving a semi-global optimization problem based on the expected drive cycle.

Previous Approach: Semi-Global
The Evora 414E currently implements a two-stage solution to minimize fuel consumption: the Static Instantaneous Optimization (SIO) performs an initial calculation, and the second phase further optimizes the ICE use through Dynamic Compensation Optimization (DCO) [2]. This method looks at the average vehicle power demand over the previous 20 seconds and calculates a minimum cost for near-future power demand.

Figure 2. Evora Driveline Schematics and Powerflow Conventions
In order to negotiate between the two energy sources, an equivalent fuel consumption is defined for the electrical energy supplied by the battery. This is used to calculate a cost function that gives the fuel-equivalent cost at each point in time in terms of the following quantities: the vehicle's electrical power consumption, the rate of change in vehicle electrical power, the battery State of Charge, the fuel cost of auxiliary power unit (APU) energy, the fuel cost of battery energy, and the battery power.
Research presented in [4] also employs the equivalent fuel consumption concept for a charge sustaining strategy. Control strategies are presented in [5] for a fuel cell and electric battery design as well as a diesel ICE and battery design, giving strong evidence that this method is robust in real world driving. This technique satisfies the need for real-time system response; however, it cannot achieve complete optimization for a given drive cycle, because the future conditions are not known. The aim of the work in this paper is to explore the losses incurred when the future is unknown.
This approach is semi-global, because it only looks at optimality at one point in time. The equivalent cost of the battery energy helps to strike the main balance between the two energy sources, but it fails to represent any other effect of the engine, such as emissions, the impact of engine start, or further system states.
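The limitation described above can be illustrated with a minimal sketch of an instantaneous equivalent-consumption decision. This is not the SIO/DCO cost function of the vehicle; the fuel map `apu_fuel_rate`, the equivalence factor `s` and all numerical values are hypothetical and chosen only for illustration.

```python
def apu_fuel_rate(p_apu_kw):
    """Hypothetical APU fuel rate in g/s (illustrative, not measured data)."""
    if p_apu_kw <= 0:
        return 0.0  # engine off
    return 0.3 + 0.065 * p_apu_kw  # fixed loss plus a linear term


def equivalent_cost(p_apu_kw, p_demand_kw, s):
    """Fuel rate plus the fuel-equivalent cost of battery power, in g/s.

    s is an assumed equivalence factor in g/s per kW of battery power.
    """
    p_bat_kw = p_demand_kw - p_apu_kw  # battery covers the remainder
    return apu_fuel_rate(p_apu_kw) + s * p_bat_kw


def instantaneous_optimum(p_demand_kw, s, p_max_kw=35):
    """Pick the APU power (full kW steps) with the lowest equivalent cost."""
    return min(range(p_max_kw + 1),
               key=lambda p: equivalent_cost(p, p_demand_kw, s))
```

In this toy model the decision depends only on the present demand and on the assumed value of `s`; engine starts, emissions and future demand do not enter the decision at all, which is the gap the global approach is intended to close.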

Proposed Global Approach
To address these issues, a global optimization approach over the full cycle is proposed. An offline solution like this is often used to find the optimal solution from a theoretical point of view, without considering the impact of limited information availability. The approach is non-causal, because the future demands of the drive cycle are assumed to be known precisely, which is not typically the case. Boundary and initial conditions provide constraints to the input and state variables, which can be dealt with using a DP algorithm. This offline approach is to be compared to the ECMS algorithm presented in [3] in the Lotus Evora 414E to assess the optimality of this controller.
The objective of the controller is to minimize the emissions and fuel consumption, i.e. to minimize the 'cost' of using the powertrain. An important aspect of such a problem is that a decision cannot be considered in isolation; we need to balance the desire for low present cost against the undesirability of high future costs. DP captures this issue directly and highlights the trade-off. The decision at each stage is made based on the sum of the present cost and the expected future costs, assuming optimal decisions in the future. The basic model of the problem is a discrete-time system with a cost function that is additive over time.
DP operates by optimizing over a fully-known driving cycle and therefore lends itself to a global fuel minimization problem [6]. This method works well with relatively large time steps (1 second or longer) and with few input variables [7] and states, because the complexity of the problem is exponential in the number of states. Research in [8] presents a MATLAB function for solving a DP problem. The method is successful when applied to nonlinear, discrete-time, constrained dynamic models, which is the form taken by a hybrid electric vehicle model.
Simulation work is conducted over the NEDC, Artemis and WLTP drive cycles to ensure that a large range of driving conditions is covered.

VEHICLE MODEL
The series hybrid architecture contains the Auxiliary Power Unit (APU), which consists of the 35kW normally aspirated LRE [9], a permanent magnet generator and an inverter. The APU delivers electrical charge to the HV batteries or, alternatively, extra power to the traction motors. With this design the vehicle delivers electric-only driving through two 152kW motors, one at each rear wheel. The engine has stop-start capabilities, and the use of this function is to be optimized during a drive cycle. Further details of the hybrid design are outlined in [3].

Engine Model
The optimal efficiency trace of the APU is shown in Figure 3, where the green line represents the optimal APU operation locus. The equivalent fuel consumption for a given APU power output is shown in Figure 4; this is similar to the Brake Specific Fuel Consumption (BSFC), but also includes the generator efficiency. The efficiency of the APU is consistently good when operating above about 20kW of electrical power.
The emissions profile is given in Figure 5. This graph is a steady state approximation, and it is only partially applicable to transient operation, because the temperature of the catalyst can make a significant difference. At low power, the temperature may be too low for it to be entirely effective, and at high power the increased fuel rate and exhaust flow rate make the catalyst less efficient. This defines a window between about 12kW and 28kW where the emissions are consistently low. In addition, there is a penalty for starting the engine (in start & stop operation), because it takes some time for the air and fuel system to settle after engine start.

Catalyst Model
A catalyst is used in the vehicle to reduce the emissions that are released into the atmosphere. The effectiveness of the catalyst varies over the power range of the engine, so the optimal setting is different for each of NOx, CO and HC, which introduces a trade-off between them. Typically NOx is produced most prominently at high temperatures when the engine is running at maximum power. At high speed, high torque combinations (an area neglected by typical drive cycles), enrichment may be used to provide engine cooling, and this increases CO emissions dramatically. If emissions are considered relevant, this area has to be avoided.

Motor Model
The total power required is calculated from the drive cycle demands, and it is split into a demand from the APU and a demand from the battery. The electricity to meet these demands is sent to the rear wheel motors, which operate at 95% efficiency.

Battery Model
The internal resistance of the battery varies according to the battery State of Charge (SOC). The voltage trace for battery depletion is taken at 1% intervals, and the corresponding internal resistance increases as SOC becomes low. Safety threshold limits applied to the battery mean it operates between 30% and 70% of maximum charge. Battery power is calculated as the difference between the total power demand and the APU power. This means that the total power demand from the drive cycle is the sum of battery and APU power delivery.
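As a minimal sketch of how such a battery model can be evaluated, the following code advances the SOC by one time step using a simple internal-resistance equivalent circuit. The equivalent-circuit form, the lookup tables, the pack capacity and the function names are assumptions for illustration; they are not the data or the model code used for the vehicle.

```python
import math


def interp(x, xs, ys):
    """Piecewise-linear interpolation with clamping at the table ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


# Hypothetical pack-level tables (illustrative values only).
SOC_PTS = [0.3, 0.5, 0.7]
V_OC = [355.0, 360.0, 365.0]  # open-circuit voltage in V
R_INT = [0.12, 0.10, 0.09]    # internal resistance in ohm, higher at low SOC
CAPACITY_AH = 40.0            # assumed pack capacity


def battery_power(p_demand_kw, p_apu_kw):
    """The battery covers whatever the APU does not supply."""
    return p_demand_kw - p_apu_kw


def soc_step(soc, p_bat_kw, dt_s=1.0):
    """Advance SOC by one time step; p_bat_kw > 0 discharges the battery."""
    v = interp(soc, SOC_PTS, V_OC)
    r = interp(soc, SOC_PTS, R_INT)
    p_w = 1000.0 * p_bat_kw
    disc = v * v - 4.0 * r * p_w
    if disc < 0:
        raise ValueError("requested power exceeds battery capability")
    # Terminal power P = v*I - r*I^2, solved for the current I.
    current = (v - math.sqrt(disc)) / (2.0 * r)
    return soc - current * dt_s / (CAPACITY_AH * 3600.0)
```

Because of the resistive loss, discharging at a given power removes slightly more charge than charging at the same power returns, so the sketch reproduces the asymmetry that makes battery use non-free in the optimization.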

OPTIMIZATION PROBLEM
The goal of this work is to solve the global optimization problem by finding the engine power profile with the lowest fuel consumption and the lowest total emissions for a given cycle. To make the problem deterministic, it is assumed that the cycle is known in advance, which would not be true in a real time application. This may be considered "cycle beating" and therefore inappropriate for regulatory purposes, but it is a useful tool to analyze the physical capabilities of the powertrain.
Although the original problem is continuous in time, cycles are typically given in discrete-time form, and therefore a discrete-time model with a finite horizon is more appropriate. Using a time step of 1 second, the model can be described as a function from one state to the next,

$$x_{k+1} = f(x_k, u_k), \quad k = 0, 1, \dots, N-1,$$

where $x_k$ contains the state variables of the system and $u_k$ the control input variables. The optimization problem is to minimize the cost function through the optimal use of the control input $u_k$. The cost function is given as

$$J = \sum_{k=0}^{N-1} L(x_k, u_k), \tag{4}$$

where $N$ represents the drive cycle duration and $L$ represents the instantaneous step-by-step cost function. All aspects of the cost are weighted sums of the fuel consumption and the emission profile:

$$L = w_{f}\, m_{f} + w_{NOx}\, m_{NOx} + w_{HC}\, m_{HC} + w_{CO}\, m_{CO},$$

where the coefficients $w_{f}$, $w_{NOx}$, $w_{HC}$, $w_{CO}$ are weighting factors, and the four cost components $m_{f}$, $m_{NOx}$, $m_{HC}$, $m_{CO}$ are the fuel consumption, Oxides of Nitrogen (NOx) emissions, Hydrocarbon (HC) emissions, and Carbon Monoxide (CO) emissions. The coefficients are essentially the specific cost of fuel and of emissions, and they are chosen so that the individual cost contributions are roughly comparable.
Approaching the problem in this way reduces computational complexity by keeping the number of states low. Only two states are used: state 1 is the battery state of charge, while state 2 denotes whether the engine was on or off in the previous time step. Only one input variable is used, which determines both whether the engine is on or off and, if it is on, what power it runs at.
In addition to the system model, constraints are applied to the state $x_k$. In order to reduce battery ageing, the battery SOC is always kept within a reasonably narrow range of approximately 40% of the complete capacity.
The step cost consists of two parts. The first part, $L_1$, captures fuel consumption and emissions as a function of the engine power. The second part, $L_2$, penalizes a change in the engine operating state from off to on, and it represents the energy and the additional emissions generated during engine start-up. A zero value, $L_2 = 0$, is used for time steps in which the engine is not started between $k$ and $k+1$.
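The structure of this step cost can be sketched as follows. The emission maps, the start-up penalties and the weights are hypothetical placeholders, not the calibrated values of the paper; only the structure (a weighted running cost plus a start penalty that applies on an off-to-on transition) follows the description above.

```python
# Assumed weighting factors for fuel, NOx, HC and CO.
WEIGHTS = {"fuel": 1.0, "nox": 50.0, "hc": 20.0, "co": 2.0}

# Assumed extra fuel and emissions in g caused by one engine start.
START_PENALTY = {"fuel": 1.0, "nox": 0.02, "hc": 0.05, "co": 0.2}


def running_components(p_kw):
    """Hypothetical per-second fuel use and emissions in g at APU power p_kw."""
    if p_kw == 0:
        return {"fuel": 0.0, "nox": 0.0, "hc": 0.0, "co": 0.0}
    return {
        "fuel": 0.3 + 0.065 * p_kw,
        "nox": 0.0002 * p_kw,
        "hc": 0.0005,
        "co": 0.002 + 0.0001 * p_kw,
    }


def weighted(components, weights=WEIGHTS):
    """Weighted sum of the four cost components."""
    return sum(weights[name] * components[name] for name in weights)


def step_cost(engine_was_on, p_kw, weights=WEIGHTS):
    """Running cost plus a start penalty that applies only on off -> on."""
    running = weighted(running_components(p_kw), weights)
    started = (not engine_was_on) and p_kw > 0
    start = weighted(START_PENALTY, weights) if started else 0.0
    return running + start
```

Keeping the previous engine state as an input to the step cost is what makes the second system state necessary: without it, the start penalty could not be evaluated from the current state and input alone.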
The final battery state of charge may contain a significant amount of stored energy. In order to create a level playing field, the final battery state is required to be at least as high as the initial battery state. The optimal solution typically leads to a battery state that is only very slightly over this limit. For improved accuracy, the battery state is translated into a cost based on the optimal engine operating point. This equivalent cost is applied as a final state cost $G(x_N)$ to the optimization problem.

Dynamic Programming
The optimization problem can be solved globally using dynamic programming. The main part of the optimization goes backwards from the final cost $G(x_N)$ to find the best input $u_{N-1}$ and the best cost-to-go $J_{N-1}$ for the transition to the last step, starting from each state $x_{N-1}$:

$$J_{N-1}(x_{N-1}) = \min_{u_{N-1}} \left[ L(x_{N-1}, u_{N-1}) + G(x_N) \right] \tag{7}$$

Here $x_N$ is the state reached from $x_{N-1}$ under the input $u_{N-1}$. Therefore, by the definition of dynamic programming, the cost function in (7) is additive over time: it is a combination of the expected cost in (4) and the final cost.
This process is repeated to find the optimal cost and the optimal input for each state at each time step. For a fixed number of states and inputs, the computational effort grows linearly with the number of time steps and polynomially with the number of grid points, which makes this an efficient method for global optimization. Once the cost map has been found, a forward simulation is performed from the given initial state to find the optimal trajectory and the optimal cost. Since the input $u_k$ calculated for each $x_k$ minimizes the right hand side of (7), the policy of control decisions from $N-1$ to $0$ is optimal for each state.
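The backward recursion and the forward simulation can be sketched on a deliberately small toy problem. The sketch below is an assumption-laden illustration rather than the dpm implementation: the battery is treated as a lossless integer energy store so that no interpolation is needed, the final cost simply forbids ending below the initial energy level, and the demand profile, input set, fuel table and start cost are all hypothetical.

```python
INF = float("inf")


def solve_dp(demand, inputs, fuel, start_cost, e0, e_max):
    """Backward DP over the states (battery energy level, engine was on)."""
    # Final cost: zero if the battery ends at or above its start level.
    cost = [[0.0 if e >= e0 else INF for _ in (0, 1)]
            for e in range(e_max + 1)]
    policy = []
    for k in range(len(demand) - 1, -1, -1):
        new_cost = [[INF, INF] for _ in range(e_max + 1)]
        best_u = [[None, None] for _ in range(e_max + 1)]
        for e in range(e_max + 1):
            for on in (0, 1):
                for u in inputs:
                    e_next = e + u - demand[k]  # lossless energy balance
                    if not 0 <= e_next <= e_max:
                        continue  # state limit violated
                    on_next = 1 if u > 0 else 0
                    step = fuel[u] + (start_cost if on_next and not on else 0.0)
                    total = step + cost[e_next][on_next]
                    if total < new_cost[e][on]:
                        new_cost[e][on] = total
                        best_u[e][on] = u
        cost = new_cost
        policy.append(best_u)
    policy.reverse()  # policy[k][e][on] is the optimal input at step k
    return cost, policy


def simulate(policy, demand, e0):
    """Forward simulation from the initial state with the engine off."""
    e, on, trajectory = e0, 0, []
    for k, d in enumerate(demand):
        u = policy[k][e][on]
        trajectory.append(u)
        e, on = e + u - d, 1 if u > 0 else 0
    return trajectory


# Toy data in arbitrary integer units (all values hypothetical).
demand = [1, 3, 2, 0, 4, 2]
inputs = [0, 2, 5, 7]
fuel = {0: 0.0, 2: 1.2, 5: 2.0, 7: 3.2}  # input 5 is the most efficient
cost, policy = solve_dp(demand, inputs, fuel, start_cost=0.5, e0=5, e_max=10)
trajectory = simulate(policy, demand, e0=5)
```

In this toy case the optimal trajectory delivers exactly the demanded energy with a single engine start, which mirrors the behavior discussed for the start penalty: the start cost discourages repeated switching even when the fuel cost alone would be the same.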

Implementation
The function dpm.m in version 1.1.2 is used to implement the optimization algorithm [10]. This function is a generic implementation of DP for discrete and continuous variables that has been successfully applied to electric vehicles [8]. In addition to this function, a problem definition and a cost function are required. The cost function contains the vehicle models and cost aspects discussed above.
The computational complexity of the implementation is moderate, because only two system states are used in the model: battery SoC and engine on/off. Overall, the model has to be evaluated at about $8 \times 10^7$ grid points, which takes just under a minute of computation time on a T4200 CPU. The model is already vectorised, which means that the numerical computations should be reasonably efficient, and overhead is much reduced compared to an iterative approach. While some further performance improvement could be expected by rewriting the function in C, it is not considered worthwhile at this point of the work.
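The quoted number of grid points can be checked with a rough count. The following estimate assumes 1001 SOC grid points and 2 engine states (as stated in the text), 36 input levels (0 to 35kW in full kW steps) and a cycle of 1082 one-second steps; the last two figures are assumptions made here for the purpose of the check and are not stated in the paper.

```latex
\underbrace{1001}_{\text{SOC}} \times \underbrace{2}_{\text{engine}}
\times \underbrace{36}_{\text{inputs}} \times \underbrace{1082}_{\text{time steps}}
= 77\,981\,904 \approx 7.8 \times 10^{7}
```

Under these assumptions the count agrees with the order of magnitude given above, and it shows that the SOC resolution is the dominant factor in the computational effort.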

State Limits
Dealing with limits in the system is a challenge for DP. The most important limits are the minimum and maximum battery SOC. Because of the influence of the drive cycle, the effective limit changes over time. For example, the upper limit is 0.7 of the total capacity, but before a regeneration phase the battery state must be lower than this to allow for the increase in charge due to regeneration. Limiting the charge (and wasting the recuperated power) is possible, but it was not considered relevant in the context of optimizing fuel economy.

State Discretization
The dpm code automatically performs discretization of the state space and the input space, and it interpolates the cost function as required. This introduces a small element of error, which can become more significant in the vicinity of limits.
The discretization creates an issue for tracking the state limits, because the physical limits may not fall on the discretized states. The DPM function offers three approaches to limits: apply a penalty cost (MyInf), which needs to be high enough to avoid violating the limits, track the boundaries exactly via interval bisection (Boundaries), or avoid any discretized states that could potentially lead to a violation (LevelSet). The boundary method is not applicable for more than one state (although it may be possible to extend it, because the limit only affects one of the states). The level set option was found to be too conservative, leading to distinctly suboptimal results unless a very high number of states are used. Therefore the appropriate option is to use the penalty cost approach. The penalty was carefully chosen to avoid limit violations without causing the controller to be too conservative. If the penalty is set too high, it can "bleed" into perfectly possible states due to the repeated cost interpolation, and if it is too low, a constraint may be violated in the final simulation.
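The "bleeding" effect of the penalty can be shown with a small numerical sketch. The grid spacing, the cost-to-go value and the two penalty values are assumptions chosen for illustration; only the upper limit of 0.7 is taken from the text.

```python
def interpolate_cost(soc, soc_lo, soc_hi, cost_lo, cost_hi):
    """Linear interpolation of the cost-to-go between two SOC grid points."""
    t = (soc - soc_lo) / (soc_hi - soc_lo)
    return cost_lo + t * (cost_hi - cost_lo)


SOC_LIMIT = 0.700                # upper SOC limit from the text
GRID_LO, GRID_HI = 0.698, 0.702  # assumed neighbouring grid points
COST_LO = 5.0                    # assumed cost-to-go at the feasible grid point


def cost_seen_by_dp(soc, penalty):
    """Cost-to-go at a feasible SOC when the upper grid point holds the penalty."""
    assert soc <= SOC_LIMIT, "the queried state must itself be feasible"
    return interpolate_cost(soc, GRID_LO, GRID_HI, COST_LO, penalty)


moderate = cost_seen_by_dp(0.699, penalty=100.0)  # mildly inflated cost
extreme = cost_seen_by_dp(0.699, penalty=1.0e6)   # feasible state looks forbidden
```

The state at 0.699 is physically allowed, yet a quarter of the penalty at the neighbouring infeasible grid point is mixed into its interpolated cost. With a very large penalty the optimizer will therefore avoid this state, and the effect propagates further into the feasible region with each backward step.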
The sensitivity of the control input and the resulting cost to the grid size was determined experimentally, and it was concluded that a reasonably high number of grid points for the battery SOC is required for an accurate solution. Therefore 1001 states are used for this dimension. Higher values had little effect on the solution, but increased the computation time beyond reasonable limits, while lower numbers of states led to distinctly inferior solutions.

Input Discretization
The input vector also requires discretization before the optimum solution can be selected. Since only limited experimental data was available, the same steps (full kW) are used here.
The optimal input can be interpolated between the neighboring states, but this did not lead to acceptable results. The reason is the non-convex BSFC curve (see Figure 2), which leads to distinct minima for different circumstances. An interpolating controller may use an input between two minima, which means that it could hit a local maximum.
In order to avoid the input interpolation, the input was declared as a discrete variable. This means only full kW steps can be commanded. The available inputs are aligned with the experimental data.
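The problem with input interpolation can be reproduced with a toy cost curve. The curve `specific_cost` below is a hypothetical non-convex function with two minima, and the two neighbouring states and their optimal inputs are invented for the example; it is not the measured BSFC map.

```python
def specific_cost(p_kw):
    """Hypothetical non-convex cost curve with minima at 5 kW and 26 kW."""
    low_power_branch = (p_kw - 5) ** 2 / 50 + 1.2
    high_power_branch = (p_kw - 26) ** 2 / 100 + 1.0
    return min(low_power_branch, high_power_branch)


# Assumed optimal inputs stored at two neighbouring grid states.
u_state_a = 5.0   # optimum at grid state A
u_state_b = 26.0  # optimum at grid state B

# A state halfway between A and B receives the interpolated input.
u_interpolated = 0.5 * (u_state_a + u_state_b)

cost_a = specific_cost(u_state_a)
cost_b = specific_cost(u_state_b)
cost_interpolated = specific_cost(u_interpolated)
```

The interpolated input lies between the two minima and is more expensive than either of them. Declaring the input as a discrete variable avoids this, because the controller then selects one of the stored optima instead of a blend.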

Optimal APU Usage
The use of the APU is compared for a strategy to minimize fuel and a second strategy to minimize emissions. Fig. 3 shows the comparison of the two strategies for optimizing the APU usage to minimize fuel (red) and NOx (green). For the minimal NOx strategy the engine does not turn off once started, which is due to the penalty cost associated with engine start-up. More engine start/stop events occur in the minimal fuel strategy, which shows that starting the engine does not penalize the fuel cost as much.
Optimizing for minimal NOx shows a dramatic decrease, from 1.532g to 0.323g (see Table 1). In terms of meeting European emissions regulations [11] the total NOx emissions must be below 1.036g for this particular Artemis Road drive cycle. The solution employed here is to find the necessary trade-off in the cost function to reduce the total NOx output just below this limit. The resulting solution then provides the best fuel consumption satisfying the emission limits. This approach avoids the computational complexity of formulating NOx emissions as an optimization limit, because that would require a further state in the system to track emissions.
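The cycle-total limit can be related to the per-kilometre limit with a short calculation. It uses the 1.036g total quoted above and the legislated NOx limit of 60mg/km; the resulting cycle distance is inferred here and is not stated in the paper.

```latex
d_{\text{cycle}} \approx \frac{1.036\ \text{g}}{0.060\ \text{g/km}} \approx 17.3\ \text{km}
\qquad\Longleftrightarrow\qquad
0.060\ \text{g/km} \times 17.3\ \text{km} \approx 1.04\ \text{g}
```

On the same basis, the two totals of 1.532g and 0.323g correspond to roughly 89mg/km and 19mg/km respectively, which indicates how far each strategy sits from the limit.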
The two strategies favor different power levels for running over long periods of time. The red line for fuel minimization runs at 26kW for the majority of the drive cycle. This can be attributed to the lowest BSFC of 250g/kWh occurring at this power level (see Fig. 2). The peaks and troughs up to maximum and down to minimum APU power ensure that the battery SOC remains within feasible limits and returns to the starting value at the end of the cycle.
For the NOx minimization strategy the APU frequently runs at 28kW, which corresponds to the region of the catalyst model that delivers the maximum conversion rate. In this strategy the APU runs at minimum power rather than switching off, due to the NOx penalty associated with an engine off/on event. Incidentally, the fuel minimization strategy opts to run at 5kW instead of 4kW due to the improved efficiency.
The results obtained from this global approach differ from the simulation work for an online solution shown in [2]. Comparing the APU profile to the online controller in the Lotus Evora 414E, far fewer operating points are seen for the global solution. Generally the engine works at four main points for fuel minimization: off (0kW), minimum (4kW), maximum (35kW) and the most efficient setting (26kW); whereas the adaptive online solution shown in Fig. 4 sees much more variation in operating points. Due to the nature of the BSFC curve (Fig. 2) there are small benefits to using maximum power and, similarly, to using minimum power instead of switching off. This optimal behavior can be introduced into an online control management system.
The SOC of the battery takes different paths for the two strategies. More continuous electric driving is apparent in the NOx minimization strategy, which requires a large amount of charging after 400sec into the cycle. This characteristic is the cause of the fuel consumption penalty. The battery is slightly more efficient at higher voltage (higher state of charge); therefore the trace for fuel minimization operates the battery at consistently higher charge levels.
Weighting is applied at increasing intervals, generating a range of optimal solutions within this multi-criteria optimization problem. Fig. 9 shows the trade-off for NOx and fuel, where a linear relationship is apparent in the central part of the graph. Therefore reducing NOx comes at a high cost for fuel. European legislation requires that the level of NOx is below 60mg/km, and this is always satisfied. The NOx Pareto curve is surprisingly straight, which indicates that there are essentially only two solutions, and these are blended in different ratios.
Fig. 10 shows the relationship of CO emissions and fuel consumption. This graph shows a more traditional relationship according to the law of diminishing returns. Initially, the fuel penalty for reducing CO emissions is small; but as the emissions reach the lower limit, the fuel penalty increases significantly. The legislation limits CO to 1000mg/km [11], which the algorithm can achieve, with good scope to reach a level below 600mg/km if desired.
The Pareto curve in Fig. 11 again shows a strongly curved relationship, this time for HC emissions. The first four data points show potential to reduce HC at low fuel cost; thereafter the penalty increases for a smaller reduction in emissions. All simulations in this set of results achieve the legislated limit of 100mg/km [11], so it would be desirable to select a solution that favors minimal fuel consumption.
It is interesting to compare the behaviors of the three emissions: the absence of diminishing returns for the NOx trade-off clearly stands out. The linear relationship indicates that the solution is moving from one strategy to another without any better intermediate solution.
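The weighted-sum sweep used to generate these trade-off curves can be sketched as follows. In the paper each point results from a full DP run with a different weight; in this simplified sketch the candidate solutions are a fixed, hypothetical list of (fuel, NOx) totals, and the weights are arbitrary.

```python
def weighted_sum_sweep(candidates, weights):
    """For each NOx weight, keep the candidate minimizing fuel + w * nox."""
    picks = []
    for w in weights:
        best = min(candidates, key=lambda c: c[0] + w * c[1])
        if best not in picks:
            picks.append(best)
    return picks


# Hypothetical cycle totals (fuel in g, NOx in g) of candidate strategies.
candidates = [
    (900.0, 1.53),   # lowest fuel
    (930.0, 1.20),   # not dominated, but above the line joining its neighbours
    (960.0, 0.80),
    (1010.0, 0.32),  # lowest NOx
    (1000.0, 1.00),  # dominated by (960.0, 0.80)
]
front = weighted_sum_sweep(candidates, [0, 20, 50, 80, 100, 200, 500])
```

A weighted-sum sweep only returns solutions on the convex part of the trade-off curve, so a dominated candidate is never selected and a candidate lying above the line between its neighbours is skipped as well. When the achievable solutions lie close to a straight line, as reported for NOx, small changes in the weight move the optimum from one end of the line towards the other.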
Further optimization work could include different engine calibrations; for example, changes to spark timing or lambda control could bring further emission benefits with a moderate fuel economy penalty. For the energy management system it is desirable to apply a penalty cost to each of the four given variables: fuel, NOx, CO and HC. This allows a compromise across the board that meets emission legislation with acceptable fuel consumption. The design of the cost weights will depend on the relative desire to minimize each variable. An example that meets the European emission legislation, whilst also maintaining the fuel consumption close to its theoretical global minimum, is shown in Table 2. These combined weights are shown in Figs. 9-11.

Causal Implementation
As stated above, the globally optimal controller is specific to a certain drive cycle. This means it assumes knowledge of the future, which makes the control non-causal and therefore not directly suitable for real time applications.
In an experiment to investigate the validity of a causal controller, the optimal APU power at each second was calculated with dynamic programming for a large database of measured and synthesised European drive cycles [12,13]. From the simulation output, a dataset of vehicle speed, acceleration, power demand, battery SoC and distance from the end of the cycle was recorded and fed into the Weka open-source machine learning software [14]. An ensemble of J48 decision trees was trained to classify the optimal APU power to the nearest kilowatt, using the AdaBoost M1 algorithm with ten folds for cross-validation.
When distance from the end of each cycle was included in the training data, the optimal APU power was determined by the classifier with a precision of 94.3%. When distance-to-go was neglected from training data, classification precision reduced to 87.6%. Classifier performance implies that near optimal control behaviour could be generalised with machine intelligence for use in a causal energy management system.
The results are shown in Figures 12 to 15 for different cycles: blue is the non-causal optimal controller (dpm), and red is the causal suboptimal controller (weka).

OUTLOOK
A successful global optimization algorithm has been implemented over various drive cycles for a model of the Lotus Evora 414E. This optimal solution presents a benchmark for sub-optimal controllers, with the aim of minimizing losses for driving conditions where the future is unknown. The current energy management system in the vehicle can now be assessed against the optimum global solution. Work will progress to develop an online controller that achieves results as close to the global solution as possible.
Benchmarking simulations can have a significant effect on design parameters in future powertrains. The optimization results discussed in this paper clearly highlight the abilities of the particular powertrain in the Lotus Evora 414E.
The current simulation is implemented directly in MATLAB using code customized from the original modular function explained in [8]. Although this paper has highlighted the reasons why a global optimization solution such as the one presented here cannot be used in real time, with some forward thinking a strategy can be implemented to make use of such 'smart optimizations'. The cost functions derived during DP can be used to construct a semi-global optimal controller that is well suited to any similar driving style. Further research is planned in this direction, to look at ways to include demand prediction and adaptation to different driving styles.

ACKNOWLEDGMENTS
We would like to thank Lotus for the generous support in preparing this work and this publication. Without their diligent work on the design and modelling of hybrid vehicles, this publication would not have been possible.