Prediction model for biochar energy potential based on biomass properties and pyrolysis conditions derived from rough set machine learning

ABSTRACT Biochar is a high-carbon-content organic compound that has potential applications in the field of energy storage and conversion. It can be produced from a variety of biomass feedstocks, such as plant-based, animal-based, and municipal waste, at different pyrolysis conditions. However, it is difficult to produce biochar on a large scale if the relationship between the type of biomass, operating conditions, and biochar properties is not well understood. Hence, machine learning-based data analysis is needed to relate biochar production parameters and feedstock properties to biochar energy properties. In this work, a rough set-based machine learning (RSML) approach has been applied to generate decision rules and classify biochar properties. The conditional attributes were biomass properties (volatile matter, fixed carbon, ash content, carbon, hydrogen, nitrogen, and oxygen) and pyrolysis conditions (operating temperature, heating rate, and residence time), while the decision attributes considered were yield, carbon content, and higher heating value. The rules generated were tested against a set of validation data and evaluated for their scientific coherency. Based on the decision rules generated, biomass with an ash content of 11–14 wt%, volatile matter of 60–62 wt% and carbon content of 42–45.3 wt% can generate biochar with promising yield, carbon content and higher heating value via a pyrolysis process at an operating temperature of 425°C–475°C. This work provided the optimal biomass feedstock properties and pyrolysis conditions for biochar production with high mass and energy yield.


Introduction
One of the challenges for today's society is the ever-increasing demand for, and depletion of, fossil fuels. In addition, the burning of fossil fuels contributes to greenhouse gas emissions and global warming. To reduce the dependence on fossil fuels and the concentration of greenhouse gases in the atmosphere, the development of materials for energy storage and conversion has become more crucial. On top of that, such materials must be cost-effective and of high performance to be industrially viable. In view of this, biochar has emerged as one of the effective options for energy storage and mitigation of global warming through carbon sequestration. Biochar is a charcoal-like substance that can be generated via thermochemical conversion of biomass, such as pyrolysis, hydrothermal carbonization, gasification, and torrefaction. Among the biomass-to-biochar conversion processes, pyrolysis has received great attention for its advantages, such as low capital and operating costs and versatility in applications. In the pyrolysis process, the biomass is heated at temperatures usually above 400°C in an oxygen-deprived environment, producing biochar, bio-oil, and syngas.
Biochar has many beneficial attributes, such as a porous structure, a high specific surface area (SSA), a high organic carbon content, a cation exchange capacity (CEC), and a stable structure. These attributes have sparked the interest of researchers around the world in applying biochar to environmental remediation, agriculture, and energy storage. Several studies and applications of biochar-based energy storage and conversion, such as fuel cells [1], oxygen electrocatalysts, hydrogen storage and production [2], and supercapacitors and lithium/sodium batteries [3], have been reported. In addition, Zaini et al. demonstrated steam co-gasification of biochar with landfill waste. The syngas generated through this method contained a higher H2 concentration, and the gasification efficiency also improved with the addition of biochar [4]. The promising results from these studies demonstrate the potential of biochar-based materials for energy storage and conversion.
The yield of biochar needs to be high so that most of the original biomass remains in the solid fraction instead of the gas or liquid fraction. In addition, the carbon content of biochar should be high for it to be used in energy applications. The energy content of biochar is normally reported as the higher heating value (HHV). The quality and the yield of biochar are often related to the biomass characteristics (ultimate and proximate analyses) and the pyrolysis operating conditions (operating temperature, heating rate, and residence time). For example, the pyrolysis temperature was reported to affect the structural and physicochemical properties of the biochar, including its SSA, pore volume, surface functional groups and elemental composition, due to the decomposition of organic matter, the release of volatiles, and the formation of micropores at higher temperatures [5]. Other operating conditions, such as the heating rate and residence time, also have a great influence on the biochar's yield and quality. During the pyrolysis process, reactions such as depolymerization, fragmentation and crosslinking occur in the lignocellulosic components of biomass [6]. Under different operating conditions, biochar yield and quality vary significantly because the different components of biomass follow different reaction mechanisms. Moreover, the properties of biochar, such as carbon content, HHV, SSA, pore volume, pH and CEC, can differ significantly across operating conditions and biomass types. The selection of optimal conditions for the production of biochar with desired properties requires extensive knowledge of dependencies and influencing factors, both quantitative and qualitative [7]. Therefore, a detailed study is needed to understand the underlying relationship between pyrolysis operating conditions, biomass type and biochar properties in order to achieve large-scale production of biochar.
Machine learning (ML) is based on automated data-driven model building, where ML algorithms recognize patterns in uncertain information using approaches such as rough set theory, evidence theory and fuzzy set theory. The approximation approach of these theories allows them to discover structural relationships within imprecise and noisy data. ML tools have been applied to biomass thermochemical conversion processes. Cheng et al. reported a random forest ML model for slow pyrolysis that incorporated life-cycle assessment and economic analysis for biochar production [8]. Khan et al. developed a biochar yield prediction model using an artificial neural network (ANN) coupled with metaheuristic algorithms. Their study identified pyrolysis temperature, residence time, and heating rate as the most influential factors for biochar yield [9]. Zhu et al. developed a prediction model for the yield and carbon content of biochar using random forest ML [10]. The results revealed that a random forest could reliably predict biochar yield and carbon content based on some biomass properties and pyrolysis conditions. In addition, various artificial intelligence (AI) techniques, such as the ANN, adaptive neuro-fuzzy inference system (ANFIS) and least squares support vector machine (LS-SVM) models, were employed to determine the heat capacity of pyrolysis biochar as a function of pyrolysis temperature, post-treatment process and temperature [11]. Statistical analysis in that study concluded that the LS-SVM model demonstrated the best predictive performance. The work was further extended to characterize the biomass heat capacity based on its bio-source type, appearance shape and temperature [12]. The ranking analysis showed that the cascade feedforward (CFF) neural network was the best-performing model in that study.
A recent review of ML studies for pyrolysis/gasification of biomass concluded that these models can encourage industrial-scale deployment, as the ML models minimize uncertainties [13]. However, black-box ML techniques often suffer from poor inherent interpretability. Rudin argues that interpretable ML techniques facilitate the extraction of useful interpretations from data-driven models [14]. In this view, the rough set machine learning (RSML) model is a promising approach to address this problem, as rules are generated from the data during the training process [15]. The credibility of the final rules can then be evaluated in view of physical mechanisms [16]. Rough set theory (RST) can handle data of poor or inconsistent quality, as it allows sets with ill-defined boundaries. RSML was developed to apply this theory to classification, prediction, and decision analysis tasks. The 'IF-THEN' rules in the RSML approach provide direct and transparent interpretations. Furthermore, the decision rules generated can also be validated through scientific coherency. In RSML, the best model is selected by analyzing both the statistical performance and subjective judgement. RSML categorizes all objects into different classes. These algorithms can be further utilized for approximation, decision rule induction, and object classification. For example, an RSML model was used to evaluate the storage integrity of carbon dioxide reservoirs based on geological data [17]. In addition, the RSML approach was applied in computer-aided molecular design to model non-quantifiable attributes such as fragrance [18]. Chong et al. employed RSML-generated rules to estimate the fuel properties of pyrolysis bio-oil (the higher heating value (HHV) and pH) based on pyrolysis temperature and feedstock characteristics [15].
Biochar has become more and more valuable due to its attractive properties. It can be used for many different engineering purposes, such as energy storage, adsorption, and soil amendment. Therefore, it is necessary to find the optimum feedstock properties and operating conditions for large-scale production of biochar for different applications. In this study, the energy application of biochar was considered. The energy content of biochar is normally reported as HHV, which is closely related to the carbon content. The energy yield of biochar was reported as the yield percentage and the relative HHV of biochar. Empirical models for the prediction of biochar yield, carbon content and HHV from biomass properties and pyrolysis conditions are still limited. Therefore, a data-driven RSML model was developed to estimate the biochar properties from the biomass characteristics and pyrolysis operating conditions. The biochar's mass yield, heating value and carbon content were predicted based on the feedstock characterization, including the proximate and ultimate analyses (i.e. volatile matter, fixed carbon, ash content, C, H, N, O), and the pyrolysis operating conditions (i.e. operating temperature, heating rate, and residence time). Here, RSML was used to handle multi-input and multi-output tasks. A biochar database was collected and compiled from various research articles, and three case studies with different decision attributes are presented. Based on the rules generated, the optimum feedstock properties and operating conditions are determined to produce biochar with high mass and energy yields along with a high carbon content.

Methodology
Figure 1 illustrates the methodology of biochar properties' prediction model development through RSML.

Step 1: identification of biochar requirements
The main applications of biochar include carbon sequestration, soil improvement, energy storage, and use as a feedstock for energy conversion materials. The main requirement for biochar is yield, which is the quantity indicator for the pyrolysis of biomass. In addition, a high carbon content is desirable for biochar used as an energy feedstock and storage material. For an energy storage and conversion material, HHV is an important attribute of biochar, as it indicates its energetic potential [19]. Therefore, the identification of biochar attributes such as yield, carbon content and HHV is crucial [20,21].

Step 2: data compilation
In this study, the biomass properties, pyrolysis conditions, and biochar properties were the inputs to the rough set machine learning model. Data were collected from various research articles. WebPlotDigitizer was used when the data were only given in a plot. Overall, the data consist of 445 samples from 49 literature sources. In this study, 70% of the database was used as training data, while the remaining 30% was used as the validation set.
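The 70/30 partition described above can be sketched in a few lines of Python. This is only an illustration: the paper does not state how the split was drawn, so the random shuffle, the `seed` parameter and the function name `split_dataset` are assumptions, and the integer records are placeholders for the 237 biochar yield samples reported in the Results section.

```python
import random


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of `samples` and split it into training and validation parts.

    Assumption: a simple random split; the paper does not describe its procedure.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder records standing in for the 237 yield samples.
records = list(range(237))
train, validation = split_dataset(records)
print(len(train), len(validation))  # 166 71
```

The resulting sizes agree with the 166 training objects and 71 validation samples quoted for the biochar yield case.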

Step 3: translation and classification of biochar attributes
The conditional attributes were identified and translated into measurable properties. These attributes include the pyrolysis operating parameters (temperature, residence time and heating rate) and the ultimate and proximate analyses of biomass. The requirements for biochar application (e.g. as an energy storage and conversion material) were translated to carbon content and HHV, with biochar yield being the quantitative indicator of the process. These requirements of biochar are classified based on their values into number-coded classes, as shown in Table 1. Based on the data collected, the interval of each class was defined to evenly distribute the amount of data between classes.
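One way to define class intervals that spread the data evenly is equal-frequency binning, sketched below. The yield values are hypothetical, the helper names `equal_frequency_edges` and `to_class` are illustrative, and the actual intervals used in the study are those of Table 1, which this sketch does not reproduce. Repeated values in the data could produce duplicate cut points, which the sketch does not handle.

```python
from bisect import bisect_right


def equal_frequency_edges(values, n_classes):
    """Return n_classes - 1 cut points that put roughly equal counts in each class."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_classes] for k in range(1, n_classes)]


def to_class(value, edges):
    """Map a value to a 1-based class code; a value equal to a cut point goes to the upper class."""
    return bisect_right(edges, value) + 1


# Hypothetical biochar yields in wt% (not taken from the paper's database).
yields = [12.0, 18.5, 22.1, 24.9, 27.3, 31.0, 33.4, 36.8, 41.2, 45.0]
edges = equal_frequency_edges(yields, 5)
codes = [to_class(y, edges) for y in yields]
print(edges)  # [22.1, 27.3, 33.4, 41.2]
print(codes)  # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```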

Step 4: development of the RSML model (rules generation in rough sets)
A simplified decision table for biochar yield is shown in Table 2. Each row in the decision table is an object, while the columns show attributes. The attributes are further categorized as conditional attributes (i.e. pyrolysis parameters and biomass properties), which describe the object, and decision attributes, which indicate the class of the object (i.e. yield, carbon content and HHV of the biochar sample). In this example, biochar yield ranging from 10% to 39% was classified into three class intervals: Class 1 with a yield of 10-19%; Class 2 with a yield of 20-29%; and Class 3 with a yield of 30-39%.
Indiscernible objects have the same values for the attributes considered. Thus, after the information table was completed, the conditional attributes were reduced by removing those that do not affect the final decision. Hence, fewer attributes are considered, which helps in reducing the redundancy of data. In rough set theory, a reduct is a subset of indispensable attributes, and the intersection of all reducts is known as the core. The core set cannot be ignored by the decision system. In Table 2, the indiscernibility (IND) of the objects (biomass) with respect to each conditional attribute (C1, C2, and C3) was introduced. Let R be the family of equivalence relations; the indiscernibility of the biomass samples B1 to B6 is shown in equation (1).
Next, the indiscernibility of the attribute set {C2, C3} is given by equation (2), whereas those of the attribute sets {C1, C3} and {C1, C2} are defined by equations (3) and (4), respectively. Equations (3) and (4) show that the indiscernibility relations of the attribute sets {C1, C3} and {C1, C2} are both equivalent to the indiscernibility relation of R, IND(R), shown in equation (1). This indicates that attributes C2 and C3 are each dispensable, since removing either one does not affect the results from the original table. However, attribute C1 is indispensable: as shown in equation (2), removing it changes the result.
In other words, the classification from the original table with all of the attributes {C1, C2, C3} is identical to the classification determined by attributes {C1, C2} or {C1, C3}. Thus, the attribute sets {C1, C2} and {C1, C3} are the reducts for the dataset R, and attribute C1 is the core of the decision system.
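The reduct and core reasoning above can be checked mechanically. The sketch below uses a hypothetical six-object table (the values are invented so that the outcome mirrors the text; they are not the contents of Table 2) and a brute-force search over attribute subsets. It is not the algorithm implemented in ROSE2, and the function names are illustrative.

```python
from itertools import combinations

# Hypothetical coded values for objects B1..B6 (NOT the paper's Table 2).
TABLE = {
    "B1": {"C1": 0, "C2": 0, "C3": 0},
    "B2": {"C1": 0, "C2": 1, "C3": 1},
    "B3": {"C1": 1, "C2": 0, "C3": 0},
    "B4": {"C1": 1, "C2": 1, "C3": 1},
    "B5": {"C1": 0, "C2": 0, "C3": 0},
    "B6": {"C1": 1, "C2": 1, "C3": 1},
}
ATTRS = ["C1", "C2", "C3"]


def partition(table, attrs):
    """Indiscernibility classes: objects grouped by identical values on `attrs`."""
    blocks = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, set()).add(obj)
    return {frozenset(block) for block in blocks.values()}


def find_reducts(table, attrs):
    """Minimal attribute subsets that induce the same partition as all attributes."""
    full = partition(table, attrs)
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            # Skip supersets of a reduct already found: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if partition(table, subset) == full:
                found.append(subset)
    return found


reducts = find_reducts(TABLE, ATTRS)
core = set.intersection(*(set(r) for r in reducts))
print(reducts)  # [('C1', 'C2'), ('C1', 'C3')]
print(core)     # {'C1'}
```

With this toy table, dropping C1 merges objects that the full attribute set keeps apart, which is why C1 appears in every reduct and forms the core.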
The reduct sets were then used to derive a set of decision rules to predict the biochar yield in the later stage. From the reduct set {C1, C3}, a set of decision rules is generated by applying a rough set algorithm to predict biochar properties. The cores, reducts and decision rules were generated with the Learning from Examples Module, version 2 (LEM2) algorithm in the Rough Sets Data Explorer (ROSE2) software. Decision rules in the form of 'IF-THEN' were generated using the LEM2 approach by considering the upper and lower approximations. For instance, an 'IF-THEN' rule can be interpreted as 'if the conditional attribute is A, then the decision attribute is B', where A and B are the hypothesis and conclusion, respectively [22]. The rules generated are filtered based on their strength, coverage, and certainty, which are defined in step 5. The selected rules are then used for the prediction model of biochar properties.

Step 5: validation of rules
The rules generated from RSML may not always be deterministic, so the performance of each rule is assessed using three factors: strength, certainty, and coverage, which are defined from a Bayesian probability perspective [23]. The strength (σ_x) of a rule is the number of objects in the information table that obey the rule, supp_x(C, D), divided by the total number of objects, card(U), as shown in equation (5). The certainty (cer_x) of a rule, given by equation (6), is the probability that an object matching the rule's conditions belongs to the rule's decision class, that is, supp_x(C, D) divided by the number of objects satisfying the conditions, card(C(x)). The coverage (cov_x) of a rule, shown in equation (7), is calculated by dividing supp_x(C, D) by the total number of objects in the same decision class, card(D(x)).
$$\sigma_x(C, D) = \frac{\mathrm{supp}_x(C, D)}{\mathrm{card}(U)} \quad (5)$$
$$\mathrm{cer}_x(C, D) = \frac{\mathrm{supp}_x(C, D)}{\mathrm{card}(C(x))} \quad (6)$$
$$\mathrm{cov}_x(C, D) = \frac{\mathrm{supp}_x(C, D)}{\mathrm{card}(D(x))} \quad (7)$$
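As a worked illustration of the three factors, the sketch below computes them from plain counts. The function and parameter names are illustrative. The numbers are those quoted later in the paper for rule 2 of yield reduct 1 (4 supporting objects out of 166, and 5 class 1 objects); the count of 4 objects matching the conditions is an inference from the reported certainty of 100%, not a figure stated in the paper.

```python
def rule_metrics(n_support, n_condition, n_decision, n_total):
    """Strength, certainty and coverage of a decision rule from object counts.

    n_support   : objects matching both the conditions and the decision, supp_x(C, D)
    n_condition : objects matching the conditions, card(C(x))
    n_decision  : objects in the decision class, card(D(x))
    n_total     : all objects in the information table, card(U)
    """
    strength = n_support / n_total
    certainty = n_support / n_condition
    coverage = n_support / n_decision
    return strength, certainty, coverage


# n_condition = 4 is inferred from the reported 100% certainty (assumption).
strength, certainty, coverage = rule_metrics(
    n_support=4, n_condition=4, n_decision=5, n_total=166
)
print(f"{strength:.0%} {certainty:.0%} {coverage:.0%}")  # 2% 100% 80%
```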
The rules generated from the RSML were validated using the remaining 30% of the data. The rules were then ranked based on the verification results. The performance of the rules was calculated using certainty, coverage, and strength. Conflicting rules were identified and analyzed. Rules could be combined to increase the coverage of the model. The rules with the highest certainty and coverage were selected as the prediction model of the research. The steps were repeated for the different biochar attributes.

Step 6: selection of rules
Once verification was done, the rules generated from the algorithm were ranked based on their performance in the validation phase. The rules with high values of certainty and coverage were taken into consideration for the prediction model. Among these well-performing rules, those that were also scientifically coherent were selected and used for the prediction model.

Results and discussions
In this study, yield, carbon content, and higher heating value were chosen as the decision attributes of biochar. The conditional attributes were the pyrolysis temperature, heating rate, residence time, and the proximate and ultimate analyses of biomass. The relationship between conditional attributes and decision attributes was derived from the rules and reducts generated and is discussed in detail in the following sections.
The information table for the study of biochar yield contains 237 sets of data. For biochar carbon content, 197 sets of data were collected, and 113 sets were collected for the study of biochar HHV. The 10 conditional attributes were the proximate analysis of biomass (volatile matter, fixed carbon, and ash content), the ultimate analysis of biomass (C, H, N, O) and the operating conditions of pyrolysis (temperature, heating rate and residence time). Of the data, 70% were used as training data and the rest were used to validate the rules generated. From the results, three cores were generated with six reducts when yield was the decision attribute, two cores were generated along with six reducts when carbon content was the decision attribute, and one core and seven reducts were generated when the decision attribute was HHV.
As shown in Table 3, the operating conditions of pyrolysis, especially the operating temperature, appeared to be the most important attributes governing the yield, carbon content, and HHV of the biochar generated. The cores generated make sense from a scientific point of view. The temperature affects biochar yield because the lignocellulosic components of biomass decompose at specific temperatures. At high temperatures, the oxygenated species, which are the more reactive component of biochar, are reduced, while the more stable carbon remains unchanged. Thus, the carbon content of biochar is also affected by temperature.

Rules generated for biochar yield
From the ROSE 2.0 software, 249 rules were induced from all six reducts: 15 rules classified objects into class 1 of biochar yield, 69 rules into class 2, 85 rules into class 3, 56 rules into class 4, and 24 rules into class 5. These rules were then validated using 30% of the data. After the validation, the rules with coverage higher than 10% and certainty higher than 60% were chosen, and the rest were eliminated because of their low predictive power. The rules chosen after validation are shown in Table 4. Rule 2 from reduct 1 states that a biochar yield of less than 20% is obtained when the operating temperature is higher than or equal to 475°C and the heating rate is lower than 1°C/min, with a certainty of 100%. From the results generated by ROSE 2.0, this rule is supported by 4 out of 166 data, giving it a strength of 2%. Four out of five class 1 data meet the rule criteria, giving this rule a coverage of 80%.
Similarly, rule 24 from reduct 2 can be described as follows: when biomass has an ash content (AC) between 6.6 and 10.695 wt% and the operating temperature of pyrolysis is between 375°C and 525°C, a biochar yield of 30-40 wt% can be obtained with a certainty of 100%. The biochar yield increases when the temperature decreases. This agrees with general knowledge of biomass pyrolysis, where bio-oil and biogas yields increase with temperature while biochar yield decreases, due to the thermal decomposition of the lignocellulosic components of biomass.
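To make the 'IF-THEN' form concrete, rule 24 can be written as a small data structure and applied to a sample. This is a minimal sketch: the `Rule` class, the `classify` helper and the two test samples are hypothetical, and the half-open interval convention [low, high) is an assumption borrowed from the interval notation used elsewhere in the rules; the paper does not state which bounds are inclusive for this rule.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """An IF-THEN rule: every attribute must fall in its half-open interval [low, high)."""

    conditions: dict  # attribute name -> (low, high)
    decision: str     # label of the decision class

    def matches(self, sample):
        return all(
            low <= sample[name] < high
            for name, (low, high) in self.conditions.items()
        )


def classify(sample, rules):
    """Return the decision of the first matching rule, or None if no rule fires."""
    for rule in rules:
        if rule.matches(sample):
            return rule.decision
    return None


# Rule 24 from yield reduct 2, as described in the text (AC in wt%, T in degrees C).
rule_24 = Rule({"AC": (6.6, 10.695), "T": (375.0, 525.0)}, "yield 30-40 wt%")

# Hypothetical samples, for illustration only.
print(classify({"AC": 8.0, "T": 450.0}, [rule_24]))  # yield 30-40 wt%
print(classify({"AC": 8.0, "T": 600.0}, [rule_24]))  # None
```

Returning `None` when no rule fires keeps uncovered samples visible, which matters because the selected rules do not cover every object.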

Scientific coherency of rules (biochar yield)
Two types of reactions occur during the pyrolysis of biomass. The primary reaction is the thermal decomposition of cellulose into char at a relatively low temperature. The secondary reaction occurs at a higher temperature, where rapid volatilization converts the biochar formed in the primary reaction into bio-oil and syngas, lowering the biochar yield [24]. Hence, a low temperature is preferable for obtaining a higher biochar yield, as it suppresses the secondary reaction. At a high heating rate, the heat supplied raises the temperature of the biomass and the heat flow inside it, leading to the volatilization of biomass. Therefore, the effects of the secondary reaction become significant, whereas more biochar is produced at a low heating rate [25]. Ibn Ferjani et al. showed that the biochar yield decreased with temperature and plateaued at 500°C, and that the biochar yield was lower at a long residence time [26].
Volatile matter (VM) is the matter lost when biomass is subjected to thermal degradation. VM is devolatilized to the vapour phase during the pyrolysis of biomass. The rules exhibit a decreasing trend in the volatile matter of biomass as biochar yield increases. During the pyrolysis of biomass, the degree of carbonization is an important indicator of the yield and the quality of the biochar produced. As the carbonization of biomass proceeds, the volatile matter escapes from the biomass and biochar is left as the residue. Therefore, lower volatile matter correlates with a higher biochar yield, since less material is lost as volatiles during pyrolysis. The weight loss of biochar occurs in two phases. At temperatures below 600°C, weight loss is mainly contributed by the evolution of light organic species and gases; above 600°C, the volatilization of H2O, CO2, CO, CH4 and H2 dominates the weight loss.
Based on the rules shown in Table 4, a similar trend can be observed. For the low biochar yield classes, the rules suggest a high pyrolysis temperature, heating rate and residence time, which promote the secondary reaction. For example, rule 12 in reduct 1 gives (VM ≥ 82.85) & (FC ≥ 12.985) & (T ≥ 475) & (HR ≥ 6) for class 2 of biochar yield, whereas rule 47 in reduct 1 gives (VM < 59.615) & (T < 325) for class 5. The decrease in VM, which contributes to the improvement in yield class, can also be observed from the rules.
Many studies have shown that de-ashing pretreatment of biomass, such as acid washing, has an impact on biochar yield. Mazerolle et al. compared the biochar yield of the same type of biomass with and without washing. Unwashed biomass yielded more biochar than washed biomass due to its higher ash content [27]. Chow et al. co-pyrolyzed biomass with palm oil sludge, which has a high ash content. The results showed that the biochar yield increased with the amount of sludge; the ash content of biomass had a positive effect on biochar yield [28]. The rules based on the proximate analysis of biomass agree with the findings of these studies. For example, the rules from reduct 1 showed an increasing trend of biochar yield with decreasing volatile matter. In yield reduct 2, an increasing trend of ash content was shown at higher biochar yield. The rules stated that class 2 (20 wt% < Yield ≤ 30 wt%) requires biomass with an ash content lower than 6.6 wt%, while the rules for classes 3, 4 and 5 predicted a biochar yield above 30 wt% when pyrolysis is conducted on biomass with an ash content higher than 6.6 wt%.

Performance of rules on the validation set (biochar yield)
The rules generated were tested against the remaining 30% of the data, which consists of 71 samples. Applying the rules to this dataset allowed them to be selected based on their coverage and accuracy. Figure 2 illustrates the rule validation: the amount of data that fits a specific rule and the accuracy of classification. For example, rule number 3 has 11 sample points fitting the description of the rule, with 8 of those correctly classified as class 2. This gives an accuracy of 73% for rule 3 (Figure 2). There are 19 sample points in the validation set classified as class 2, of which 8 correctly fit rule 3, giving rule 3 a coverage of 42%.
As shown in Figure 2, 18 rules from reduct 1 have an accuracy higher than 60%, 9 of which have a coverage higher than 10%. These rules are selected for the prediction model since they provide acceptable prediction power in terms of coverage and accuracy. Similarly, the rules from reduct 2 up to reduct 6 were selected based on the same criteria, and the rules are shown in Table 4.
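The validation arithmetic for rule 3 and the selection thresholds can be reproduced with a short sketch. The validation set below is synthetic: only its counts (71 samples, 19 in class 2, 11 matching rule 3, 8 of those in class 2) come from the text, while the class of the remaining samples, the `hit` flag standing in for the rule's conditions, and the function names are assumptions made for illustration.

```python
def validate_rule(matches, rule_class, samples):
    """Accuracy and coverage of one rule on a labelled validation set."""
    matched = [s for s in samples if matches(s)]
    correct = sum(1 for s in matched if s["cls"] == rule_class)
    in_class = sum(1 for s in samples if s["cls"] == rule_class)
    accuracy = correct / len(matched) if matched else 0.0
    coverage = correct / in_class if in_class else 0.0
    return accuracy, coverage


def is_selected(accuracy, coverage, min_accuracy=0.60, min_coverage=0.10):
    """Selection criteria from the text: accuracy above 60% and coverage above 10%."""
    return accuracy > min_accuracy and coverage > min_coverage


# Synthetic validation set reproducing the counts reported for rule 3.
samples = (
    [{"hit": True, "cls": 2}] * 8       # match rule 3, truly class 2
    + [{"hit": True, "cls": 3}] * 3     # match rule 3, other class (assumed class 3)
    + [{"hit": False, "cls": 2}] * 11   # class 2 samples that rule 3 misses
    + [{"hit": False, "cls": 4}] * 49   # remaining samples (class assumed)
)

accuracy, coverage = validate_rule(lambda s: s["hit"], 2, samples)
print(len(samples))                                  # 71
print(f"{accuracy:.0%} {coverage:.0%}")              # 73% 42%
print(is_selected(accuracy, coverage))               # True
```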

Rules generated for the biochar carbon content
Through ROSE 2.0 software, the cores generated for the biochar carbon content study were pyrolysis temperature and residence time, as shown in Table 3. Six reducts were generated which include proximate and ultimate analysis of biomass.A total of 279 rules were generated from the software, with 14 rules for class 1, 22 rules for class 2, 77 rules for class 3, 103 rules for class 4 and 63 rules for class 5.

Scientific coherency of rules (biochar carbon content)
Biochar carbon content contributes significantly to the combustibility of biochar. A high pyrolysis temperature and a high biomass carbon content favour the production of biochar with a high carbon content [29], which is confirmed by the current study. When the carbon content of the feedstock increases, the biochar carbon content is greater [30]. Although decarboxylation and decarbonylation occur in biomass during pyrolysis, releasing CO, CO2 and other carbon compounds, a large portion of the carbon stays in the biochar [31]. Many studies have also found that the H/C ratio of biochar decreases as the pyrolysis temperature rises [32,33]. Zhu et al. studied biochar carbon content based on biomass characteristics and showed that it decreases with high ash content in the feedstock [10]. Since the ash in biomass is difficult to decompose during the pyrolysis process, the relative carbon content is low at higher ash contents, owing to retarded heat transfer.
As shown in Table 5, the rules generated for biochar carbon content were consistent with scientific theory and previous studies. Most of the class 5 rules suggest that a high temperature (≥525°C) and a high feedstock carbon content are needed to produce high-carbon-content biochar. On the other hand, a low operating temperature and a long retention time yield biochar with a low carbon content. In reduct 3, a decrease in ash content yields a higher biochar carbon content. With a high ash content, the organic compounds are protected from thermal degradation, which hinders the carbonization of biomass. Although rule 16 from reduct 3 seems to contradict this theory by suggesting AC in [11.2, 14.25), the effect of the high AC is compensated by a high temperature (≥325°C) and residence time (≥0.58 h). No obvious trend was found for the fixed carbon content of biomass. The carbon content in biochar, however, decreases as the VM in biomass increases, due to the devolatilization of C in biomass into the vapour phase. This can be seen from rule 1 in reduct 1, where a low carbon content (class 1) was observed for biochar if (VM in [79.69, 80.645)) & (N < 0.545). Meanwhile, rule 39 gives (VM in [68.9, 77.42)) & (N in [1.675, 2.195)) & (T ≥ 425) for class 5. Moreover, the presence of N in biomass captures some carbon to form organic nitrates in biochar, thus resulting in a higher carbon content in biochar [34]. To conclude, a higher carbon content in biochar can be achieved with lower VM and higher N content.

Performance of rules on the validation set (biochar carbon content)
Fifty-nine sets of data were used to validate the rules induced by ROSE 2.0, and the rules for the prediction model were selected based on this validation. Figure 3 shows the validation results for carbon content reduct 1, illustrating the amount of data that fits a specific rule and the accuracy of classification. The selected rules are shown in Table 5. Very few rules were selected for classes 1 and 2 compared to the other classes. This is because biochar rarely has a carbon content below 40%, owing to its nature as a high-carbon compound. Besides, a low carbon content in biochar is undesirable in most cases, especially for carbon sequestration and soil amendment. As an example of the validation, rule number 34 has 6 sample points fitting the description of the rule, all 6 of which are classified as class 3. This gives an accuracy of 100% for rule 34, with a coverage of 20%.

Rules generated for the biochar higher heating value
For the biochar HHV study, only one core was generated, namely the operating temperature. The deterministic rules generated were based on the reducts shown in Table 3. In total, 200 rules were generated in terms of biochar HHV, with 14 deterministic rules for class 1, 37 rules for class 2, 55 rules for class 3, 63 rules for class 4 and 31 rules for class 5.

Scientific coherency (biochar higher heating value)
The heating value of biochar can be determined with a bomb calorimeter or calculated from its elemental composition using a mathematical model. Many models for HHV prediction based on ultimate and proximate analyses have been developed. Qian et al. established a mathematical model for HHV calculation from biochar proximate and ultimate analyses with a low average absolute error and average bias error [21]. There are other models that estimate the HHV of biochar from biochar properties [35]. However, these models only calculate the HHV from the properties or composition of the product itself. In contrast, the model generated from RSML was able to predict biochar HHV based on its feedstock properties and pyrolysis conditions.
The higher heating value (HHV) indicates the potential of biochar as a biofuel. He et al. concluded that the biochar HHV increases noticeably as the temperature increases from 300°C to 600°C [24]. The release of volatile matter and the composition of fixed carbon raise the carbonization degree as the pyrolysis temperature increases. The carbon content in the biochar rises, while the oxygen content falls dramatically, resulting in a steady increase in the HHV of biochar. However, the energy conversion efficiency of biochar decreases at a higher pyrolysis temperature (up to 500°C) due to the decrease in biochar yield. At lower temperatures, most of the energy was retained in the biochar, while at higher temperatures the energy was converted to syngas as the gas yield increased with temperature.
Based on the rules generated, a temperature of around 450°C appears to separate the class 3 and class 4 rules. Most of the class 4 rules suggest a temperature of 475°C or higher, whereas class 3 rules suggest a temperature below 425°C. The negative effect of AC on HHV was also shown in the rules. Rule 13 in reduct 4 provided AC in [6.6, 7.47)
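The trade-off between a rising HHV and a falling yield can be made concrete with a short sketch. It assumes the common definition of energy yield as mass yield multiplied by the ratio of biochar HHV to biomass HHV; the text does not state which definition of energy conversion efficiency was used. All numbers are hypothetical and only show the direction of the effect.

```python
def energy_yield_pct(
    mass_yield_pct: float, hhv_biochar: float, hhv_biomass: float
) -> float:
    """Energy retained in the biochar as a percentage of the feedstock energy.

    Assumed definition: energy yield = mass yield * HHV_biochar / HHV_biomass.
    HHV values must share the same unit (e.g. MJ/kg).
    """
    return mass_yield_pct * hhv_biochar / hhv_biomass


HHV_BIOMASS = 18.0  # MJ/kg, hypothetical feedstock

# Hypothetical runs: HHV rises with temperature while mass yield falls.
low_t = energy_yield_pct(mass_yield_pct=45.0, hhv_biochar=24.0, hhv_biomass=HHV_BIOMASS)
high_t = energy_yield_pct(mass_yield_pct=28.0, hhv_biochar=29.0, hhv_biomass=HHV_BIOMASS)

print(f"low temperature:  {low_t:.1f}% of feedstock energy retained")
print(f"high temperature: {high_t:.1f}% of feedstock energy retained")
```

In this toy case the drop in mass yield outweighs the gain in HHV, so less of the feedstock energy remains in the solid product at the higher temperature.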

Performance of rules on the validation set (biochar higher heating value)
To validate the accuracy and coverage of the rules, 34 sets of validation data were used. Figure 4 shows the validation results of rules from HHV reduct 2. The rules with high accuracy and coverage were selected and are shown in Table 6. None of the rules for class 1 was selected because of their low accuracy and coverage. This may also be caused by the very small amount of class 1 data in the validation set. Additionally, the small amount of data in the training set can cause the generated rules to have weak classification power. Biochar naturally tends to have a high HHV because of its high carbon content, which explains the lack of data in class 1. The lack of class 1 rules suggests that it is difficult to produce biochar with an HHV lower than 15 MJ/kg. Biochar most commonly has an HHV within 20-30 MJ/kg, and RSML was able to classify between class 3 (20-25 MJ/kg) and class 4 (25-30 MJ/kg) with relatively high accuracy and coverage.
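The class boundaries mentioned above can be expressed as a small binning function. The class 3 and class 4 ranges and the 15 MJ/kg limit for class 1 come from the text; the class 2 range (15-20 MJ/kg), the open-ended class 5 (above 30 MJ/kg) and the treatment of values exactly on a boundary are assumptions, since Table 1 is not reproduced here.

```python
from bisect import bisect_left

# Upper bounds (MJ/kg) of classes 1 to 4; class 5 is everything above 30.
# The 15/20 split between classes 1 and 2 is partly assumed (see lead-in).
HHV_UPPER_BOUNDS = (15.0, 20.0, 25.0, 30.0)


def hhv_class(hhv_mj_per_kg: float) -> int:
    """Map an HHV value to a class 1-5 using intervals of the form (low, high]."""
    return bisect_left(HHV_UPPER_BOUNDS, hhv_mj_per_kg) + 1


for value in (12.0, 22.0, 25.0, 27.5, 31.0):
    print(f"HHV = {value:>4} MJ/kg -> class {hhv_class(value)}")
```

Using `bisect_left` places a value that sits exactly on a bound in the lower class, which matches upper-inclusive intervals of the form 20 MJ/kg < HHV ≤ 25 MJ/kg.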

Determination of optimum attributes for the desired biochar
Table 7 summarizes the selected rules to obtain biochar with desirable attributes. In general, biochar with high yield, carbon content and HHV is preferred. Therefore, class 4 and class 5 rules were combined and summarized to show the range of biomass properties and pyrolysis conditions needed to obtain such biochar. It was observed that producing biochar with class 5 yield, carbon content and HHV from the same type of biomass and pyrolysis conditions is difficult because the rules contradict one another. For example, class 5 biochar yield requires a low temperature (T < 325°C) and a retention time of 0.2 < RT < 0.6, but the class 5 rules for biochar carbon content and HHV require a high temperature (T > 525°C) and a retention time of RT > 0.8. However, overlapping conditional attributes can be observed in the class 4 rules across the decision attributes: with ash content between 11 and 14 wt%, VM between 60 and 62 wt%, C between 42 and 45.3 wt%, and a pyrolysis temperature of 425°C-475°C, the biochar produced will have a high yield, carbon content and HHV. These results will help in selecting different types of feedstock in different ratios for co-pyrolysis to obtain the required biochar properties. Yang et al. pyrolyzed eight different biomass feedstocks (Masson pine wood, Chinese fir wood, Chinese fir bark, bamboo leaves, bamboo sawdust, Miscanthus, pecan shells and rice straw) at 350°C and 500°C with a heating rate of 5°C/min and a residence time of more than 2 h. The fixed carbon of the feedstocks was between 10 and 28 wt%, and the volatile matter was in the range of 60-80 wt% (except pecan shells, which have VM near 50 wt%). All biochar produced had an HHV of 25 MJ/kg or higher. These conditions are very close to those stated for HHV in Table 7, and the biochar obtained falls in classes 4 and 5. Among these feedstocks, rice straw had an ash content near 15 wt%, which provided a biochar yield of around 50 wt% [36]. This result satisfies the conditions listed for biochar yield in class 4.
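The overlapping class 4 window can be used as a simple screening check for a feedstock or a co-pyrolysis blend. The sketch below is only illustrative: the attribute names, the inclusive treatment of the range endpoints and the mass-weighted linear mixing of blend properties are assumptions, and the two feedstocks are hypothetical. Linear mixing ignores any synergistic effects during co-pyrolysis.

```python
from typing import Dict, List, Mapping, Tuple

# Overlapping class 4 ranges quoted in the text (wt% for composition, °C for T).
CLASS4_WINDOW: Dict[str, Tuple[float, float]] = {
    "ash_wt_pct": (11.0, 14.0),
    "vm_wt_pct": (60.0, 62.0),
    "c_wt_pct": (42.0, 45.3),
    "temperature_c": (425.0, 475.0),
}


def attributes_outside_window(
    candidate: Mapping[str, float],
    window: Mapping[str, Tuple[float, float]] = CLASS4_WINDOW,
) -> List[str]:
    """Return the attribute names that fall outside the window (endpoints inclusive)."""
    return [
        name
        for name, (low, high) in window.items()
        if not (low <= candidate[name] <= high)
    ]


def blend(a: Mapping[str, float], b: Mapping[str, float], fraction_a: float) -> Dict[str, float]:
    """Mass-weighted average of two feedstocks (assumed linear mixing)."""
    return {key: fraction_a * a[key] + (1.0 - fraction_a) * b[key] for key in a}


# Hypothetical feedstocks: each one alone misses the window.
feed_a = {"ash_wt_pct": 15.0, "vm_wt_pct": 62.0, "c_wt_pct": 42.0}
feed_b = {"ash_wt_pct": 9.0, "vm_wt_pct": 60.0, "c_wt_pct": 46.0}

mixed = {**blend(feed_a, feed_b, fraction_a=0.6), "temperature_c": 450.0}
too_hot = {**mixed, "temperature_c": 500.0}

print(attributes_outside_window(mixed))    # attributes violating the window at 450 °C
print(attributes_outside_window(too_hot))  # same blend pyrolyzed at 500 °C
```

A check of this kind only screens candidates against the reported ranges; it does not replace the full set of decision rules in Table 7.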

Conclusion
In this work, three RSML models were generated to predict the yield, carbon content and HHV of biochar based on pyrolysis conditions and biomass compositions. The results were validated, and the rules that showed high accuracy, high coverage and scientific coherence were chosen for the prediction model. The results from the biochar yield study showed that low temperature, low heating rate, low residence time and low biomass volatile matter provide a high biochar yield. Similarly, high biomass carbon content, high operating temperature, high heating rate and low residence time were predicted to give a high biochar carbon content, and an increase in the volatile matter and fixed carbon of biomass increases the HHV of biochar. Based on the rules generated for these three decision attributes, the optimum biomass properties and pyrolysis conditions were determined for biochar with a high mass and energy yield. It was observed that with ash content between 11 and 14 wt%, VM between 60 and 62 wt%, C between 42 and 45.3 wt%, and a pyrolysis temperature of 425°C-475°C, the biochar produced will have a high yield, carbon content and HHV. After reviewing and comparing with other biochar-related studies and theories, the selected rules have high predictive power and are plausible from a scientific point of view. The inherent interpretability of RSML models is a distinct advantage over black-box ML tools in decision-support applications. The decision rules in the RSML approach identified the required biomass composition and pyrolysis conditions to generate biochar with a targeted range of yield, carbon content and HHV via the pyrolysis/co-pyrolysis process. There are two key directions for future work. First, alternative ML techniques, such as Bayesian networks or logical analysis of data, can also be applied to biochar property prediction. Second, ML models can be developed focusing on properties and soil interactions relevant to specific biochar applications (e.g. carbon sequestration). The current results help to select the combination of different types of feedstocks for biochar production in energy applications.
for class 3 (20 MJ/kg < HHV ≤ 25 MJ/kg), while rule 24 in reduct 4 showed AC in [0.86, 1.915) for class 4 (25 MJ/kg < HHV ≤ 30 MJ/kg), which means that a large reduction in the AC of biomass is required to improve the HHV of biochar from class 3 to class 4. The rules based on the ultimate analysis of biomass showed an increasing trend of HHV with C, H and N. For example, in reduct 5, rules 7 to 9 and rule 21 provide C < 35.4, C in [41.43, 42.075), and (C in [47.5, 49.185)) & (T ≥ 475) for classes 2, 3 and 4 of HHV, respectively.

Figure 3. Validation results of rules from carbon content reduct 1.

Table 1. Classification of decision attributes.

Table 2. Example of an information table.

Table 7. Range of attributes for class 4 and class 5 biochar attributes.