Pierre L. Bhoorasingh, Belinda L. Slakman, Fariba Seyedzadeh Khanshan, Jason Y. Cain, Richard H. West.
Analysis for the manuscript with the above name.
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns
import matplotlib
The first analysis uses kinetics generated for an earlier version of the manuscript, stored in kinetics.csv, and focuses on the H Abstraction reactions.
kinetics = pd.read_csv('kinetics.csv')
h_abs = kinetics[kinetics['family']=='H_Abstraction']
h_abs.head()
h_abs.plot.scatter('AutoTST_fwd', 'Rate Rules', loglog=True,
xlim=(1e-1,1e7), ylim=(1e-1,1e7), figsize=(6,6))
plt.gca().set_aspect('equal')
Note the significant banding, where many reactions are given the same rate estimate. Lets parse the "comment" that RMG puts on the reaction estimate, to see how each was estimated:
def getRMGSource(rule):
rule = rule.lower().strip()
if 'average' in rule:
return 'Average'
elif 'estimated' in rule:
return 'Other estimate'
elif 'exact' in rule:
return 'Exact match'
else:
return 'Unknown'
h_abs = h_abs.dropna(subset=['Rule summary'])
h_abs['RMGSource'] = h_abs.apply(lambda row: getRMGSource(row['Rule summary']), axis=1)
#sns.set_style('ticks')
fg = sns.FacetGrid(data=h_abs, hue='RMGSource', size=6, aspect=1.0, xlim=(1e-1,1e7), ylim=(1e-1,1e7))
ax = fg.map(plt.scatter, 'AutoTST_fwd', 'Rate Rules')
ax.set(xscale="log", yscale="log")
ax.axes[0][0].plot((0.1, 1e7), (0.1, 1e7), ls='--', label='Parity', color='gray')
plt.legend()
plt.show()
It looks like the "Other estimate" causes most of the banding, and causes Rate Rules to over-estimate the rate relative to AutoTST. Lets calculate some discrepancy histograms, showing how AutoTST varies with respect to the other source (Rate Rules or LLNL)
h_abs['DifferenceRMG'] = np.log(h_abs['AutoTST_fwd']/h_abs['Rate Rules'])
h_abs.head()
fg = sns.FacetGrid(data=h_abs, hue='RMGSource', size=6, hue_kws={'alpha': [1.0, 0.7, 0.5]})
ax = fg.map(plt.hist, 'DifferenceRMG', bins=np.arange(-10.5, 11.5)).add_legend()
"Exact match" matches closest (rate rule and AutoTST), and for "Other estimate" the AutoTST is biased to be slower than the Rate Rules (probably the Rate Rules estimate is on average too fast). Now compare AutoTST with the LLNL (Sarathy) model:
h_abs['DifferenceSarathy'] = np.log(h_abs['AutoTST_fwd']/h_abs['Sarathy'])
h_abs.head()
h_abs.hist('DifferenceSarathy', bins=np.arange(-10.5, 11.5))
A slight overall bias, with AutoTST usually thinking things should be slower than Sarathy thinks they should be. Compare that with a similar plot for AutoTST vs RMG rate rules:
h_abs.hist('DifferenceRMG', bins=np.arange(-10.5, 11.5))
Still shows that AutoTST thinks things are slower than RMG thinks. Slightly less of a bias, suggesting AutoTST is closer to RMG estimates than Sarathy's estimates? Let's compare RMG to Sarathy directly:
h_abs['DifferenceSarathyRMG'] = np.log(h_abs['Rate Rules']/h_abs['Sarathy'])
h_abs.hist('DifferenceSarathyRMG',bins=np.arange(-10.5, 11.5))
An updated version of RMG now calculates the nodal distance between the most specific node in the hierarchical decision tree, and the node that was actually used to provide the estimate. If the tree was full of data, this would be zero, but when you have to climb the tree looking for data, it is greater than zero, and you presumably have a less accurate rate (or at least a less precisely defined estimate).
However, the updated version of RMG also changed how the nodal discance information is used to decide which rules to average, to give (hopefully) better estimates. Therefore the "RMG rate rule" estimates have changed. We repeat with the new analysis, stored in the spreadsheet kineticsNEW.csv:
new_kinetics = pd.read_csv('kineticsNEW.csv')
new_kinetics.head()
# The 'AutoTST_rev' column contains some strings, so the numbers are also typed as strings by Pandas.
# Convert them to numbers when possible.
def float_if_possible(k):
"Like `float()` but returns the original input if it can't be converted."
try:
return float(k)
except:
return k
new_kinetics.ix[:,'AutoTST_rev'] = map(float_if_possible,new_kinetics.ix[:,'AutoTST_rev'])
new_kinetics.ix[8:10]
Let's see how many of the rate rule estimates have actually changed with the new version of RMG
print len(new_kinetics[new_kinetics['Rate Rules'] == new_kinetics['Rate Rules (New)']]), " the same"
print len(new_kinetics[new_kinetics['Rate Rules'] != new_kinetics['Rate Rules (New)']]), " have changed"
In RMG (and thus the CSV files) the units are combinations of [m,mol,s], so to plot in cm3/mol/s, as expected by most combustion chemists, we must multiply the rate coefficients for the bimolecular reactions (H_Abstraction and R_Addition_MultipleBond) by 106
# Hydrogen abstraction. All rates need converting.
h_abs = new_kinetics['family']=='H_Abstraction'
rate_columns = 'AutoTST_fwd', 'Sarathy', 'Rate Rules', 'Rate Rules (New)'
for col in rate_columns:
new_kinetics.loc[h_abs,col] *= 1e6
new_kinetics.loc[h_abs].head(3)
# Radical addition to a multiple bond.
# Only the values in the bimolecular direction need converting.
# For now, leave alone the Sarathy column.
r_addition = new_kinetics['family']=='R_Addition_MultipleBond'
rate_columns = 'AutoTST_fwd', 'Rate Rules', 'Rate Rules (New)'
for col in rate_columns:
new_kinetics.loc[r_addition,col] *= 1e6
new_kinetics.loc[r_addition].head(3)
# These rates Sarathy had in the recombination (bimolecular) direction too
# (the others he had as beta scission, and we leave them alone)
r_addition_sarathy = new_kinetics['AutoTST_rev']=='not reverse'
new_kinetics.loc[r_addition_sarathy,'Sarathy'] *= 1e6
new_kinetics.loc[r_addition_sarathy].head(2)
new_kinetics.head(10)
OK, we're ready for some plots. First set some style parameters.
plt.rc('font', size=16) # controls default text sizes
plt.rc('axes', titlesize=16) # fontsize of the axes title
plt.rc('axes', labelsize=16) # fontsize of the x and y labels
plt.rc('xtick', labelsize=16) # fontsize of the tick labels
plt.rc('ytick', labelsize=16) # fontsize of the tick labels
plt.rc('legend', fontsize=16) # legend fontsize
plt.rc('figure', titlesize=16) # fontsize of the figure title
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15,5))
new_kinetics[new_kinetics['family'] == 'H_Abstraction'].plot.scatter(
'AutoTST_fwd', 'Rate Rules (New)',
c= 'Nodal distance', cmap = 'plasma',
loglog=True,
xlim=(1e5,1e15),
ylim=(1e5,1e15),
s=50, vmin=0, vmax=3.0,
edgecolor='black', linewidth=0.5,
ax=axes[0], colorbar=False,
title = u"Hydrogen abstraction\n$k$(1000K) / cm$^3$mol$^{-1}$s$^{-1}$")
new_kinetics[new_kinetics['family'] == 'R_Addition_MultipleBond'].plot.scatter(
'AutoTST_fwd', 'Rate Rules (New)',
c= 'Nodal distance', cmap = 'plasma',
loglog=True,
xlim=(1e3,1e14),
ylim=(1e3,1e14),
s=50, vmin=0, vmax=3.0,
edgecolor='black', linewidth=0.5,
ax=axes[1], colorbar=False,
title = u"R addition\n$k$(1000K) / cm$^3$mol$^{-1}$s$^{-1}$")
new_kinetics[new_kinetics['family'] == 'intra_H_migration'].plot.scatter(
'AutoTST_fwd', 'Rate Rules (New)',
c= 'Nodal distance', cmap = 'plasma',
loglog=True,
xlim=(1e-2,1e10),
ylim=(1e-2,1e10),
s=50, vmin=0, vmax=3.0,
edgecolor='black', linewidth=0.5,
ax=axes[2], colorbar=False,
title = u"Intramolecular H migration\n$k$(1000K) / s$^{-1}$")
fig.subplots_adjust(wspace=0.05, hspace=0.4, right=0.8)
im = plt.gca().get_children()[0]
cax = fig.add_axes([1.0,0.2,0.02,0.7])
fig.colorbar(im, cax=cax, label='Nodal distance')
for ax in axes:
ax.set(aspect='equal')
ax.set_xlabel(u'AutoTST')
ax.set_ylabel(u'RMG Rate Rules')
startx, endx = ax.get_xlim()
xticks = [10**i for i in np.arange(np.log10(startx), np.log10(endx)+1, 2)]
ax.set_xticks(xticks)
starty, endy = ax.get_ylim()
yticks = [10**i for i in np.arange(np.log10(starty), np.log10(endy)+1, 2)]
ax.set_yticks(yticks)
ax.plot(ax.get_xlim(),ax.get_xlim(), ls='--', color='gray', zorder=0)
plt.tight_layout()
fig.savefig('RateRulesvsAutoTST.pdf', bbox_inches='tight')
For the LLNL plots we need to deal with Radical Addition and Beta Scission reactions carefully. They're in the same Sarathy
column, but mean different things. The indication is whether the AutoTST_rev
column has "not reverse
" or a value.
r_add = new_kinetics[(new_kinetics['family'] == 'R_Addition_MultipleBond') &
(new_kinetics['AutoTST_rev'] == 'not reverse')]
beta_scission = new_kinetics[(new_kinetics['family'] == 'R_Addition_MultipleBond') &
(new_kinetics['AutoTST_rev'] != 'not reverse')]
print len(r_add), "radical addition to double bond"
print len(beta_scission), "beta scission"
# First, just do radical addition direction (ignore the beta scission)
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15,5))
new_kinetics[new_kinetics['family'] == 'H_Abstraction'].plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
xlim=(1e5,1e14), ylim=(1e5,1e14),
s=50, edgecolor='black', linewidth=0.5,
ax=axes[0],
title="Hydrogen abstraction")
r_add.plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
xlim=(1e3,1e14), ylim=(1e3,1e14),
s=50, edgecolor='black', linewidth=0.5,
ax=axes[1],
title = r"R addition to multiple bond")
new_kinetics[new_kinetics['family'] == 'intra_H_migration'].plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
xlim=(1e-2,1e10), ylim=(1e-2,1e10),
s=50, edgecolor='black', linewidth=0.5,
ax=axes[2],
title = "Intramolecular H migration")
for ax, units in zip(axes, ['cm$^3$mol$^{-1}$s$^{-1}$']*2+['s$^{-1}$']):
x_limits = ax.get_xlim()
y_limits = ax.get_ylim()
limits = min(x_limits[0], y_limits[0]), max(x_limits[1], y_limits[1])
ax.plot(limits, limits, ls='--', color='gray', zorder=0)
ax.set_xlim(limits)
ax.set_ylim(limits)
ax.set(aspect='equal')
ax.set_xlabel(u'AutoTST')
ax.set_ylabel(u'LLNL Model')
ax.set_title( ax.title.get_text() + '\n$k$(1000K) / {}'.format(units) )
locator = matplotlib.ticker.LogLocator()
ax.xaxis.set_major_locator( locator )
ax.yaxis.set_major_locator( locator )
plt.tight_layout()
plt.show()
Now try just the beta scission reactions
plt.figure(figsize=(6,6))
ax = beta_scission.plot.scatter(
'AutoTST_rev', 'Sarathy',
loglog=True,
s=50, edgecolor='black', linewidth=0.5,
color='lightcoral',
title = r"$\beta$-scission")
ax.plot(ax.get_xlim(), ax.get_xlim(), ls='--', color='gray')
units = 's$^{-1}$'
ax.set(aspect='equal')
ax.set_xlabel(u'AutoTST')
ax.set_ylabel(u'LLNL Model')
ax.set_title( ax.title.get_text() + '\n$k$(1000K) / {}'.format(units) )
locator = matplotlib.ticker.LogLocator()
ax.xaxis.set_major_locator( locator )
ax.yaxis.set_major_locator( locator )
plt.show()
What's that outlier that LLNL has at 1e-6?
r = new_kinetics.loc[ new_kinetics['Sarathy']<1e-4 ]
k = r['SarathySource']
print k.values[0]
r
First, check the 10 atm parameters do give the 1e-6 number we are using and it's not an evaluation problem.
T = 1000.
A, n, Ea = (2.09e9, -3.550, 21.220)
k = A * T**n * np.exp(-Ea / (0.00198720359*T))
print "k =",k
print "log10(k) = ", np.log10(k)
So that's correct. According to some other chemkin files these parameters appear in, they came from this paper: Lee, J.; Bozzelli, J. W. Thermochemical and Kinetic Analysis of the Formyl Methyl Radical + O2 Reaction System. J. Phys. Chem. A 2003, 107 (19), 3778–3791 DOI: 10.1021/jp030001o.
We can confirm they match the 10 atm results from Table 8, Reaction 5.
However, the actual High Pressure Limit rate that went in to the QRRK calculation is provided in Table 7, Reaction 5. These parameters give:
# Actual High P limit that went in to the QRRK calculation, table 7, reaction 5
A, n, Ea = (1.47e10, 0.65, 23.82)
k = A * T**n * np.exp(-Ea / (0.00198720359*T))
print "k =",k
print "log10(k) = ", np.log10(k)
which is much closer, and would not appear an outlier.
We can update the table and re-plot it like so:
new_kinetics.loc[r.index,'Sarathy'] = k
new_kinetics.loc[r.index]
beta_scission = new_kinetics[(new_kinetics['family'] == 'R_Addition_MultipleBond') &
(new_kinetics['AutoTST_rev'] != 'not reverse')]
plt.figure(figsize=(6,6))
ax = beta_scission.plot.scatter(
'AutoTST_rev', 'Sarathy',
loglog=True,
s=50, edgecolor='black', linewidth=0.5,
color='lightcoral',
title = r"$\beta$-scission")
ax.plot(ax.get_xlim(), ax.get_xlim(), ls='--', color='gray')
ax.plot(new_kinetics.loc[r.index,'AutoTST_rev'], new_kinetics.loc[r.index,'Sarathy'], 'bo')
units = 's$^{-1}$'
ax.set(aspect='equal')
ax.set_xlabel(u'AutoTST')
ax.set_ylabel(u'LLNL Model')
ax.set_title( ax.title.get_text() + '\n$k$(1000K) / {}'.format(units) )
locator = matplotlib.ticker.LogLocator()
ax.xaxis.set_major_locator( locator )
ax.yaxis.set_major_locator( locator )
plt.show()
Now let's experiment with plotting both forward and reverse rates on the same plot. Also, let's check all the other PLOG rates look sensible, by marking them with a red outline.
r_add = new_kinetics[(new_kinetics['family'] == 'R_Addition_MultipleBond') &
(new_kinetics['AutoTST_rev'] == 'not reverse')]
beta_scission = new_kinetics[(new_kinetics['family'] == 'R_Addition_MultipleBond') &
(new_kinetics['AutoTST_rev'] != 'not reverse')]
plt.figure(figsize=(5,5))
ax = r_add.plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
s=50, edgecolor='black', linewidth=0.5,
label = r"R addition $k$(1000K) / cm$^3$mol$^{-1}$s$^{-1}$")
# red line around the PLOG rates
r_add[r_add['SarathySource'].str.contains('PLOG') ].plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
s=50, edgecolor='red', linewidth=0.5,
ax=ax,
)
beta_scission.plot.scatter(
'AutoTST_rev', 'Sarathy',
loglog=True,
s=50, edgecolor='black', linewidth=0.5,
color='yellow',
ax=ax,
label=r"$\beta$-scission $k$(1000K) / s$^{-1}$",
)
# red line around the PLOG rates
beta_scission[beta_scission['SarathySource'].str.contains('PLOG') ].plot.scatter(
'AutoTST_rev', 'Sarathy',
loglog=True,
s=50, edgecolor='red', linewidth=0.5,
color='yellow',
ax=ax, )
x_limits = ax.get_xlim()
ax.plot(x_limits, x_limits, ls='--', color='gray', zorder=0)
ax.set_xlim(x_limits)
ax.set_ylim(x_limits)
ax.set(aspect='equal')
ax.set_xlabel(u'AutoTST')
ax.set_ylabel(u'LLNL Model')
ax.legend(loc=3, bbox_to_anchor=(0., 1.02, 1., .102))
ax
plt.show()
OK, now we're ready to make the final figure for our manuscript.
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15,5))
new_kinetics[new_kinetics['family'] == 'H_Abstraction'].plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
xlim=(1e5,1e15),
ylim=(1e5,1e15),
s=50, edgecolor='black', linewidth=0.5,
ax=axes[0],
title=u"Hydrogen abstraction\n(cm$^3$mol$^{-1}$s$^{-1}$)")
r_add.plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
xlim=(1e3,1e14),
ylim=(1e3,1e14),
s=50, edgecolor='black', linewidth=0.5,
ax=axes[1],
label = r"R addition (cm$^3$mol$^{-1}$s$^{-1}$)")
beta_scission.plot.scatter(
'AutoTST_rev', 'Sarathy',
loglog=True,
xlim=(1e3,1e14),
ylim=(1e3,1e14),
s=50, edgecolor='black', linewidth=0.5,
color='lightcoral',
ax=axes[1],
label=r"$\beta$-scission (s$^{-1}$)",
)
axes[1].legend(loc=3, bbox_to_anchor=(0., 1., 1., .1))
new_kinetics[new_kinetics['family'] == 'intra_H_migration'].plot.scatter(
'AutoTST_fwd', 'Sarathy',
loglog=True,
xlim=(1e-2,1e10), ylim=(1e-2,1e10),
s=50, edgecolor='black', linewidth=0.5,
ax=axes[2],
title = u"Intramolecular H migration\n(s$^{-1}$)")
for ax in axes:
x_limits = ax.get_xlim()
y_limits = ax.get_ylim()
limits = min(x_limits[0], y_limits[0]), max(x_limits[1], y_limits[1])
ax.plot(limits, limits, ls='--', color='gray', zorder=0)
ax.set_xlim(limits)
ax.set_ylim(limits)
ax.set(aspect='equal')
ax.set_xlabel(u'AutoTST')
ax.set_ylabel(u'LLNL Model')
locator = matplotlib.ticker.LogLocator()
ax.xaxis.set_major_locator( locator )
ax.yaxis.set_major_locator( locator )
startx, endx = ax.get_xlim()
xticks = [10**i for i in np.arange(np.log10(startx), np.log10(endx)+1, 2)]
ax.set_xticks(xticks)
starty, endy = ax.get_ylim()
yticks = [10**i for i in np.arange(np.log10(starty), np.log10(endy)+1, 2)]
ax.set_yticks(yticks)
plt.tight_layout()
fig.savefig('SarathyvsAutoTST.pdf', bbox_inches='tight')
This is to see how many times the same rate is used for different reactions within the Sarathy model
print "Radical addition"
r_add.groupby('Sarathy').count().sort_values(by='number', ascending=False)['number'].head()
print "Hydrogen abstraction"
new_kinetics[new_kinetics['family']=='H_Abstraction'].groupby(['Sarathy']).count().sort_values(by='number', ascending=False)['number'].head()
print "Intramolecular H migration"
new_kinetics[new_kinetics['family']=='intra_H_migration'].groupby(['Sarathy']).count().sort_values(by='number', ascending=False)['number'].head()
Now let's examine clustering in the RMG rate rule estimates
print "Radical addition"
new_kinetics[new_kinetics['family']=='R_Addition_MultipleBond'].groupby(['Rate Rules (New)']).count().sort_values(by='number', ascending=False)['number'].head()
print "Hydrogen abstraction"
new_kinetics[new_kinetics['family']=='H_Abstraction'].groupby(['Rate Rules (New)']).count().sort_values(by='number', ascending=False)['number'].head()
print "Intramolecular H migration"
new_kinetics[new_kinetics['family']=='intra_H_migration'].groupby(['Rate Rules (New)']).count().sort_values(by='number', ascending=False)['number'].head()
Let's see how common each nodal distance is:
groups = new_kinetics.groupby('Nodal distance').count()['number']
groups
Finally, let's see how much the rate rule estimates have changed between the earlier version of RMG and the newer
#sns.set_style('white')
fg = sns.FacetGrid(data=new_kinetics, hue='family', size=8, aspect=1.0, )
ax = fg.map(plt.scatter, 'Rate Rules', 'Rate Rules (New)')
ax.set(xscale="log", yscale="log")
plt.legend()
new_kinetics['Rate Rule Change'] = np.log(new_kinetics['Rate Rules (New)']/new_kinetics['Rate Rules'])
new_kinetics.hist('Rate Rule Change', bins=np.arange(-10.5, 11.5))