Supporting Information for "A Fragment Library of Natural Products and its Comparative Chemoinformatic Characterization"
COCONUT_Fragments.sdf, ChEMBL_Fragments.csv and REAL_Fragments.csv contain the fragment structures generated from the respective compound data sets. Each file provides the following information for every fragment: identification number (ID); source collection (Data Set); structure as a simplified molecular input line entry system (SMILES) string (Fragment); whether the fragment belongs to a single data set (Unique) or to all three data sets (Overlapped); number of compounds in the data set that contain the fragment (Counts) and the corresponding fraction of compounds (Proportion); fraction of sp3 carbons (FractionCSP3); fraction of chiral carbons (FractionCC); number of heavy atoms (NumHeavyAtoms); number of oxygen atoms (NumO); number of nitrogen atoms (NumN); number of bridgehead atoms (NumBridgeHead); number of spiro atoms (NumSpiro); number of rings (NumRings); number of aromatic rings (NumArRings); number of aliphatic rings (NumAlRings); number of heterocycles (NumHet); number of aromatic heterocycles (NumArHet); and number of aliphatic heterocycles (NumAlHet).
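As an illustration of how the CSV files might be read, the following minimal Python sketch parses fragment records and selects ring-containing fragments above a chosen Proportion threshold. It assumes that the column headers match the labels given in parentheses above (only a subset is used here); the actual files may name or order the columns differently. The sample rows, the helper names load_fragments and ring_fragments, and the threshold are hypothetical and serve only to make the example self-contained.

```python
import csv
import io

# Hypothetical sample rows. The header names follow the labels listed in the
# text; the real files may use different names, order, or delimiters.
SAMPLE = """ID,Data Set,Fragment,Counts,Proportion,NumHeavyAtoms,NumRings
1,ChEMBL,c1ccccc1,120,0.40,6,1
2,ChEMBL,C1CCOC1,30,0.10,5,1
3,ChEMBL,CCO,50,0.17,3,0
"""


def load_fragments(handle):
    """Read fragment rows from a CSV handle and convert numeric fields."""
    rows = []
    for row in csv.DictReader(handle):
        row["Counts"] = int(row["Counts"])
        row["Proportion"] = float(row["Proportion"])
        row["NumHeavyAtoms"] = int(row["NumHeavyAtoms"])
        row["NumRings"] = int(row["NumRings"])
        rows.append(row)
    return rows


def ring_fragments(rows, min_proportion=0.0):
    """Keep fragments with at least one ring and Proportion >= threshold."""
    return [
        r for r in rows
        if r["NumRings"] > 0 and r["Proportion"] >= min_proportion
    ]


fragments = load_fragments(io.StringIO(SAMPLE))
frequent_rings = ring_fragments(fragments, min_proportion=0.3)
print([r["Fragment"] for r in frequent_rings])
```

To apply the same functions to one of the CSV files, the in-memory sample could be replaced by an opened file, for example `open("ChEMBL_Fragments.csv", newline="")`, provided the headers match the assumption stated above. The SDF file would need a dedicated structure-file reader instead.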