PAH and IUPAC Nomenclature

The nomenclature of polycyclic aromatic hydrocarbons (PAH) and their derivatives has undergone substantial changes since the beginning of the 20th century. The International Union of Pure and Applied Chemistry (IUPAC) has issued rules and recommendations on chemical nomenclature, including that of organic compounds such as PAH, since 1957. This article presents an overview of the latest version of IUPAC nomenclature for PAH and their derivatives, detailing current changes. In addition, it gives an overview of older nomenclature systems and of commonly used, PAH-specific terms that aid nomenclature.


INTRODUCTION
The nomenclature of chemical substances serves as a common denominator between researchers, providing a basis for communication and facilitating the exchange of knowledge. As the field of PAH has developed, PAH nomenclature has changed significantly. The International Union of Pure and Applied Chemistry (IUPAC) nomenclature (1-2) is the most commonly recommended nomenclature, although it may or may not be considered the best choice for PAH. Especially for PAH with more than seven rings, the IUPAC nomenclature can become quite burdensome: the handy semi-trivial name circobiphenyl changes to the rather lengthy IUPAC name naphtho[2',1',8',7':4,10,5]anthra[1,9,8-abcd]coronene 4. Due to the long history of PAH research and the convenience of non-IUPAC nomenclature, investigators are well advised to be familiar with the old trivial names and older nomenclature systems as well as with IUPAC nomenclature. Apart from these classical word-based nomenclatures, there have been purely mathematical approaches that reduce structures to a mere arrangement of geometrical shapes (3-9). These geometric descriptions often lack chemical practicality, but allow for better mathematical characterization and computer processing. The following review presents the application of the IUPAC nomenclature to PAH and substituted polycyclic aromatic compounds (PAC), some basic information about older systems, and several PAH-specific terms that are commonly found in the literature.

IUPAC NOMENCLATURE
IUPAC has addressed chemical nomenclature since 1957 (2). Naturally, a set of rules exists for the nomenclature of PAH. The current, full set of IUPAC rules can be accessed online (1,10-13). Condensed, more practical versions can be found in the literature (14-16). In 2013, IUPAC published the newest edition of its "Blue Book", the "IUPAC Nomenclature of Organic Compounds", redefining some parameters relevant for PAH nomenclature (1). The new term "preferred IUPAC name" (PIN) and the acceptance of equivalent unambiguous names are a significant step away from the rather strict recommendation and application of previous IUPAC rules.
The IUPAC nomenclature uses trivial and semi-trivial name systems for naming a set of parent compounds. The naming strategy for a PAH follows several steps. The first step is to find within the molecule the polycyclic substructure that corresponds to the highest-priority (usually largest) parent compound. Figure S1 and Table T1 in the supplemental material give the set of parent compounds in order of increasing priority. The next step is to determine the number and type of ring systems fused to this parent structure. Preference is given to the simplest priority substructures by choosing the maximum number of first-order attachments and the maximum number of identical attached components. The name is formed by arranging the prefixes of the fused ring systems in alphabetical order before the parent compound's name. If multiple identical ring substituents are added, "di", "tri", "tetra", etc. is placed before the substituent name to indicate the multiplicity. The prefixes of the fused ring systems are listed in Table S1. The previous practice of omitting a vowel at the end of a prefix if the parent compound also starts with a vowel has been rendered obsolete (17) in IUPAC nomenclature (previously: benz[a]ovalene; now: benzo[a]ovalene).
To describe the correct orientation and connectivity of the ring substituent to the parent structure, the locants of the connecting atoms of the ring substituent and the italicized letters of the bonds of the parent PAH are placed, separated by a hyphen, in brackets after the substituent's prefix. For the numbering of the fused ring systems, the labels of the parent compound are relevant and can be taken directly from Figure S1. The labels for peripheral bonds of the parent compound can also be obtained from Figure S1. Starting with the bond between the peripheral atoms with locants 1 and 2, letters of the Roman alphabet are assigned in order to all peripheral bonds, following the locant numbering. The bond labels of the fused bonds of the parent component are chosen such that the lowest possible letters (in alphabetical order) result. The locant numbering of the ring substituent is also chosen to yield the lowest numbers possible. Locant numbers designating the orientation of fused benzo, cyclopropa, and cyclopenta (fused with two bonds) ring systems are not required due to symmetry. For a ring substituent fused to another ring substituent (secondary substituent), the locants of each of the ring systems, separated by a colon, are used to detail the connectivity. Locants of secondary substituents and locants of identical substituents are marked with apostrophes.
Figure 1 (panels a to d) illustrates these rules with examples up to naphtho[2',1',8',7':4,10,5]anthra[1,9,8-abcd]coronene 4. Figure 1a shows the breakdown of tetrabenzo[a,cd,f,lm]perylene 1 into its parent compound perylene and four fused benzene rings. Locants for benzene are omitted and only the bond labels of the parent compound are necessary to complete the name. The example in Figure 1c requires the use of a colon and apostrophes to distinguish between the locants of each naphtho substituent. The large size of naphtho[2',1',8',7':4,10,5]anthra[1,9,8-abcd]coronene 4 in Figure 1d mandates the use of naphthalene as a secondary substituent, which is bonded to the primary anthracene substituent. Because the connectivity between the two substituents is designated with a colon and apostrophes but with locant numbers only, this case is clearly distinguished from the previous one with identical primary substituents. The naming outline presented here comprises only the most common principles. IUPAC nomenclature features an abundance of further detailed rules, attempting to address any eventuality.
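The assembly step described above (alphabetical prefixes, multiplying prefixes for identical attachments, bracketed descriptors, no vowel elision) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `fusion_name` and the `MULTIPLIERS` table are hypothetical, the locant strings must be supplied by the user rather than derived from a structure, and the choice of a comma for letter-only descriptors versus a colon for descriptors with atom locants is inferred from the example names quoted in this article.

```python
# Hypothetical helper for illustration; it only formats a name from
# descriptors supplied by the caller and does not analyse a structure.
MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def fusion_name(parent, attachments):
    """Assemble a fusion name from a parent name and (prefix, locants) pairs.

    attachments: list of (prefix, locant_string), e.g. ("benzo", "cd").
    """
    grouped = {}
    for prefix, locants in attachments:
        grouped.setdefault(prefix, []).append(locants)

    parts = []
    # Prefixes are cited alphabetically; the multiplying prefix is ignored
    # for the ordering because sorting is done on the bare prefix.
    for prefix in sorted(grouped):
        locant_sets = grouped[prefix]
        # Assumption: letter-only descriptors (e.g. benzo) are joined with
        # commas, descriptors containing atom locants with colons.
        letters_only = all(loc.replace(",", "").isalpha() for loc in locant_sets)
        separator = "," if letters_only else ":"
        multiplier = MULTIPLIERS[len(locant_sets)]
        parts.append(f"{multiplier}{prefix}[{separator.join(locant_sets)}]")
    # The prefix keeps its final vowel even before a vowel (benzo[a]ovalene).
    return "".join(parts) + parent
```

Called with perylene and the four benzo descriptors a, cd, f, and lm, the sketch returns tetrabenzo[a,cd,f,lm]perylene; italics for the bond letters are a typesetting matter and are not represented.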
The description of PAH derivatives generated via environmental or metabolic degradation is also of importance, especially as the carcinogenicity of PAH is often due to the oxidized metabolic transformation products (18-20). Substituted PAH are fairly simple to describe if the substituent does not change the PAH structure. Alkyl, hydroxy, halo, and nitro substitution do not change the polycyclic structure. The naming here follows standard IUPAC substitution rules: the locant of the peripheral carbon atom bearing the substituent and the appropriate prefix for the functional group are placed in front of the name, e.g., 1-hydroxynaphthalene, 1-nitropyrene, etc.
In order to obtain the proper locant numbers for any fused PAH, the structure must be oriented so that its longest ortho-condensed chain lies along the horizontal axis while as many of the remaining rings as possible lie to the right of the vertical center line and above the horizontal line, i.e., in the upper right quadrant. For this alignment, cyclopenta rings are not drawn as symmetric pentagons but as cut-off hexagons, as shown in Figure S1. The proper orientation for all parent compounds (except helicenes) can be seen in Figure S1 in the supplemental material. The properly oriented PAH is assigned locants starting at the uppermost ring of the parent compound. If there is more than one uppermost ring, the numbering starts with the one farthest to the right. Numbering starts with the most counterclockwise non-fused carbon atom of this ring and continues in the clockwise direction. Fused atoms are labeled with the locant of the preceding peripheral atom followed by a letter in alphabetical order. Due to the historic development of the nomenclature, exceptions to these locant assignments are found for phenanthrene and anthracene.
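Once the orientation and the starting atom have been chosen by hand, the numbering of the periphery and the lettering of the peripheral bonds are mechanical. The following Python sketch illustrates only that mechanical part; the function names are hypothetical, the input is assumed to be the clockwise sequence of peripheral atoms beginning at the atom that receives locant 1, and the historical exceptions (phenanthrene, anthracene) are not handled.

```python
import string


def peripheral_locants(is_fused):
    """Number peripheral atoms listed clockwise from the start atom.

    is_fused: list of booleans, True where the atom is a fusion atom.
    The first atom must be non-fused because it receives locant 1.
    """
    if not is_fused or is_fused[0]:
        raise ValueError("numbering must start at a non-fused atom")
    locants, number, letter_index = [], 0, 0
    for fused in is_fused:
        if fused:
            # Fusion atoms repeat the preceding number and add a, b, c, ...
            locants.append(f"{number}{string.ascii_lowercase[letter_index]}")
            letter_index += 1
        else:
            number += 1
            letter_index = 0
            locants.append(str(number))
    return locants


def peripheral_bond_letters(locants):
    """Letter the peripheral bonds a, b, c, ... starting with the 1-2 bond."""
    count = len(locants)
    if count > 26:
        raise ValueError("this sketch covers at most 26 peripheral bonds")
    return {
        string.ascii_lowercase[i]: (locants[i], locants[(i + 1) % count])
        for i in range(count)
    }
```

For naphthalene, the flag sequence with fusion atoms in the fifth and tenth positions gives 1, 2, 3, 4, 4a, 5, 6, 7, 8, 8a, and the bond lettering then runs from a (between atoms 1 and 2) to j (between atoms 8a and 1).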
With the latest edition of the IUPAC Blue Book in 2013 (1), the labeling of internal carbon atoms changed substantially. Instead of the previously used locant numbers with letters, the new system uses the locant of the closest fused carbon atom and, in superscript, the number of bonds necessary to reach the position. Preference is given first to the minimum number of bonds and then to the lowest possible locant. Rubicene S37, coronene S39, pyranthrene S45, and ovalene S55 in Figure S1 in the supplemental material demonstrate this new labeling system.
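The interior labeling rule lends itself to a short graph search. The Python sketch below is a simplified reading of the rule as stated above (fewest bonds first, then lowest fused locant) and makes several assumptions: the function names are hypothetical, the caret stands in for the superscript, pyrene is used as the test structure with placeholder names "x" and "y" for its two interior atoms, and possible collisions in larger systems are not resolved.

```python
import re
from collections import deque


def locant_key(locant):
    """Sort key so that '3a' < '5a' < '10a' (number first, then letter)."""
    number, letters = re.fullmatch(r"(\d+)([a-z]*)", locant).groups()
    return (int(number), letters)


def interior_labels(bonds, fused_peripheral, interior):
    """Label each interior atom as '<fused locant>^<number of bonds>'."""
    labels = {}
    for atom in interior:
        # Breadth-first search yields the bond count to every other atom.
        distance, queue = {atom: 0}, deque([atom])
        while queue:
            current = queue.popleft()
            for neighbour in bonds[current]:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
        # Fewest bonds first, then the lowest fused locant.
        best = min(fused_peripheral,
                   key=lambda loc: (distance[loc], locant_key(loc)))
        labels[atom] = f"{best}^{distance[best]}"
    return labels


def pyrene_bonds():
    """Carbon skeleton of pyrene; 'x' and 'y' are the two interior atoms."""
    ring = ["1", "2", "3", "3a", "4", "5", "5a",
            "6", "7", "8", "8a", "9", "10", "10a"]
    pairs = [(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]
    pairs += [("x", "3a"), ("x", "10a"), ("x", "y"), ("y", "5a"), ("y", "8a")]
    bonds = {}
    for a, b in pairs:
        bonds.setdefault(a, set()).add(b)
        bonds.setdefault(b, set()).add(a)
    return bonds
```

Applied to pyrene with the fused peripheral atoms 3a, 5a, 8a, and 10a, the sketch assigns the two interior atoms to 3a and 5a with one bond each.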
Substituents which change the electronic configuration of the PAH require a slightly different approach, depending on the type of substituent. Although not necessarily substituted PAH, partially saturated PAH are described in a similar manner. Positions of saturation are indicated by placing the locant and an italicized "H" in front of the PAH name, separated from it by a hyphen. This indicated hydrogen can be omitted for maximally unsaturated compounds in which the saturation naturally occurs in a unique position. Therefore, fluorene unequivocally describes 9H-fluorene 19 and can be used synonymously. IUPAC allows this simplification also for indene S7 (1H-indene) and phenalene 5 (1H-phenalene). Compounds which deviate further from full unsaturation (e.g., due to hydrogenation) are described by using the prefix hydro. Therefore, there is a clear distinction between the fully unsaturated 1H-phenalene 5 (or just phenalene) and the saturated 2,3-dihydro-1H-phenalene 6 shown in Figure 2. If the saturation occurs due to the addition of another functionality, e.g., an oxo group, the nomenclature using indicated hydrogens is preferred to the hydro prefix. 1H,3H-phenalen-2-one 7, derived from 2,3-dihydro-1H-phenalene 6, is shown as an example in Figure 2.
A common environmental and metabolic transformation is the oxidation to ketones and quinones. If the PAH structure is already unsaturated and a methylene group is bound to an oxo group, the compounds are named as cyclic ketones. Figure 3 shows some examples of PAC with oxo groups. 1H-phenalen-1-one 8, 9H-fluoren-9-one 9, and 7H-benzo[de]anthracen-7-one 10 are probably the most common oxo PACs. For fluorenone 9 and phenalenone 8, the numbering and the H can be omitted, as the trivial name accepted by IUPAC specifies the position of the non-aromatic ring. Older names, such as perinaphthenone for 8 and benzanthrone or naphthanthrone for 10, should not be used as they do not allow for a proper systematic description.
PACs with an even number of oxo groups are commonly described as quinones to highlight their specific chemistry. The preferred IUPAC nomenclature describes quinones as ketones by denoting the multiplicity of the oxo group. If a functional group with higher priority than the ketone group is present (e.g., a carboxylic acid), the oxygen of the keto group is described as an oxo substituent. Acenaphthylene-1,2-dione 11, naphthalene-1,4-dione 12, anthracene-9,10-dione 13, and phenanthrene-9,10-dione 14 are among the most common oxygenated PAC (OPAC), and their molecular structures are shown as examples in Figure 3. Benzo[pqr]tetraphene-1,6-dione 15 is an example of a fused PAH dione. Older unambiguous descriptions such as quinones can be used, but are not part of the preferred IUPAC nomenclature.

NON-IUPAC NOMENCLATURE
IUPAC bases the names of its parent compounds partially on older systematic trivial names. The "polyacenes", "polyaphenes", "polyphenylenes", and "polynaphthylenes" are names of parent structures originating from this principle. There are several other semi-trivial name systems. Zethrenes have been named for their central Z-shaped, formally isolated double bonds (21). "Terrylenes" consist of naphthalene units peri-fused with one another. "Anthrenes" are found in the IUPAC nomenclature in pyranthrene S45 and phenanthrene S17; however, they do not follow a systematic rule, as phenanthrene, benzanthrene, anthanthrene, violanthrene, and isoviolanthrene do not have much in common.
One set of systematic trivial names that is not IUPAC preferred, but is quite useful, is the "circo" or "circum" system. This system applies to particularly large PAH, where a parent molecule in the center is surrounded by a closed chain of ortho-fused benzene rings. Naphtho[2',1',8',7':4,10,5]anthra[1,9,8-abcd]coronene 4 in Figure 1 would therefore be called circobiphenyl (circumbiphenyl), and coronene S39 in Figure S1 in the supplemental material could be named circobenzene (circumbenzene).
The current IUPAC nomenclature has developed from older nomenclature systems. A substitution nomenclature based solely on locants was initially suggested by Bally and Scholl (22) in 1911 and was in use for more than fifty years. This system laid the foundation for the IUPAC system by using a trivial name for the parent PAH (the largest PAH fragment within the structure) and describing benzene and polycyclic ring systems fused to this parent compound, similar to the fusing of ring systems onto other ring systems in the IUPAC system. Unfortunately, this system did not require a preferred orientation of the PAH and therefore allowed rather arbitrary numbering. Scholl's system failed to provide unambiguous names when ring substituents other than benzene are fused to the parent PAH (23,24). Patterson refined the system by addressing the orientation of the parent PAH and the numbering of the peripheral atoms, allowing for unambiguous name assignment (25). The orientation rules of Patterson are incorporated in the current IUPAC system and are basically identical. Even though this system provided superior naming capability compared to the basic approach of Bally and Scholl, the chemical community did not quickly accept it, and as a result multiple names can now be found for single compounds. PAHs that fall victim to this Babylonian confusion include pyrene 23 and its benzologues and derivatives. Figure 4 shows the different locant numbering for pyrene.
Benzo[pqr]tetraphene 29, still better known as benzo[a]pyrene (and still named as such in the CAS) was designated as 3.4-benzopyrene in the oldest nomenclature, whereas the refinement of Patterson changed it to 1.2-benzopyrene (25). These naming multiplicities are clearly an issue and have to be taken into consideration especially for literature searches and reviews. Naturally, the multiple name issue extends to substituted pyrenes.
The 16 EPA priority PAHs are not spared the issue of changing nomenclature. Table 1 and Figure 5 present the PAH as they were defined in the Clean Water Act (26) as well as their current IUPAC preferred names and other names in use. Four out of the 16 [...] IUPAC nomenclature has been largely ignored, as witnessed by tetraphene, which was introduced as a parent structure 20 years ago (10) but is barely found as such in the literature. With the recent shift of IUPAC to preferred names instead of "only IUPAC" names, it is a relief to realize that the uniqueness of a name such as benzo[a]pyrene is sufficient to conform to IUPAC standards. Thankfully, the Clean Water Act specified the CAS numbers for each PAC, avoiding legal issues with changing nomenclature. In general, it would seem beneficial to provide the CAS numbers for the major PACs in any publication. The CAS number allows an unambiguous assignment and facilitates searches. Unfortunately, this approach is rarely followed. Therefore, researchers dealing with PACs have to be aware of the potential surprises that the ever-changing landscape of PAC nomenclature holds.
Figure 5: 16 PAH defined in the Clean Water Act as priority pollutants (names given in Table 1).
Computer-based tools for nomenclature are also available. Computer programs such as ChemDraw (PerkinElmer, Waltham MA, USA) or ChemSketch (ACD/Labs, Toronto, CAN), which can derive names from drawn structures or structures from given names, are not yet capable of encompassing the intricate complications of PAH nomenclature and are usually limited to compounds with fewer than three rings. However, if they act as a graphical input interface for databases, they can be of invaluable help for determining structures and names. SciFinder (Chemical Abstracts Service, American Chemical Society, USA) provides a convenient access interface to link user-drawn graphical structures with the CAS database, one of the most comprehensive chemical databases (29). Reaxys (Elsevier, Amsterdam, NL) is equally capable of handling the search of complex formulae in the Beilstein database (30).
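Since the CAS number is recommended here as the unambiguous identifier, a small Python sketch may show how a reported number can be sanity-checked and used to tie name variants together. Two things in it are not taken from this article and should be read as outside assumptions: the check-digit rule of CAS Registry Numbers (digits weighted 1, 2, 3, ... from right to left, sum modulo 10) and the number 50-32-8 for benzo[a]pyrene. The function names and the `SYNONYMS` table are hypothetical.

```python
import re


def is_valid_cas(cas_number):
    """Return True if the string has CAS format and a correct check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_number)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from right to left and sum them.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


# Name variants quoted in the text for one compound, keyed by CAS number
# (the CAS number itself is an assumption added for this illustration).
SYNONYMS = {
    "50-32-8": [
        "benzo[pqr]tetraphene",
        "benzo[a]pyrene",
        "3.4-benzopyrene",
        "1.2-benzopyrene",
    ],
}


def cas_for_name(name):
    """Look up the CAS number for any listed synonym (case-insensitive)."""
    wanted = name.strip().lower()
    for cas, names in SYNONYMS.items():
        if wanted in (entry.lower() for entry in names):
            return cas
    return None
```

A lookup of either "benzo[a]pyrene" or "3.4-benzopyrene" then leads to the same key, which is the practical benefit of quoting the CAS number alongside whichever name is used.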

PAC SPECIFIC TERMS
As PAH nomenclature has its roots in trivial names it is not surprising to find that quite often additional vocabulary is necessary to understand PAH research and literature. Several of these terms have co-developed with the nomenclature and were either derived from the nomenclature or have substantially affected the same. In the following, several common terms to describe PAH are presented.
Ortho-fusion designates the fusion of benzene rings that share only a single bond. The series naphthalene 16, anthracene 21, and tetracene S26 comprises ortho-fused PAH that are fused in a linear fashion at the 1,2 and 4,5 positions of the benzene rings. Higher benzologues of the linearly ortho-fused PAH are usually designated as the polyacene series, i.e., they continue with a numeric prefix as hexacene S35, heptacene S42, and so on (31). The series phenanthrene 20, tetraphene 25, pentaphene S30, etc., also includes only ortho-fused rings, but with one (central) ring having the ortho-fusion at the 1,2 and 3,4 positions of the benzene ring, forming the polyphenes (31). In older literature, ortho-fusion is called cata-fusion.
Peri-fusion describes an attachment through at least two neighboring bonds. The simplest case is phenalene (perinaphthene) 5, where a six-membered ring is fused to naphthalene at the 1, 8, and 8a positions through two bonds. The term "peri" is encountered in several trivial names such as peropyrene (dibenzo[cd,lm]perylene) or periflanthene (diindeno[1,2,3-cd:1',2',3'-lm]perylene). Interestingly, the terms "ortho" (cata) and "peri" did not find their way into the newer field of graphene research, where the rather picturesque names "zig-zag" and "arm-chair" are used instead of peri and cata to describe the graphene edges.
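The distinction between ortho- and peri-fusion can be illustrated with a short Python sketch that works on a list of rings. It rests on one interpretation that should be read as an assumption: "two neighboring bonds" is taken to mean that a ring shares two bonds with other rings and these two bonds have an atom in common. The function names are hypothetical, and each ring must be given as an ordered list of atom labels.

```python
def ring_bonds(ring):
    """Return the bonds of a ring given as an ordered list of atom labels."""
    size = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % size])) for i in range(size)}


def fusion_type(rings):
    """Classify a fused ring system as 'ortho' or 'peri'."""
    all_bonds = [ring_bonds(ring) for ring in rings]
    for i, bonds in enumerate(all_bonds):
        # Bonds of ring i that also belong to at least one other ring.
        shared = [
            bond for bond in bonds
            if any(bond in other for j, other in enumerate(all_bonds) if j != i)
        ]
        # Two shared bonds with a common atom count as neighbouring bonds.
        for k, first in enumerate(shared):
            for second in shared[k + 1:]:
                if first & second:
                    return "peri"
    return "ortho"


NAPHTHALENE = [
    ["1", "2", "3", "4", "4a", "8a"],
    ["4a", "5", "6", "7", "8", "8a"],
]
PHENALENE = [
    ["1", "2", "3", "3a", "9b", "9a"],
    ["3a", "4", "5", "6", "6a", "9b"],
    ["6a", "7", "8", "9", "9a", "9b"],
]
```

With these inputs, naphthalene is classified as ortho-fused, whereas phenalene is classified as peri-fused because each of its rings shares two bonds that meet at the central atom 9b.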
Benzenoid PAHs are, as the name implies, PAHs built solely from benzene rings. As such, any PAH containing a 5-membered ring, such as fluoranthene, is not benzenoid.
The term "fully benzenoid PAH" would seem like a redundant term but it designates benzenoid PAH which when drawn in their Kekulé structure only possess rings with either three "double" bonds or no double bonds such as triphenylene S22 in Figure S1 in the supplemental material. These PAH exhibit higher stability than their "only" benzenoid isomers (23). When drawing the structural formula with a "phenyl-ring" instead of the three double bonds, this special electronic configuration becomes easily visible. An alternative, more descriptive term for these systems might be the "total resonant sextet benzenoids" as it was coined by Dias (32).
Another classification of PAH is the distinction between alternant and non-alternant PAH. Alternant PAHs allow the placement of a mark (usually a star) on every other atom, i.e., always in the meta-position. Figure 6 demonstrates this classification for the alternant pyrene 23 and the non-alternant fluoranthene 22. Symmetry demands that benzenoid PAHs are alternant, whereas cyclopenta-fused PAHs are non-alternant, i.e., two marks will end up in the ortho-position. This description correlates with electronic properties of the molecule and finds application in fluorescence spectroscopy. Alternant PAHs were found to have their fluorescence quenched by nitromethane, whereas non-alternant PAHs do not experience significant quenching (33,34).
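The starring procedure corresponds to a two-coloring of the carbon skeleton: a PAH is alternant exactly when every bond connects a starred to an unstarred atom. A minimal Python sketch of this test follows; the function names are hypothetical, and the indene carbon skeleton is used only as the smallest cyclopenta-fused example, not as a fully conjugated PAH.

```python
from collections import deque


def bonds_from_pairs(pairs):
    """Build a neighbour dictionary from a list of bonded atom pairs."""
    bonds = {}
    for a, b in pairs:
        bonds.setdefault(a, set()).add(b)
        bonds.setdefault(b, set()).add(a)
    return bonds


def is_alternant(bonds):
    """Return True if stars can be placed so that no bond joins two
    starred or two unstarred atoms (two-coloring of the skeleton)."""
    starred = {}
    for start in bonds:
        if start in starred:
            continue
        starred[start] = True
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbour in bonds[atom]:
                if neighbour not in starred:
                    starred[neighbour] = not starred[atom]
                    queue.append(neighbour)
                elif starred[neighbour] == starred[atom]:
                    return False  # two equal marks end up next to each other
    return True


NAPHTHALENE = bonds_from_pairs([
    ("1", "2"), ("2", "3"), ("3", "4"), ("4", "4a"), ("4a", "5"), ("5", "6"),
    ("6", "7"), ("7", "8"), ("8", "8a"), ("8a", "1"), ("4a", "8a"),
])
INDENE_SKELETON = bonds_from_pairs([
    ("1", "2"), ("2", "3"), ("3", "3a"), ("3a", "4"), ("4", "5"), ("5", "6"),
    ("6", "7"), ("7", "7a"), ("7a", "1"), ("3a", "7a"),
])
```

The naphthalene skeleton passes the test, while the five-membered ring in the indene skeleton forces two equal marks onto neighboring atoms, so the test fails as expected for cyclopenta-fused systems.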
PAHs are also often described topographically. Planar PAHs are flat and do not exhibit steric strain. Non-planar PAHs are common and result from "overcrowding" in certain regions of the molecule, which causes steric strain and bends the molecule out of plane. The regions that can cause overcrowding (also depending on the substituents in these regions) are aptly called "bay", "cove", and "fjord" regions (14,35). Figure 7 shows these three regions. Bay regions normally do not exert relevant strain on the polycyclic structure, unless a substituent other than hydrogen is found within the bay region. Cove and fjord regions cause substantial strain on the molecule even with hydrogen as the only substituent and yield non-planar molecules. Increasing the number of rings around a fjord cavity to six results in tremendous strain, causing the molecule to take on a helix-like configuration. This configuration gives rise to the name "helicenes"; these compounds even exhibit chiral character (right-turning and left-turning spirals). Unfortunately, the designation of these regions is not consistent throughout the literature; in particular, literature discussing the carcinogenicity of PAC regards the cove region in Figure 7 as a fjord region. Hence, the classification used should be made clear in each publication to ensure transparency.
K-region and L-region are descriptors often used when describing carcinogenic properties of PAC. These terms were initially introduced by Pullman et al. (36,37) to describe the regions which bear the reactivity of carcinogenic compounds. Both regions can be found by their bond delocalization, their carbon delocalization, and para localization energies (37). Commonly, the K-region describes a phenanthrene-like exposed bond such as the f-bond in tetraphene 25, whose substitution changes the reactivity of the PAH and ultimately its carcinogenicity. The L-region denotes the 7 and 12 positions in tetraphene 25, whose substitution also affects the carcinogenic potential. Pullman also suggested an additional M-region, which denotes the positions reactive in metabolic perhydroxylation (37). This term has been used to a much lesser degree than K- and L-region.

DISCUSSION
PAH and PAC nomenclature is as multifaceted and colorful as the substances themselves, and multiple systems and approaches can be used. IUPAC nomenclature has the benefit of a systematic approach, but sometimes generates quite complicated names. In addition, it requires patience, as its adoption in science takes time. It is truly remarkable that IUPAC steered away from its strict application of rules to the aforementioned preferred nomenclature terminology, allowing unambiguous older names to coexist. This newly found flexibility is quite appreciated, as certain names such as benzo[a]pyrene have grown historically and would pose a serious challenge to change. On the other hand, it might also be the downfall of the IUPAC nomenclature, as suggested changes without mandatory requirements will take even longer to be accepted. Therefore, it might be advisable practice to at least list the current PIN and the CAS number once in a publication even if non-preferred nomenclature is used.
IUPAC nomenclature suffers from rather elaborate name constructs for large, peri-condensed PAH. Structures which would aid the nomenclature of those large PAH are lacking, as the IUPAC parent structures are particularly rich in ortho-condensed systems but include few peri-condensed ones. It might be a worthwhile consideration for future rule changes to include more peri-condensed systems such as anthanthrene (PIN: naphtho[7,8,1,2,3-nopqr]tetraphene) or bisanthene (PIN: phenanthro[1,10,9,8-fghij]perylene), or to incorporate the circo-based semi-trivial names to enhance the capability of IUPAC nomenclature to address larger PAH in a simple manner.
At the same time, any (re-)introduction of trivial names needs to be considered carefully. The case of "olympicene", which has garnered media attention since 2011, can serve as a case against new trivial names. The attempted synthesis of 2H-benzo[cd]pyrene and the atomic photograph of its tautomer 6H-benzo[cd]pyrene triggered substantial media attention (for a chemical science achievement), as its structural formula resembles the Olympic rings. The authors of the original work stated the correct IUPAC-derived names and established the name "olympicene" in a blog to promote chemistry to society (38-40). Unfortunately, "olympicene" can now be found in scientific publications (41-43), contributing another trivial name to the already challenging nomenclature of PAH. As burdensome as IUPAC nomenclature might seem for some PAHs, it is certainly welcome as a systematic approach that lowers nomenclature (language) barriers for current and future scientists instead of erecting a shroud of alchemic mystery and sensation.
In the future, digital media will continue to play an important role and change the way knowledge is accessed. SciFinder was a revelation in the 1990s, and internet search engines such as Google and the collaborative community encyclopedia Wikipedia (and others) have shaped the turn of the century; it will be fascinating to see what comes next. Due to the continuing digitalization and the fruitful work of many scientists, proper data management will become essential. Mathematical structural descriptions, while not without disadvantages, might have an edge in the future and find more application.
Keeping in mind that these mathematical structures are even more abstract than structural formulas or word-based names, it seems to be a correct decision of IUPAC to allow established, unambiguous names to co-exist with the PIN. Especially for PAH names which have established themselves beyond the scientific community in society such as benzo[a]pyrene, it will ease the generational move to the newer IUPAC name (benzo[pqr]tetraphene) while providing continuity in knowledge.

CONCLUSION
IUPAC nomenclature is well adapted for naming PAHs, but newer recommendations and guidelines have been only moderately successful in competing with established, equally well-defined names. By officially accepting the co-existence of other names, IUPAC seems to have come to terms with the multiplicity of names. The other major change, addressing the numbering of internal carbon atoms, will not substantially affect the majority of the PAH research community. However, the acceptance of name multiplicity will continue to challenge the PAH research community as IUPAC nomenclature continues to develop while older nomenclature is still in use and accepted. The adoption of IUPAC recommendations will therefore depend largely on voluntary acceptance by the individual researcher and on the textbooks and manuscripts of current and future generations.