Accuracy of professional judgments for dermal exposure assessment using deterministic models

Abstract
Exposure judgments have been evaluated and shown to have relatively low accuracy, particularly for scenarios where only qualitative information is available or a systematic approach is not used. This is particularly true for dermal exposures, where less information is generally available compared to inhalation exposures. Relatively few quantitative validation efforts have been performed for scenarios where dermal exposures are of interest. In this study, a series of dermal exposure judgments were collected from 90 volunteer U.S. occupational health practitioners in a workshop format to assess the accuracy of their judgments for three specific scenarios. Accuracy was defined as the ability of the participants to identify the correct reference exposure category, as defined by the quantitative exposure banding categories utilized by the American Industrial Hygiene Association (AIHA®). The participants received progressively more information and training regarding dermal exposure assessments and scenario-specific information during the workshop, and the relative accuracy of their category judgments over time was compared. The results of the study indicated that despite substantial education and training in exposure assessment generally, the practitioners had very little experience in performing dermal exposure assessments and a low level of comfort in performing these assessments. Further, contrary to studies of practitioners performing inhalation exposure assessments, which demonstrated a trend toward underestimating exposures, participants in this study consistently overestimated the potential for dermal exposure when they lacked quantitative data specific to the scenario of interest. Finally, it was found that participants were able to identify the reference or “true” category of dermal exposure acceptability when provided with relevant, scenario-specific dermal and/or surface-loading data for use in the assessment process.
These results support the need for additional training and education of practitioners in performing dermal exposure assessments. A closer analysis of default loading values used in dermal exposure assessments to evaluate their accuracy relative to real-world or measured dermal loading values, along with consistent improvements in current dermal models, is also needed.


Introduction
Assessing the accuracy and consistency of exposure assessment approaches for human health risk assessment is critically important for ensuring outcomes are correct. Recent efforts have highlighted some of the important determinants of judgment accuracy and have evaluated or identified methods of improving this accuracy (Logan et al. 2009, 2011; Vadali et al. 2012b; Banerjee et al. 2014; Arnold et al. 2016). As noted by Arnold et al. (2016), a majority of exposure assessments are performed without the benefit of a quantitative exposure dataset specific to the scenario of interest. Rather, semi-quantitative or qualitative approaches are often employed to reach conclusions about the acceptability of exposures. These authors also noted that qualitative judgment accuracy is often low, and there is a propensity for practitioners to underestimate actual exposure potential for inhalation exposure pathways when using qualitative or semi-quantitative unstructured approaches without additional information or training. It was further noted that tools or frameworks, such as checklists, to identify the most important factors or rules when performing a qualitative exposure assessment have shown an ability to reduce biases and improve judgment accuracy (by a factor of 2). Additionally, newer practitioners were able to reach the same level of accuracy as experienced practitioners when using an exposure assessment tool or tools with the appropriate critical inputs (Arnold et al. 2016). The quality of exposure judgments can have several consequences. Many epidemiology studies rely heavily on judgments of exposure made using a variety of techniques ranging from qualitative to quantitative in nature, although many are purely qualitative in their approach. Inaccurate exposure judgments, leading to exposure misclassification, can have a significant influence on the results of these studies (Sahmel et al. 2010; Sakhvidi et al. 2015).
Further, there can be important risk consequences for workers, who may be either under-protected, leading to an increased risk of adverse health outcomes, or over-protected, resulting in misinformed conclusions about worker risk and potentially unnecessary controls and related costs. As a result, understanding the weaknesses in the approaches employed and identifying methods to improve the outcomes of these assessments are both extremely important.
For dermal exposure assessments specifically, several tools or approaches have been proposed to help practitioners make objective decisions about dermal exposure and risk, including the assessment of dermal uptake potential based on physicochemical properties (Warren et al. 2003; Magnusson et al. 2004; Sahmel et al. 2009; Tibaldi et al. 2014; Goede et al. 2019; McNally et al. 2019). Examples of semi-quantitative and quantitative dermal exposure assessment models in current use include the RISKOFDERM model (Warren et al. 2006); the DeRmal Exposure Assessment Methodology or DREAM (van Wendel de Joode et al. 2003, 2005b, 2005c); the AIHA conceptual dermal exposure assessment model (Sahmel and Boeniger 2015); the IH SkinPerm model (Tibaldi et al. 2014); and the ECETOC Targeted Risk Assessment (TRA) Model (ECETOC 2004, 2009a, 2009b). Additionally, the Advanced REACh Tool (ART) has a dermal component that will be known as Dermal ART or dART and which is currently in the validation stage (Gorman Ng et al. 2012; Goede et al. 2019; McNally et al. 2019). Blanco et al. (2008) developed a qualitative dermal exposure assessment checklist method called the Determinants of Dermal Exposure Ranking Method (DERM) for performing expert ratings of pesticide exposure potential in developing countries. The researchers reported that the checklist allowed subsistence farmers to correctly identify the determinants of dermal exposure based on the key transport mechanisms found in the Schneider et al. (1999) dermal conceptual model. Blanco et al. (2008) also noted that the specificity of the checklist and rating tool to the particular pesticide application scenarios being evaluated likely improved the performance of the tool (compared to the measured reliability of other more broad or general dermal models such as RISKOFDERM and DREAM).
It has been acknowledged that gaps in accuracy and consistency remain for both qualitative and quantitative approaches to dermal exposure assessment (Marquart et al. 2001; Warren et al. 2003; van Wendel de Joode et al. 2005b; Marquart et al. 2017). The authors of a recent external evaluation of the models used in support of the REACH legislation reported that the small number of dermal exposure estimates compared to inhalation exposure estimates ultimately prevented the researchers from including the dermal exposures in the validation process (Tischer et al. 2017). It has further been noted that the results of comparative analyses of dermal exposures using different tools, such as Tier 1 screening models under REACH, cannot always be directly compared since the outputs of the tools are often reported in different units. Research has indicated that the ECETOC TRA model is likely to overestimate or substantially overestimate dermal exposures, particularly at exposures that are considered low, and may underestimate dermal exposures at high levels of exposure (Marquart et al. 2017). Further, comparative results for dermal exposures using the RISKOFDERM toolkit were higher relative to other dermal models. The need for robust validation of the new dermal component of the Advanced REACh Tool (dermal ART or dART) before completion has also been noted. Calibration analyses of this new dART model have noted that "inputs to the model are based upon user judgement, [and] in practical use, the reliability of predictions will be dependent upon both the competence of users and the quality of contextual information available on an exposure scenario" (McNally et al. 2019, 650). Given the other recent publications noting concerns about the level of accuracy of user judgments, additional analysis of the quality of these dermal exposure judgments may be warranted (Arnold et al. 2016).

Dermal exposure judgment categories
In order to make decisions about the acceptability or accuracy of dermal exposures, criteria must be established to make these determinations. Both qualitative and quantitative criteria have been established (for example, by the AIHA dermal exposure assessment strategy) to rate whether the determinants of dermal exposure assessment are unexpected or unlikely, small/improbable, midsize or possible, or large/probable. Some common factors considered in dermal exposure assessments include the surface area of contact in square centimeters (cm²); the quantity of a mixture or contaminant deposited on the skin in milligrams per square centimeter per time (hours) or per event (mg/cm²-h), or sometimes in the concentration of mg/cm³; the contact frequency in number of events per day or the percentage of the work day for which the event occurs; and the amount of the applied dose absorbed through the skin in mg/cm²-h, or the fraction absorbed using a unitless factor if quantitative absorption data or estimates are not available (Sahmel et al. 2009). These factors can be used to calculate or estimate a semi-quantitative or quantitative dose in mg/day.
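As a rough illustration of how the factors listed above might combine into a dose in mg/day, the following minimal sketch multiplies contact area, per-event loading, contact frequency, and a unitless absorbed fraction. The simple product form, the function name, and the example input values are assumptions for illustration only; they are not taken from the study or from any specific dermal model.

```python
def dermal_dose_mg_per_day(area_cm2, loading_mg_per_cm2_event,
                           events_per_day, fraction_absorbed):
    """Estimate an absorbed dermal dose in mg/day.

    Assumes (for illustration) that dose is the simple product of
    contact area, loading per event, events per day, and a unitless
    fraction absorbed through the skin.
    """
    if not 0.0 <= fraction_absorbed <= 1.0:
        raise ValueError("fraction_absorbed must be between 0 and 1")
    return (area_cm2 * loading_mg_per_cm2_event
            * events_per_day * fraction_absorbed)


# Hypothetical inputs: 840 cm2 of skin, 0.5 mg/cm2 per event,
# 4 contact events per day, 10% of the applied dose absorbed.
example_dose = dermal_dose_mg_per_day(840.0, 0.5, 4, 0.1)
print(f"Estimated absorbed dose: {example_dose:.1f} mg/day")
```

When a measured absorption rate (mg/cm²-h) is available instead of a unitless fraction, the last factor would be replaced by a rate multiplied by exposure time; that variant is not shown here.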
Default values are available for some of these inputs from such sources as the U.S. EPA's Exposure Factors Handbook and other publications and resources (Kromhout and Vermeulen 2001; McDougal and Boeniger 2002; Vermeulen et al. 2002; Cherrie et al. 2006; Stefaniak and Harvey 2006; Hubal et al. 2008; Sahmel et al. 2009; Fransman et al. 2011; USEPA 2011, 2012; Williams et al. 2011; Frasch et al. 2014; Stefaniak et al. 2014; Goede et al. 2018). Similar types of dermal exposure determinants have been used and described in other dermal exposure assessment models (van Wendel de Joode et al. 2003; Warren et al. 2003; Tibaldi et al. 2014; Goede et al. 2019). Some of the limitations and concerns inherent in these determinants have been previously discussed and should be taken into account when performing dermal exposure modeling (Sahmel et al. 2009; Frasch et al. 2014; Sahmel and Boeniger 2015).
Since dermal exposure limits and guidelines (other than qualitative Skin Notations) are often unavailable, an alternative method must be used to establish exposure acceptability criteria (NIOSH 2009). In certain instances, an airborne occupational exposure limit (OEL) may be an appropriate basis for developing an acceptability criterion for dermal exposures. Such an approach is appropriate in scenarios where the contaminant or compound of concern has the potential to cause systemic effects, such as carcinogens or compounds with a known target organ (Sahmel et al. 2009). Note that such an approach would not be appropriate for scenarios in which the compound or agent of concern is associated with a localized response such as irritation, corrosion, or contact allergy. It is also important to review and understand the basis for the OEL assignment, and to ensure that the analysis was based on systemic toxicity by more than just the inhalation route; if not carefully reviewed, factors such as percent absorbed through inhalation may make an inhalation OEL unsuitable for adaptation to the dermal route (Sahmel et al. 2009). However, if determined to be appropriate, relationship (1) can be used to derive a dermal acceptability criterion from the inhalation OEL. The appropriate inhalation rate can be estimated using the data in the U.S. EPA's Exposure Factors Handbook, and can range from approximately 4 to 32 m³ of air per 8-hr workday, depending on the exertion level of the activities being performed, or an average of 15 to 17 m³ per 24-hr day (USEPA 2011).
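The conversion of an airborne OEL into a systemic dose criterion can be sketched as follows. This is only one plausible reading of the approach described above: it assumes the criterion is the OEL multiplied by the volume of air inhaled per shift and by an inhalation absorption fraction. The exact form of relationship (1), the function name, the 10 m³ default, and the example OEL are all assumptions for illustration.

```python
def systemic_dose_equivalent_mg_per_day(oel_mg_per_m3,
                                        inhaled_volume_m3_per_shift=10.0,
                                        inhalation_absorption_fraction=1.0):
    """Convert an airborne OEL into an equivalent systemic dose (mg/day).

    Assumed form: OEL x inhaled air volume x fraction absorbed via
    inhalation. The default of 10 m3 per 8-hr shift is an illustrative
    value inside the 4 to 32 m3 range quoted in the text.
    """
    if not 0.0 < inhalation_absorption_fraction <= 1.0:
        raise ValueError("absorption fraction must be in (0, 1]")
    return (oel_mg_per_m3 * inhaled_volume_m3_per_shift
            * inhalation_absorption_fraction)


# Hypothetical OEL of 2 mg/m3 and the assumed default inhaled volume.
criterion = systemic_dose_equivalent_mg_per_day(2.0)
print(f"Dermal acceptability criterion: {criterion:.1f} mg/day")
```

An estimated dermal dose in mg/day could then be compared against this criterion, provided the OEL is based on systemic rather than local effects, as cautioned above.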
Exposure acceptability judgments (whether for dermal or inhalation exposure assessment) can be ranked according to four categories. When performing quantitative analyses of exposure, these category judgments are determined using an upper tail statistic decision based on the available quantitative exposure data or model output for a specific similar exposure group (SEG) using the 90th, 95th, or 99th percentile of the decision metric, which in this case is the OEL or dermal/systemic OEL equivalent. The 95th percentile is the most common decision statistic.
These categories include an identified acceptability rating of 1 or 2 (typically considered an acceptable exposure relative to the OEL when variability is not large), 3 (uncertain acceptability or moderate to high exposure relative to the OEL), and 4 (unacceptable exposure relative to the OEL) (Table 1). Although this approach is commonly used by exposure assessment practitioners to evaluate the acceptability of inhalation exposures in a variety of scenarios, it is less commonly used to evaluate the acceptability of dermal exposures.
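The banding logic described above can be sketched in a few lines. The sketch assumes lognormally distributed exposure data and band cut points at 10%, 50%, and 100% of the OEL; these are the commonly cited AIHA values, but they are an assumption relative to Table 1. The function names and the sample data are hypothetical.

```python
import math
import statistics

Z_95 = 1.645  # standard normal quantile for the 95th percentile


def lognormal_p95(measurements):
    """Estimate the 95th percentile assuming lognormal data.

    Uses exp(mean(ln x) + 1.645 * sd(ln x)); needs at least two values.
    """
    logs = [math.log(x) for x in measurements]
    return math.exp(statistics.mean(logs) + Z_95 * statistics.stdev(logs))


def exposure_category(p95, oel):
    """Map a 95th percentile to an exposure category from 1 to 4.

    Cut points (assumed): <=10% OEL -> 1, <=50% -> 2, <=100% -> 3,
    above the OEL -> 4.
    """
    ratio = p95 / oel
    if ratio <= 0.10:
        return 1
    if ratio <= 0.50:
        return 2
    if ratio <= 1.00:
        return 3
    return 4


# Hypothetical dose estimates (mg/day) compared with a 20 mg/day criterion.
sample = [0.4, 0.9, 0.6, 1.5, 0.7]
p95 = lognormal_p95(sample)
print(f"95th percentile: {p95:.2f} mg/day, "
      f"category {exposure_category(p95, 20.0)}")
```

The same mapping applies whether the decision metric is an inhalation OEL or a dermal/systemic OEL equivalent, as long as the percentile and the limit share units.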
There are concerns that when evaluating dermal exposure potential specifically, exposure practitioners may more frequently perform these assessments or make decisions about exposure acceptability without the benefit of any substantial background knowledge or training about dermal exposure assessment, and also without the benefit of any exposure assessment tools or frameworks. For the purposes of the current study, the accuracy of dermal exposure assessments or judgments was defined as the ability of the participants to identify the correct or reference exposure category, as defined by the quantitative exposure banding categories described above used by the AIHA. The purpose of this analysis was to evaluate the accuracy of the judgments made by exposure assessment practitioners when evaluating the acceptability of dermal exposures using this framework, in order to understand what information was most influential to their decision-making process and what information or tools were required for them to reach an accurate determination about dermal exposure potential.

Materials and methods
Two workshops were held with participants from the American Industrial Hygiene Conference and Exposition in 2013 and 2014. Institutional Review Board (IRB) approval was obtained from the University of Minnesota before the study (#1212M25182). Statistical power calculations demonstrated that a minimum of 16 volunteers was likely to be needed for sufficient statistical power for a geometric standard deviation of 4 or less, with a 95% decision statistic and an error rate of 2% or less. Of the 112 total workshop participants, 90 volunteers participated in the dermal exposure assessment workshop exercises. Before the start of each workshop, participants were asked to provide their level of formal education (i.e., associate degree or no degree, undergraduate degree, Master's degree, or doctoral degrees such as PhD or ScD), as well as their level of professional certification [i.e., certified industrial hygienist (CIH) or certified safety professional (CSP)]. Respondents were also asked to respond to five questions (shown in Figure 1) regarding their experience and training related to dermal exposure assessments specifically, including: (1) prior formal education in dermal exposure assessment; (2) prior professional training in dermal exposure assessment; (3) experience in performing qualitative dermal exposure assessments; (4) experience in performing quantitative dermal exposure assessments; and (5) their level of comfort or certainty in making dermal exposure judgments. The questions were designed to elicit specific differences in training by education level or certification in dermal exposure assessment capabilities and experience.
The participants were randomly assigned to evaluate one of three previously identified dermal-specific occupational exposure scenarios: (1) shell core molding in an iron foundry using a phenol-containing resin (evaluation of dermal phenol exposure potential); (2) form molding of component parts using advanced composites for the aerospace industry (evaluation of dermal methylene dianiline or MDA exposure potential); and (3) maintenance pitching (placement of partition components into pipes) at a petrochemical plant (evaluation of dermal benzene exposure potential). For each scenario, an appropriate reference category (based on Table 1) was identified before the workshop using quantitative exposure information, and the bases for these determinations are shown in Table 2. As noted in Table 2, a reference category of 1-4 was determined for each scenario using a combination of biological monitoring data and scenario-specific dermal loading measurements, and the determined reference values are reported in Table 3 (Piotrowski 1971; Boeniger et al. 1984; Boeniger and Klingner 2002; Klingner and Boeniger 2002; van Wendel de Joode et al. 2005a; Arnold et al. 2017). The reference values were not shared with the participants until the conclusion of the workshop. A total of 28, 28, and 34 participants were assigned to the first, second, and third scenarios, respectively, exceeding the minimum number of participants identified in the power calculations for each scenario.
The participants were asked to make a total of five exposure acceptability determinations or judgments for the same scenario, and each exposure judgment was made with progressively more information, training, data, and assistive tools to aid in the judgment and acceptability process (following the schematic in Table 2). The total workshop duration was approximately 8 hr. At each point in the workshop where a dermal judgment acceptability determination was required, the participants were asked to assign their judgment to one of the four categories of exposure acceptability (Table 1). To track the judgment process, the volunteers formally logged judgments at periodic points during the workshop using polling software (Poll Everywhere, San Francisco, CA). As noted above, the accuracy of dermal exposure assessments or judgments made by the participants was defined as the ability of the participants to identify the correct or reference exposure category.
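The accuracy definition above lends itself to a simple tally. The following minimal sketch summarizes one round of category judgments against a reference category, reporting the modal judgment, the fraction matching the reference, and the fractions above and below it. The function name and the example judgments are hypothetical and are not study data.

```python
from collections import Counter


def summarize_judgments(judgments, reference):
    """Summarize category judgments (1-4) against a reference category."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    n = len(judgments)
    counts = Counter(judgments)
    return {
        # Most frequently chosen category (first seen wins on ties).
        "mode": counts.most_common(1)[0][0],
        # Fraction of participants who chose the reference category.
        "accuracy": counts[reference] / n,
        "overestimated": sum(1 for j in judgments if j > reference) / n,
        "underestimated": sum(1 for j in judgments if j < reference) / n,
    }


# Hypothetical poll results for a scenario with reference category 1.
round_one = [3, 4, 3, 1, 2, 3, 4, 3]
print(summarize_judgments(round_one, reference=1))
```

Running the same summary on each of the five judgment rounds would show how accuracy and the direction of bias shift as more information is provided.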
For the initial judgment, participants were provided with basic scenario information and physicochemical property information about the substances of interest and were asked to make their best determination of acceptability regarding the dermal exposure potential. The Online Supplemental Information on Scenarios Evaluated by Participants shows the actual information provided to the participants for consideration in each scenario. Although not provided to the participants until the end of the workshop, the Supplemental Information also includes the reference category for each scenario, calculated based on biological monitoring and scenario-specific dermal sampling data for each scenario as previously described. For the second judgment, the participants were provided with training on qualitative dermal exposure assessment determinants and given an interactive tool for making qualitative dermal exposure assessments (Dermal Exposure Assessment Model (DRAM), American Industrial Hygiene Association (AIHA), Fairfax, VA, USA), and then asked to determine the acceptability of the exposure by refining their previous judgment. For the third judgment, the participants were provided with quantitative dermal exposure assessment training that built on the concepts in the qualitative training, along with an exposure model (the AIHA conceptual dermal exposure assessment model; Sahmel and Boeniger 2015) and default values for dermal loading available in the published literature (USEPA 2011). For the fourth judgment, the participants were provided with a set of scenario-specific quantitative dermal and surface sampling measurement data and IH STAT, a quantitative industrial hygiene data statistics tool (IH STAT, AIHA, Fairfax, VA, USA), to allow for quick trials of various quantitative model inputs. Finally, for the fifth judgment, participants were provided with an interactive tool, the IH SkinPerm model (Tibaldi et al. 2014), which allowed the user to identify and calculate parameters for dermal exposure loading and uptake through the skin (IH SkinPerm, AIHA, Fairfax, VA, USA) (Tibaldi et al. 2017).
To address exposure time, the participants in the study were informed that the activity described in each scenario was the specific job of the individual performing the work. The participants were then asked to make judgments based on the scenario about the dermal exposure time using the described nature of the work. Areas of skin contact were not provided, as this was part of the assessment task for each participant. The participants were asked to determine an appropriate environmental temperature and consider it for each of the scenarios based on their professional experience. Given the potential for uncertainty in the effectiveness of personal protective equipment (PPE) such as gloves, participants were asked to evaluate the potential for dermal exposure without the use of PPE, even if it was depicted in available scenario photos. After each round of individual judgments had been collected, concluding the data collection for that judgment, the aggregated results were shared in real time with the other workshop participants. This process helped to maintain interest and engagement in the workshop and also allowed for discussion among the participants and instructors that contributed to the training and education of the practitioners on the topics addressed.
Table 2 (fragment, scenario 1 basis): ... (Arnold et al. 2017; Jahn et al. 2015), compared against the ACGIH® TLV® for phenol using Table 1. Determination also evaluated for consistency against published biological monitoring data for phenol (Piotrowski 1971).

Results
A total of 70.4% of volunteer respondents reported having obtained a Master's degree, 25.4% reported having obtained an undergraduate degree, and 4.2% reported having either no degree, an associate's degree, or a doctoral degree (i.e., PhD or ScD). A total of 71.8% reported being certified as a CIH and 31% reported being certified as a CSP (Figure 2).
For the five dermal-specific exposure assessment screening questions, 71% of volunteer respondents reported having received no formal education in performing dermal exposure assessments, while 25.4% reported receiving some formal qualitative education on dermal exposure assessments and 3.6% reported receiving some formal quantitative education.
Table 3 (footnote): Reference category as determined using a combination of biological monitoring data and scenario-specific dermal loading data; see Table 2.
Regarding prior professional training in dermal exposure assessment, 76% reported receiving no prior professional training and 3.7% had attended a course specific to dermal exposure assessment, while the remainder had received some training as part of another course or training effort. With respect to prior experience in performing qualitative dermal exposure assessments, 40.6% reported that they had never previously performed a qualitative dermal assessment, while 52.2% reported using professional judgment to perform a dermal exposure assessment, and 7.2% had previously used a qualitative dermal assessment tool or system. When asked about past experience performing quantitative dermal exposure assessments, 81.7% of participants reported never having done a quantitative dermal exposure assessment; 7% had performed some type of dermal sampling; and another 11.3% reported performing other types of quantitative dermal exposure assessments. A majority (52.1%) reported that they were "not at all comfortable or certain" in performing dermal exposure assessments, while 36.6% were "slightly" comfortable regarding dermal assessments and 9.9% were "somewhat" comfortable or certain in performing dermal exposure assessments. One participant (1.4%) reported that they were "very" comfortable with the dermal exposure assessment process (Figure 3). Given the low level of formal education or professional training in dermal exposure assessments reported by most participants, as well as the low level of experience that most participants reported in performing dermal exposure assessments, further stratifications of these results by such factors as professional certification or level of education were not performed. 
The mode (or majority judgment determination) for the participants across the three scenarios evaluated indicated that for both the initial and qualitative dermal judgments (Judgments #1 and #2, but particularly for Judgment #2), participants consistently judged the dermal exposure for their assigned scenario to be at a moderate to high-risk level (category 3 or 4), regardless of the information provided (Table 3 and Figure 4). For two of the three scenarios (scenario 1 and scenario 3), practitioners overestimated the reference category by two to three categories. The remaining scenario (scenario 2) had a reference category of 4, so for this scenario the reference category was either accurately estimated or underestimated, but it was unclear whether this reflected an accurate estimate of exposure or was the result of the same qualitative judgment approach used generally by the practitioners for all scenarios. It was therefore observed that practitioners provided a category judgment that overestimated the "true" or reference dermal exposure category for 2 of the 3 scenarios evaluated (Table 3). In contrast, Arnold et al. (2016) reported that 50.8% of practitioners underestimated the reference exposure category in baseline inhalation exposure assessments.
When the participants were provided with a semiquantitative approach and input tool to assess dermal exposure for their assigned scenario, along with the available default parameters for dermal loading and dermal absorption (Judgment #3), the mode of the acceptability judgments for Scenarios #1 and #2 rose to a category 4, or unacceptable exposure, and were split across categories 2, 3, and 4 for Scenario #3, although a majority of the judgments for this scenario were also category 4. Therefore, using the available default parameters in Judgment #3 for determining dermal loading, all three scenarios (#1, #2, and #3) were judged to be "unacceptable" for dermal exposure or approaching "unacceptable" (Figure 4).
Following this semi-quantitative modeling judgment using default inputs, the participants were provided with real-world sampling data for skin and surface loading for their respective scenarios (Judgment #4). With the addition of real-world sampling data to the assessment process, a majority of participant judgments for Scenarios #1 and #3 dropped by three categories, from a category 4 to a category 1. For Scenario #2, the split rating between category 3 and 4 increased to a category 4 (see Figure 4).
The addition of a dermal loading and absorption tool (IH SkinPerm) for use in the final judgment (Judgment #5) did not change the majority of the participant judgments, which remained at category 1 for scenarios 1 and 3, and category 4 for scenario 2. Further, the addition of the training and quantitative tool associated with Judgment #5 (IH SkinPerm) was not observed to decrease the accuracy of the participant judgments (Figure 4). For Scenarios 1 and 3, this result was likely observed because even if 100% of the amount loaded onto the skin had been predicted by IH SkinPerm to be systemically absorbed, it would not have changed the category 1 determination obtained through either IH STAT or IH SkinPerm. For Scenario 2, however, which had a reference exposure category of 4, the analysis from IH SkinPerm did appear to influence the participants, and depending on the input parameters used by the individual practitioner, some of the participants reduced their judgment from a category 4 to a lower category. These changes did not alter the category determination made by the majority of participants assigned to evaluate scenario 2.
Further, it should also be noted that the default loading values in the IH SkinPerm model are consistent with other default values provided to the participants for use in the quantitative modeling calculations (1 mg/cm 2 /h, and a maximum of 7 mg/cm 2 ). It is therefore expected that the use of IH SkinPerm alone to assess dermal exposure potential, without the addition of scenario-specific (or real-world) sampling data, may have shown some degree of overestimation, consistent with the semi-quantitative dermal exposure assessment modeling results using available default parameters. It was for this reason that the IH SkinPerm tool was purposefully introduced after providing the participants with real-world sampling data. Given that the participants in this study were trained on the use of IH SkinPerm after they received real-world sampling data, it is also possible that some number of participants chose to voluntarily change the loading value in IH SkinPerm from the default value to one that was consistent with the measured scenario-specific loading data previously received, which would have had the effect of making the results of judgments 4 and 5 more consistent with each other across all scenarios.
Additionally, IH SkinPerm was used to train participants regarding several interesting nuances associated with dermal exposure assessment for the scenarios evaluated. For example, for scenario 1, IH SkinPerm showed differences in the relative quantitative dermal uptake prediction for phenol vapor, which was higher than the dermal uptake prediction for phenol liquid, and also demonstrated how, although still a category 1 exposure, the dermal exposure of phenol was predicted to be higher than the inhalation exposure. For scenario 2, IH SkinPerm showed participants the pattern of MDA uptake through the skin and predicted that there was little evidence for any vapor or inhalation exposure, in addition to supporting the selection of reference category 4. And for scenario 3, IH SkinPerm predicted that minimal vapor absorption of benzene would occur due to its extremely high vapor pressure, supporting a category 1 exposure judgment.
Overall, the model-predicted output (concerning the reference category, as determined by quantitative measurement data before the workshop, Table 3) improved most significantly when dermal and surface loading measurement data were made available to the participants for use in the assessments, and changed by up to three categories for two of the three scenarios. The results indicated that with scenario- and agent-specific quantitative dermal loading data, a majority of participants were able to identify the correct exposure category for all three dermal exposure scenarios, despite minimal prior education or training on either qualitative or quantitative dermal exposure assessments for a majority of participants.

Discussion
This study examined the ability of exposure assessment practitioners to make accurate judgments about the model-predicted exposure category for dermal exposure and uptake using available deterministic models for dermal exposure assessment. Several important findings were noted in this study. First, the participants reported very little experience with dermal exposure assessments, despite often having years of professional experience, as evidenced by certifications and relevant degrees. Second, contrary to the experience of judgment accuracy investigations for inhalation exposure assessments, the participants in this study tended to overestimate the potential for dermal exposure when they had limited information to perform the assessment. Third, the availability of appropriate scenario-specific data, and specifically skin surface loading data, enabled a majority of participants to reach the correct judgment regarding the accurate dermal exposure category, despite being incorrect by as many as three categories in their initial judgments. Further, although the size of the cohort evaluated in this initial study was small, it identified some potentially important limitations in the current methodologies used for dermal exposure assessment that should be addressed before a larger study is undertaken to determine the accuracy of the current deterministic models used for dermal exposure assessment.
Regarding the first of these findings, initial training and experience in dermal exposure assessments were minimal for a majority of the participants. Similar results have been noted in previous studies; in an evaluation of Tier 1 REACH assessment tools, Lamb et al. (2017) reported that more participants in the evaluation process noted "major uncertainty" in choosing the dermal activity or task compared to the inhalation activity/task. Concerning the level of practitioner experience and its effects on judgment accuracy, Vadali et al. (2012a) noted that practitioners with experience or expertise often use that knowledge to identify patterns in problems or questions to arrive at a judgment regarding appropriateness or accuracy. The results of this analysis suggested that such expertise or training in dermal exposure assessments, both perceived and real, may be lacking among exposure assessment practitioners in the United States. It has previously been reported that total years of experience in performing exposure assessments, as well as the possession of relevant professional certifications, were significantly associated with judgment accuracy (Vadali et al. 2012a; Sakhvidi et al. 2015). In this study, a near complete lack of experience with dermal exposure assessments was reported even though a significant majority of the participants reported substantial levels of educational attainment and professional certification in safety or industrial hygiene. As a result, the progressive addition of information and tools to the participants in this study for performing dermal exposure assessments likely contributed significantly to each participant's knowledge and information. Consistent with this finding, de Cock et al. (1996) noted that when asked to evaluate a scenario related to pesticide exposures, a group of exposure assessment practitioners adjusted their estimates of the relative contributions to exposure from different routes as they received additional information. Initially, the practitioners determined that the majority of exposure would be associated with the inhalation route, but later adjusted their judgment to indicate that dermal exposure was likely to be more significant. This adjustment may have been due to biases related to past education on exposure assessment that focused on inhalation exposure potential (de Cock et al. 1996).
In the second significant finding of this study, the participants systematically overestimated dermal exposure potential when presented with limited or nonspecific information to assess each dermal exposure scenario. This effect is different from that observed for inhalation exposure assessments, where practitioners have tended to underestimate exposures, particularly for initial judgments (Arnold et al. 2016). Interestingly, the lack of background or expertise in dermal exposure assessments may have reduced potential biases related to prior assessments or experience that can result in inconsistent or inaccurate judgments by experienced participants for inhalation exposure scenarios. A similar effect was observed by Arnold et al. (2016) for novice exposure assessment practitioners compared to experienced practitioners.
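The distinction between accuracy and the direction of error can be made concrete with a short scoring sketch. The function below is illustrative only and is not the analysis code used in this study; it assumes that exposure categories are coded as ordered integers, and the function name and all judgment values are hypothetical.

```python
def score_judgments(judged, reference):
    """Summarize a group's category judgments against one reference category.

    Assumption: categories are ordered integers (e.g., 0 to 4), so the sign
    of (judged - reference) gives the direction of the error.
    """
    if not judged:
        raise ValueError("no judgments to score")
    diffs = [j - reference for j in judged]
    n = len(diffs)
    return {
        # Fraction of participants who picked the reference category.
        "accuracy": sum(d == 0 for d in diffs) / n,
        # Fractions judging above or below the reference category.
        "overestimated": sum(d > 0 for d in diffs) / n,
        "underestimated": sum(d < 0 for d in diffs) / n,
        # Largest miss, in number of categories.
        "max_abs_error": max(abs(d) for d in diffs),
    }


# Hypothetical judgments for one scenario whose reference category is 1.
reference_category = 1
initial = [3, 4, 2, 4, 3, 1, 4, 3]    # limited, nonspecific information
with_data = [1, 1, 2, 1, 1, 1, 0, 1]  # after scenario-specific loading data

initial_summary = score_judgments(initial, reference_category)
with_data_summary = score_judgments(with_data, reference_category)
print(initial_summary)
print(with_data_summary)
```

Reporting the overestimated and underestimated fractions separately, rather than accuracy alone, is what allows a systematic bias in one direction to be distinguished from random disagreement around the reference category.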
Third, an understanding of the critical inputs needed, and the process with which those critical inputs must be used, were observed in this study to be the most important factors in accurately judging dermal exposure scenarios rather than level of training or experience. Unlike previous studies of exposure judgment accuracy, the initial judgments in this study were routinely assigned by participants at a moderate to high exposure category regardless of the reference exposure category for the scenario. The progressive addition of tools and information for the dermal exposure assessment scenarios evaluated in this workshop essentially encouraged the participants to follow an appropriate methodological algorithm or succession for assessing dermal exposures rather than disproportionately applying "professional judgment," with its inherent biases, based on previous assessments or experience. When scenario-specific quantitative data were provided to the participants, a majority of the participants were able to identify the correct reference category for all three scenarios evaluated. These results indicate that existing default data to evaluate dermal exposures do not necessarily allow for an accurate analysis of dermal exposure potential and appeared to overestimate or substantially overestimate total dermal exposure. The determination and use of default or baseline parameters for dermal exposure assessment were therefore observed to be a substantial barrier to accuracy in this study. Marquart et al. (2006) also noted that there can be questions around defining default parameters, such as for dermal loading or transfer. In the RISKOFDERM tool, researchers defined parameters consistent with "typical exposure" and "reasonable worst case," and noted that "reasonable worst case" is likely used more often than "typical exposure." It was also acknowledged that a large amount of "expert judgment" is involved in determining these values and indicated that substantial variability in estimated dermal exposure levels can be expected.
For this particular analysis, information on several quantitative default parameters was initially provided to the participants, including skin surface area of contact, dermal concentration or loading, and potential for dermal uptake. Accurate semi-quantitative or quantitative information on dermal loading or transfer was observed to be the critical input required by a substantial number of practitioners to reach an accurate judgment regarding dermal exposures in several specific scenarios. These results are consistent with Xue et al. (2006), who found that the two most important parameters in determining children's exposure were the surface residue-to-skin transfer efficiency and surface residue levels. This finding is also consistent with de Cock et al. (1996), who reported that dislodgeable foliar residue was found to be the most significant determinant of pesticide exposure. It has also been reported that, in actual practice, the amount of a substance that is removed from the skin correlates with the amount absorbed (Brouwer et al. 1998, 2000). In an evaluation of dermal exposure assessment for occupational epidemiology research, Vermeulen et al. (2002) similarly reported that the intensity of a contaminant on the skin surface was either very or extremely important for the estimation of dermal exposure, but rated the level of knowledge regarding this parameter as either poor or limited.
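The sensitivity of a category judgment to the loading input can be illustrated with a generic deterministic calculation that combines the three parameter types named above. The sketch below is not the specific model or parameter set used in the workshop. The multiplicative form, the function names, the category cut points (1%, 10%, 50%, and 100% of a limit, mirroring commonly cited AIHA exposure rating bands), and every numerical value are assumptions chosen for illustration.

```python
from bisect import bisect_left

# Assumed cut points, expressed as fractions of a dermal reference limit.
ASSUMED_CUTOFFS = (0.01, 0.10, 0.50, 1.00)


def absorbed_dose_mg(loading_mg_per_cm2, area_cm2, events, absorbed_fraction):
    """Generic deterministic estimate: loading x area x events x uptake."""
    return loading_mg_per_cm2 * area_cm2 * events * absorbed_fraction


def exposure_category(dose_mg, limit_mg, cutoffs=ASSUMED_CUTOFFS):
    """Map a dose-to-limit ratio onto an ordered category (0 to 4).

    A ratio exactly on a cut point falls in the lower category.
    """
    if limit_mg <= 0:
        raise ValueError("limit must be positive")
    return bisect_left(cutoffs, dose_mg / limit_mg)


# Hypothetical scenario: same contact area, contact events, uptake fraction,
# and limit; only the skin loading input differs between the two runs.
common = dict(area_cm2=840.0, events=1, absorbed_fraction=0.1)
limit_mg = 50.0

default_category = exposure_category(
    absorbed_dose_mg(loading_mg_per_cm2=1.0, **common), limit_mg
)
measured_category = exposure_category(
    absorbed_dose_mg(loading_mg_per_cm2=0.01, **common), limit_mg
)
print(default_category, measured_category)
```

Because the estimate is a simple product, a hundredfold difference between a default loading value and a measured one carries through directly to the dose, which in this hypothetical case moves the result across three categories.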
Additionally, some potentially significant limitations were identified in the underlying deterministic dermal exposure assessment methods used for this analysis and available in the exposure assessment literature. This is particularly true concerning the default dermal loading values that have been suggested for use in these dermal models, and the related assumptions about loading in repeated or routine contact scenarios. In evaluating the variability of task-based estimates of dermal exposure for different industries, Kromhout et al. (2004) found that differences in dermal exposure can range over three to five orders of magnitude for groups or scenarios with similar exposure patterns. They found that these differences were directly influenced by the local conditions surrounding the actual handling of a material, as well as the material's physical/chemical properties, and concluded that "actual dermal exposure measurements" and an improved understanding of the determinants of dermal exposure were required for appropriate dermal exposure assessments. Similarly, Blanco et al. (2005) reported that work practices and the type of equipment used accounted for a substantial amount of the dermal exposure variability. In evaluations of the dART tool, researchers noted that the vast majority of dermal loading or transfer measurements were collected using cotton gloves, an interception method that has been previously noted to overestimate or significantly overestimate dermal loading (Gorman Ng et al. 2014; McNally et al. 2019). While such overestimates may have a small influence on the relative amount of dermal exposure for various tasks (depending on the absorptive properties of the cotton glove samplers over time, as well as the duration of sampling), their use makes the determination of true quantitative loading on the skin surface difficult.
Given the apparent influence of accurate dermal transfer or loading data on arriving at an appropriate dermal exposure judgment for a particular scenario, improved default data and training concerning the ranges of dermal loading and transfer should be developed to assist practitioners in improving the quality of judgments for dermal exposure scenarios. Another critical need is skin surface sampling data for a larger range of substances. As noted by Beamer et al. (2009), part of the difficulty in adequately quantifying transfer efficiency is that it may be related to several different parameters, including surface characteristics; the nature of the contact (including frequency, duration, and pressure); time since application or contact; and temperature and humidity. The total skin loading over an accurate estimate of skin surface area is also critical. While the range of scenarios evaluated in this study was limited and the number of participants was relatively small, increasing the risk for selection bias, the consistency of the results points to the specific types of data inputs that may improve the ability of exposure assessment practitioners to select the appropriate category for dermal exposure acceptability and improve the utility of the available deterministic dermal models. A careful evaluation of the existing dermal exposure assessment tools is needed with a focus on improving the quality of the existing default parameters, particularly concerning dermal loading or transfer, given the low accuracy of the current deterministic models in the absence of scenario-specific data.

Conclusions
The results of this study demonstrated that despite substantial levels of education and training in safety and industrial hygiene generally, the practitioners evaluated had very little experience in performing dermal exposure assessments. Further, in contrast to studies in the literature in which practitioners tended to underestimate exposure potential in their initial inhalation exposure assessments, the participants in this study consistently overestimated the potential for dermal exposure without quantitative data specific to the scenario of interest. This finding is similar to the results reported by Arnold et al. (2016) for novice exposure assessment practitioners compared to experienced practitioners. The difference between the outcomes of judgments for experienced vs. novice practitioners (i.e., often underestimating vs. overestimating exposures) should be further explored in future research on the potential for biases in professional judgments by practitioners, particularly in the absence of quantitative data. Importantly, participants in this study were able to identify the reference or "true" category of dermal exposure acceptability when provided with relevant, scenario-specific dermal and/or surface-loading data for use in the assessment process. As previously noted by Arnold et al. (2016) and others, the accuracy of exposure assessments, and the ability to characterize that accuracy, is fundamental to supporting both epidemiological studies and the risk assessment process. The results of this analysis support the need for a closer examination of default skin loading and other values used in dermal exposure assessments to evaluate their accuracy relative to real-world or measured dermal loading values.