4. Data Evaluation

Data evaluation may include the following tasks:

This chapter provides guidance on key issues associated with evaluating data to develop a risk assessment that is representative of potential risks and acceptable to the stakeholders. A risk assessment is an organized process used to describe and estimate the likelihood of adverse health outcomes from environmental exposures to chemicals. The four steps are hazard identification, dose-response assessment, exposure assessment, and risk characterization (Commission 1997a). A stakeholder is anyone who has a “stake” in the development, outcome, or decisions made as a result of a risk assessment. A stakeholder can be a person, a group, or an organization that is affected, is potentially affected, or has any interest in the project or in the project’s outcome, either directly or indirectly (Commission 1997a; Commission 1997b; NRC 1996; NRC 2009). The key issues are organized around five general topic areas:

Data Gaps

Data Usability

Data Reduction Concerns

Data Visualization and Analysis

Data Screening and Chemical Selection Processes

In many cases, the only data available for risk assessment are data collected as part of site characterization efforts, without considering risk assessment needs. A thoughtful data planning process (systematic planning, see Section 3.3) is needed to develop a data set that can support both site characterization and risk assessment. Systematic planning is a planning process that is based on the scientific method. It is a common-sense approach designed to ensure that the level of detail in planning is commensurate with the importance and intended use of the data, as well as the available resources. Systematic planning is important to the successful execution of all activities at hazardous waste sites, but it is particularly important to dynamic field activities because those activities rely on rapid decision-making. The data quality objective (DQO) process is one formalized process of systematic planning. All dynamic field activities must be designed through the use of systematic planning, whether using DQO steps or some other system (USEPA 2015h).

4.1 Data Gaps

A data gap exists when data or other site information is insufficient for adequately evaluating each potential exposure pathway identified in the CSM. An exposure pathway is the course a chemical takes from a source to a receptor. It describes a unique mechanism by which an individual or population is exposed to chemicals at or originating from a site. Each exposure pathway includes a source or release from a source, an exposure point, and an exposure route. If the exposure point differs from the source, a transport/exposure medium (for example, air) or media (in cases of intermedia transfer) also is included (USEPA 1989a). Data gaps introduce some level of uncertainty; however, the project team can determine whether the level of uncertainty associated with the data gap is acceptable or if additional data must be collected. Data gaps in field data are generally filled by collecting additional field data (for example, soil, groundwater, or air samples). Data gaps in other information are generally filled through online or library research and through additional data requests to agencies or organizations that collect, develop, store, or otherwise manage the needed information.

4.1.1 Issue – Identifying and Filling Data Gaps

Data gaps can be defined as questions that cannot be answered based on existing field investigation data or other available information (for example, maps, site development plans, and demographic information). A data gap in the context of risk assessment is missing information that, if available, would allow a more refined analysis to be completed.

4.1.2 Issue – Addressing Permanent Data Gaps

Permanent data gaps are data gaps that cannot be resolved. These gaps can be the result of lack of information concerning the site history, future land uses, or site-specific sampling information. For some investigations, data gaps in sampling information may exist because areas of the site are inaccessible. Sites may be inaccessible, for example, because a property owner has refused right of entry, sampling locations are too close to sensitive areas such as utilities, or unstable conditions such as severe slopes exist. Depending on the type of site, nature of the data gap, and concentrations of chemicals in environmental media, permanent data gaps associated with concentrations of chemicals can be handled in several ways.

4.2 Data Usability

Data used for risk assessment must be validated to ensure that they are usable, in some cases by a third-party independent reviewer. The USEPA offers guidance (Chapter 5 of USEPA 1989a) on the data evaluation necessary for risk assessment. Part of this data evaluation is an assessment of data usability, which is determined based on certain QA/QC criteria. These QA/QC criteria include sampling and preservation requirements, detection limit adequacy, laboratory and matrix spike recovery accuracy and precision, and evaluation of blanks. Establishing appropriate QA/QC criteria is an important consideration during the project planning phase and, ideally, the criteria should be determined before samples are collected and analyzed. Numerous data QA/QC criteria guidance resources are available, such as USEPA’s Data Quality Assessment, USEPA’s National Functional Guidelines (USEPA 2008b; USEPA 2010e; USEPA 2011h), the Department of Defense Quality Systems Manual (DOD QSM), as well as other federal and state guidance (USEPA 2006b; USDOD 2013).

4.2.1 Issue – Presenting Measurement Units and Significant Figures

A number of issues arise with units and significant figures used for reporting laboratory data.
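As a minimal sketch of one such issue, results reported in mixed units can be converted to a single unit before comparison, and rounding to a stated number of significant figures can be applied only at the reporting step. The helper names, the unit labels, and the choice of mg/kg as the common unit are assumptions made for illustration; the project's reporting requirements govern the actual conventions.

```python
# Hypothetical helpers for illustration only.
# Number of each unit contained in one mg/kg (assumed unit labels).
_UNITS_PER_MG_KG = {"mg/kg": 1.0, "ug/kg": 1_000.0, "ng/kg": 1_000_000.0}


def to_mg_per_kg(value, unit):
    """Convert a soil concentration in the given unit to mg/kg."""
    return value / _UNITS_PER_MG_KG[unit]


def round_sig(value, sig_figs):
    """Round a value to the requested number of significant figures."""
    if value == 0:
        return 0.0
    return float(f"{value:.{sig_figs}g}")


# A laboratory result of 500 ug/kg expressed in mg/kg.
converted = to_mg_per_kg(500.0, "ug/kg")
reported = round_sig(converted, 2)
```

Keeping full precision during calculations and rounding only the final reported value avoids compounding rounding differences across steps.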

4.2.2 Issue – Determining Cross-Contamination

In order to determine its usability for risk assessments, site data should be evaluated for possible contamination that is the result of the field sampling or laboratory processes. To evaluate sample data for cross contamination, several types of blank sample analyses can be used along with the site samples, including laboratory blanks, trip blanks, and rinsate/equipment blanks. Site data may be either accepted for the risk assessment or rejected based on any detections reported in the blank analyses.
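The blank comparison can be sketched as a simple rule. The example below assumes the convention described in USEPA risk assessment guidance (USEPA 1989a), in which a sample result is treated as a site-related detection only if it exceeds ten times the maximum blank concentration for common laboratory contaminants, or five times for other chemicals. The function name and parameters are hypothetical, and the multipliers are exposed as arguments because the applicable QA/QC criteria for a project may differ.

```python
def is_site_detection(sample_conc, max_blank_conc, common_lab_contaminant,
                      common_multiplier=10.0, other_multiplier=5.0):
    """Return True if a sample result is treated as a site-related detection.

    max_blank_conc is the highest concentration reported in any associated
    blank, or None (or 0) if the chemical was not detected in blanks.
    The default multipliers are assumptions taken from a common convention.
    """
    if not max_blank_conc:
        # Chemical not detected in any blank: no blank-based adjustment.
        return True
    multiplier = common_multiplier if common_lab_contaminant else other_multiplier
    return sample_conc > multiplier * max_blank_conc


# Illustrative values only (not site data): 8.0 in a sample, 1.0 in a blank.
as_common_contaminant = is_site_detection(8.0, 1.0, common_lab_contaminant=True)
as_other_chemical = is_site_detection(8.0, 1.0, common_lab_contaminant=False)
```

With these illustrative values, the same sample result is treated differently depending on whether the chemical is classified as a common laboratory contaminant, which is why the classification should be documented.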

4.2.3 Issue – Assessing Data Representativeness

The USEPA defines representativeness as a data quality indicator of the degree to which data accurately and precisely represent a characteristic of a population (USEPA 2006c). This indicator answers the question: “Are the samples representative of actual site conditions?”

4.3 Data Reduction Concerns

Data reduction is an early step in evaluating the data used to develop a risk assessment. Reduction involves the processing of data produced by the analytical laboratory to create a data set that can be used to assess human health risks. The data set is then used to generate maps or other visual aids for understanding risk, to produce summary statistics and statistical graphics (for example, box plots or histograms), and to perform statistical analyses. Data reduction activities may include handling of duplicate sample analyses; merging of data generated from more than one sampling event, sampling method, or analytical method; and the organization of the data.

The risk assessment report should provide a thorough discussion of data reduction methods. The product of the data reduction is a data set organized and formatted in a way that facilitates statistical analysis and visualization (for example, mapping).

4.3.1 Issue – Using Duplicate Samples

Different types of duplicate samples indicate precision in analytical measurements. Colocated field duplicate samples are collected in the field, whereas laboratory duplicate samples are taken from the same prepared sample. The number of colocated duplicate samples should be defined as part of the data collection program.
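Precision between a sample and its duplicate is commonly summarized with the relative percent difference (RPD). The sketch below shows that calculation; the function name is hypothetical, and any acceptance limit applied to the result is a project-specific QA/QC criterion that is not defined here.

```python
def relative_percent_difference(primary, duplicate):
    """Return the RPD (in percent) between a result and its duplicate.

    RPD = |primary - duplicate| / mean(primary, duplicate) * 100
    """
    mean = (primary + duplicate) / 2.0
    if mean == 0:
        # Both results are zero, so there is no difference to express.
        return 0.0
    return abs(primary - duplicate) / mean * 100.0


# Illustrative values only (not site data).
rpd = relative_percent_difference(10.0, 12.0)
```

Because the difference is divided by the pair mean, the same absolute difference produces a larger RPD at low concentrations, which is worth keeping in mind for results near detection limits.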

4.3.2 Issue – Pooling Data

To generate a data set that is more representative of temporal or spatial variability in site conditions, or both, it may be useful to pool or combine data. These data may have been generated by different analytical methods, collected at different times, collected at different locations (for example, pooling data from a group of groundwater monitoring wells), collected by different organizations, or collected for different purposes.

4.3.3 Issue – Handling Flagged Data Concentrations

Some laboratory analyses generate concentration measurements that are below the laboratory detection limits (for example, method detection limit or reporting limit; see Section 5.7.1, ITRC 2013). Concentrations reported at less than detection limits are called “censored” or “flagged” data. In addition, during the data validation process, data are validated for compliance with the analytical method requirements and data qualifier “flags” are applied to the data to indicate QA/QC issues to consider when using the data.

4.3.4 Issue – Handling Nondetect Concentrations

Often, reportable concentrations do not occur for a chemical reasonably expected to be present in the environmental media. These data are referred to as “nondetects.” Nondetects provide valuable information, and the approach to handling nondetects can change the outcome of a risk assessment. Several methods may be considered for handling nondetect data.
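To illustrate how the handling of nondetects can change a summary statistic, the sketch below applies simple substitution, replacing each nondetect with a chosen fraction of its detection limit, and recomputes the mean. Substitution is only one of the approaches that may be considered and is used here because it is easy to follow, not because it is recommended. The function name, the data layout, and the values are assumptions for illustration.

```python
def mean_with_substitution(results, fraction):
    """Return the mean after substituting nondetects.

    results  -- list of (value, detected) pairs; for nondetects, value is
                the detection limit (assumed data layout)
    fraction -- fraction of the detection limit substituted for nondetects
                (0.0 for zero, 0.5 for half the limit, 1.0 for the limit)
    """
    substituted = [
        value if detected else value * fraction
        for value, detected in results
    ]
    return sum(substituted) / len(substituted)


# Illustrative values only (not site data): two detects, two nondetects
# with a detection limit of 2.0.
results = [(10.0, True), (2.0, False), (4.0, True), (2.0, False)]
means = {f: mean_with_substitution(results, f) for f in (0.0, 0.5, 1.0)}
```

For this made-up data set the mean moves from 3.5 to 4.5 depending on the substituted fraction, and the spread grows as the proportion of nondetects increases.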

4.3.5 Issue – Considering Outliers

Outliers are extremely large (or small) measurements relative to the rest of the data in a data set and have the potential to misrepresent (bias) the population from which they were collected. For additional information about outliers in environmental data, see ITRC 2013 and USEPA 2010c.
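One simple way to flag candidate outliers for further review is the interquartile range (IQR) fence used in box plots. The sketch below assumes the conventional multiplier of 1.5 and uses Python's standard library; the function name is hypothetical. It only flags values and does not decide whether a flagged result should be kept or excluded, which calls for the kind of evaluation described in the cited references.

```python
import statistics


def flag_outlier_candidates(values, multiplier=1.5):
    """Return values outside the IQR fences, in their original order.

    Fences are Q1 - multiplier * IQR and Q3 + multiplier * IQR, with
    quartiles from statistics.quantiles (default "exclusive" method).
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - multiplier * iqr, q3 + multiplier * iqr
    return [v for v in values if v < lower or v > upper]


# Illustrative values only (not site data).
candidates = flag_outlier_candidates([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

Quartile estimates differ slightly between software packages, so the fences (and occasionally the flagged values) can differ for small data sets.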

4.3.6 Issue – Addressing Tentatively Identified Compounds

Laboratories calibrate for a specific target analyte list based on the analytical method or upon request, and only those analytes are reported quantitatively. Tentatively identified compounds (TICs) are nontarget chromatographic peaks detected during gas chromatography/mass spectrometry (GC/MS) analysis. TICs may be qualitatively identified by searching the National Institute of Standards and Technology (NIST) mass spectral library (NIST 2014) or a similar library. Estimated concentrations for TICs are calculated similarly to the target compounds. These estimated concentrations, however, should not be used in calculating risk. Thus, target analyte lists should be reviewed to ensure that chemicals actually used at a site are included on the analyte list, even if those chemicals may not be found on common target lists.

4.3.7 Issue – Assessing Nonspecific Methods

Data from nonspecific methods such as those used to assess certain classes of chemicals (for example, petroleum hydrocarbons, PCBs, and certain metals such as chromium and mercury) can be problematic in risk assessment:

4.4 Data Visualization and Analysis

Data visualization and data analysis generally include the use of tabular, graphical, spatial, and statistical tools to:

4.4.1 Issue – Accurately Displaying and Visualizing Data

Data visualization reveals site-specific patterns that might not otherwise be observed in a report or a tabular summary of data.

4.4.2 Issue – Statistically Analyzing Data

Agencies may have specific guidance, policy, or regulations about statistical approaches. Statistical analysis may require one or more of the following:

4.5 Data Screening and Chemical Selection Processes

Risk assessment generally begins with a conservative step. Screening values (based on default assumptions as opposed to site-specific values) are used to identify areas and chemicals in environmental media warranting further evaluation and to assess the adequacy of sampling data collected (for example, whether the nature and extent of concentrations are adequately characterized). The subsequent steps typically use calculations or models and more site-specific exposure assumptions and intake variables. This screening approach allocates resources based on the apparent significance of the concentrations of a chemical and its presence in an environmental medium.

4.5.1 Issue – Identifying Appropriate Screening Values

Federal, state, or even local screening values may apply to the site, and these values may differ from one another. Additionally, there may be more than one screening value for a single chemical in a given medium to address multiple exposure scenarios. For example, based on current and reasonably anticipated future land use, chemical concentrations may need to be compared to both residential and nonresidential screening values. In another example, for CERCLA sites, CERCLA Section 121(d) stipulates that applicable or relevant and appropriate requirements (ARARs) be met, meaning that levels must be applicable to the site, or relevant and appropriate to the chemical or contaminated media at the site.

4.5.2 Issue – Identifying Chemicals for Evaluation in the Risk Assessment

Although the screening values may differ among agencies, the initial screening evaluation (typically part of a Tier 1 evaluation) is generally well-defined and consistent. Typically, “the number of chemicals to be considered during the remainder of the risk assessment will be less than the number of chemicals initially identified” (USEPA 1989a).
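A minimal sketch of an initial screening comparison is shown below, assuming that the maximum detected concentration of each chemical is compared to a single screening value per chemical and medium. The function name, the status labels, the placeholder chemical names, and all numeric values are hypothetical. Whether a concentration equal to the screening value counts as an exceedance is also an assumption here, and the applicable program guidance governs.

```python
def screen_chemicals(max_detected, screening_values):
    """Classify each chemical by comparing its maximum detected
    concentration to its screening value (same units assumed).

    Returns a dict mapping chemical name to one of:
      "retain"             -- maximum exceeds the screening value
      "screen out"         -- maximum does not exceed the screening value
      "no screening value" -- nothing to compare against
    """
    status = {}
    for chemical, concentration in max_detected.items():
        screening_value = screening_values.get(chemical)
        if screening_value is None:
            status[chemical] = "no screening value"
        elif concentration > screening_value:
            status[chemical] = "retain"
        else:
            status[chemical] = "screen out"
    return status


# Made-up values for illustration only.
max_detected = {"chemical_A": 12.0, "chemical_B": 0.4, "chemical_C": 3.0}
screening_values = {"chemical_A": 5.0, "chemical_B": 1.0}
status = screen_chemicals(max_detected, screening_values)
```

Chemicals without a screening value are reported separately rather than dropped, so that they can be addressed explicitly instead of silently leaving the evaluation.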

4.5.3 Issue – Addressing Chemicals that are Missing Screening Values

In the chemical screening step, analytes are often missing screening values. In some cases, these values are missing because toxicity or other data necessary to calculate a screening value are lacking. In other cases, toxicity data may be available, but a screening value has not been developed.

4.5.4 Issue – Handling Nondetect Chemicals in Screening

When samples are analyzed for extended lists of analytical parameters, typically a fairly large group of chemicals are identified as nondetects in all samples. Sometimes the applicable regulatory context dictates the treatment method for nondetects in the screening process. When the regulatory context does not dictate the approach, however, several methods are used to evaluate nondetects in the screening step.
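One check that may be applied to chemicals that are nondetect in all samples is to compare the highest detection limit to the screening value, since a nondetect result cannot show that a chemical is below a screening value that is lower than the detection limit. The sketch below illustrates that single check. It is an assumption for illustration, not a description of any particular agency's required method, and the function name and status labels are hypothetical.

```python
def nondetect_screening_status(max_detection_limit, screening_value):
    """Classify a chemical that was not detected in any sample.

    Both arguments are assumed to be in the same units; screening_value
    may be None when no screening value is available.
    """
    if screening_value is None:
        return "no screening value"
    if max_detection_limit <= screening_value:
        # Nondetects demonstrate concentrations at or below the screening value.
        return "detection limit adequate"
    # Nondetects cannot rule out concentrations above the screening value.
    return "detection limit exceeds screening value"


# Made-up values for illustration only.
adequate = nondetect_screening_status(0.5, 2.0)
inadequate = nondetect_screening_status(5.0, 2.0)
```

A chemical in the last category may warrant discussion in the uncertainty evaluation or reanalysis with a more sensitive method, depending on the regulatory context.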

4.5.5 Issue – Addressing Data Bias in Screening Process

Data bias in the screening process can significantly affect risk assessment conclusions. Biases should be accounted for in the risk assessment.

4.5.6 Issue – Handling Background Concentrations

It is common for chemicals such as metals or polycyclic aromatic hydrocarbons (PAHs) to be identified at sites at concentrations higher than risk-based screening values due to naturally occurring or non-site-related anthropogenic conditions. Most state and federal remedial action programs do not require remediation of impacts below background or in some cases only require remediation to address site-related impacts.

Some states may have published or unpublished approaches for comparisons of site and background data sets. Various references are available providing approaches for comparison to background (for example, DTSC 1997, DTSC 2008, DTSC 2009b, and DTSC 2009a).

If limited background data are available, then the background screening may be performed by using a point-by-point comparison or by comparing the maximum detected site concentrations to a background value (see Section 3.3.7 and Section 6.2.5).
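The two comparisons described above can be sketched as follows, assuming a single background value per chemical and site results in the same units. The function name, the returned keys, and the values are hypothetical.

```python
def compare_to_background(site_concentrations, background_value):
    """Compare site results to a single background value.

    Returns a dict with:
      "max_exceeds" -- True if the maximum site concentration exceeds
                       the background value
      "exceedances" -- the individual results that exceed the background
                       value (point-by-point comparison)
    """
    exceedances = [c for c in site_concentrations if c > background_value]
    return {
        "max_exceeds": max(site_concentrations) > background_value,
        "exceedances": exceedances,
    }


# Made-up values for illustration only.
comparison = compare_to_background([4.0, 9.5, 6.2, 12.1], background_value=8.0)
```

The point-by-point result shows which locations drive an exceedance, while the maximum comparison only indicates whether any exceedance exists.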

4.6 Resources and Tools

The following resources and tools were not cited in the sections above and are included here for further information.

Spatial Analysis and Decision Assistance (University of Tennessee 2013)

ProUCL software (USEPA 2013d)

Drinking Water Standards (USEPA 2014c)

Preliminary Remediation Goals for Radionuclides (USEPA 2014i)

Regional Removal Management Levels for Chemicals (USEPA 2014k)

Risk Assessment Information System, Oak Ridge National Laboratory (ORNL 2014)

Using the Triad Approach to Streamline Brownfields Site Assessment and Cleanup – Brownfields Technology Primer Series (USEPA 2003f)

Publication Date: January 2015

Permission is granted to refer to or quote from this publication with the customary acknowledgment of the source (see suggested citation and disclaimer).

