Document Details

Using machine learning to predict 1,2,3-trichloropropane contamination from legacy non-point source pollution of groundwater in California’s Central Valley

B. Hope Hauptman, Colleen C. Naughton, Thomas C. Harmon | May 6th, 2023


1,2,3-trichloropropane (TCP) is an impurity common in nematicides applied to agricultural soils from the 1940s to the 1980s. Evidence from animal studies indicates that TCP is a probable human carcinogen. TCP leaches through soil into groundwater, where it persists, contaminating thousands of wells in Asia, Europe, and North America. In California, TCP contaminates drinking water wells, with the highest levels of TCP beneath agricultural land used to grow grapevines. This study performs a mass balance and evaluates the ability of three types of tree-based machine learning models to predict TCP concentration in California’s Central Valley aquifer system: Classification and Regression Trees (CART), Random Forest (RF), and Boosted Regression Trees (BRT). To construct the models, multiple spatial explanatory variables were used, including historical agricultural land use data, irrigation levels, precipitation, soil type, groundwater age, redox state, and the presence of the co-contaminant nitrate. For the mass balance, historical state pesticide and land use documents were used to estimate the amount of TCP applied to farmland in California. Between 110,000 and 4,300,000 kg of TCP are estimated to have accumulated in the subsurface. The machine learning models indicate that the most important explanatory variables for predicting TCP contamination of groundwater are precipitation, redox state, and the presence of the co-contaminant nitrate. Additionally, a 1000-m buffer area offers slightly higher predictive performance than 500-m and 1500-m buffers. Furthermore, the RF model outperforms CART and BRT in predictive performance. Finally, modeling using decision trees can predict TCP contamination levels in areas where monitoring is lacking, help target future TCP monitoring efforts, and aid in identifying areas to avoid when drilling new drinking water wells.

Keywords

agriculture, Central Valley, groundwater contamination, pesticides, water quality