This script does a validation check of the soy ET, water footprint (WF) or similar calculated for TraseH2O. Values of soy ET were obtained following these steps:
Here we validate the results with the following datasets:
The WFN data provides soybean WF per state for the 1996-2005 period. ET is modeled using FAO56 and spatialized with Aquacrop. We can compare the two but the time period of the dataset is different to our study (2014-2017).
Here we compare the total water footprints.
##
## Call:
## lm(formula = wfn_validation$soy_wf_state ~ wfn_validation$wf_1996_2005)
##
## Residuals:
## Min 1Q Median 3Q Max
## -165.747 -52.904 2.624 44.678 159.549
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3014.29787 188.21569 16.015 0.000000000614 ***
## wfn_validation$wf_1996_2005 -0.51469 0.09098 -5.658 0.000078304900 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 96.65 on 13 degrees of freedom
## Multiple R-squared: 0.7112, Adjusted R-squared: 0.6889
## F-statistic: 32.01 on 1 and 13 DF, p-value: 0.0000783
## # A tibble: 15 × 6
## state_id sum_m3 sum_tonnes soy_wf_state state wf_1996_2005
## <chr> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 11 5553686600 3036489 1829 RONDONIA 2284
## 2 15 8753509379 4696337 1864 PARA 1941
## 3 17 19591764695 8845182 2215 TOCANTINS 1695
## 4 21 16453383121 7549759 2179 MARANHAO 1628
## 5 22 13543888092 5925826 2286 PIAUI 1725
## 6 29 32521232457 16121015 2017 BAHIA 1856
## 7 31 34223119307 16935652 2021 MINAS GERAIS 2018
## 8 35 19269255770 10200110 1889 SAO PAULO 2114
## 9 41 117785931491 68446698 1721 PARANA 2368
## 10 42 14519557421 8255519 1759 SANTA CATARINA 2315
## 11 43 114209368196 63696062 1793 RIO GRANDE DO SUL 2444
## 12 50 55704109049 30136874 1848 MATO GROSSO DO SUL 2544
## 13 51 229337383560 111104011 2064 MATO GROSSO 1924
## 14 52 79572277831 39156782 2032 GOIAS 1994
## 15 53 1627451348 872220 1866 DISTRITO FEDERAL 1909
The data seems to fall within the same range, but the time periods are not the same and so the differences could be due to meteorological differences across the years.
We compare a limited sample of water productivity data available from Figueiredo da Silva et al (2021). The authors modeled water productivity for soy in Southern and Central Brazil (mostly focused on Pampas and Cerrado biomes). Soybean development was modeled with the CROPGROW-SOYBEAN model and included field data to calibrate the model to specific cultivars. Soil water was modeled with DSSAT-SBUILD tool. This allowed for the simulation of crop growth and key variables such as LAI, above ground dry matter, grain weight which then allowed then to re-run the models with climate change scenarios.
Here we compare the total water footprints.
##
## Call:
## lm(formula = fig_validation$wp ~ fig_validation$wp_yield_kg_ha_mm_mean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.66247 -0.18555 -0.00085 0.40238 1.34254
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 5.38068 0.51743 10.399
## fig_validation$wp_yield_kg_ha_mm_mean -0.04750 0.05116 -0.928
## Pr(>|t|)
## (Intercept) 0.00000000000000322 ***
## fig_validation$wp_yield_kg_ha_mm_mean 0.357
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7277 on 62 degrees of freedom
## Multiple R-squared: 0.01371, Adjusted R-squared: -0.002198
## F-statistic: 0.8618 on 1 and 62 DF, p-value: 0.3568
Most points are out of the range of water productivity and those that
were provided by the publication are generally higher than the ones we
calculate, which suggests that our WF are greater overall. This seems to
be a recurring problem with the modeling that we are applying and when
comparing to field measurements.
Here we use the data that Rafaela generated for the moisture recycling work overlaying the Mapbiomas land use classes with the PML ET product (Zhang 2020) to get the average land use class ET. The results provide average land use class ET for each month of the year which is calculated for the “Soybean” and “Irrigated soybean” classes for the months of Oct-Jan and Nov-Feb (as potential soybean growing seasons).
Note that we need to combine 2 different years to get the correct ET; i.e. soybean planted in October 2008 can be harvested in Jan 2009 and then assigned to the 2009 harvest.
Here we compare the total water footprints.
##
## Call:
## lm(formula = mb_validation$et ~ mb_validation$et_rock)
##
## Residuals:
## Min 1Q Median 3Q Max
## -372.62 -146.67 -77.32 42.64 623.72
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.57898 20.99223 0.028 0.978
## mb_validation$et_rock 0.93424 0.03659 25.530 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 212 on 14758 degrees of freedom
## Multiple R-squared: 0.0423, Adjusted R-squared: 0.04223
## F-statistic: 651.8 on 1 and 14758 DF, p-value: < 0.00000000000000022
Some of the Mapbiomas data have low ET over its season, like ~200-300 mm
per cycle which is more in line with the field work. There are
non-sensical values as well with ET < 200 mm which might have
something to do with the land classification. The bottom line is that
our estimates are greater than those provided by Mapbiomas/PML.
ANA (2020) provided ET results for all crops in all municipalities for the 2014-2017 period. The method provides indirect estimates of crop ET following FAO56 guidelines together with meteorological data available in Brazil. The crop water requirement adjusted for water stress relies on a modification of crop coefficients by a KS factor to represent the effect of water stress on crop transpiration. This assumption presupposes a linear relationship between water use and yield and therefore differs from our previous assumption in the case of the curve from (Rockström, 2003) which does not assume a linear relationship between transpiration and yield.
Here we compare the green WF obtained from our method with the green WF derived from the ANA data (purely from effective precipitation).
##
## Call:
## lm(formula = ana_validation$soy_wf_ana ~ ana_validation$soy_gwf_rock)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8159 -255 13 192 85130
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -73.05877 53.06275 -1.377 0.169
## ana_validation$soy_gwf_rock 0.66705 0.02577 25.888 <0.0000000000000002
##
## (Intercept)
## ana_validation$soy_gwf_rock ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1289 on 8109 degrees of freedom
## Multiple R-squared: 0.07634, Adjusted R-squared: 0.07622
## F-statistic: 670.2 on 1 and 8109 DF, p-value: < 0.00000000000000022
Again, our values for green water footprint are greater than those of ANA, but ANA values can be non-sensical in some cases (e.g. WF = 3 m3 tonne-1) which likely has to do with the linear relationship assumption.
Here, the WF were calculated for both green and blue WF for soy using a 5x5 arcmin gridded dataset. Results are available for the year 2004 and 2018 (we use 2018 as the closest to 2017).
##
## Call:
## lm(formula = petrillo_validation$`uwf_green (m3/ton)` ~ petrillo_validation$soy_gwf_rock)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1378.1 -192.0 -44.3 111.1 3782.3
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 725.89418 78.72692 9.220
## petrillo_validation$soy_gwf_rock 0.41942 0.04278 9.804
## Pr(>|t|)
## (Intercept) <0.0000000000000002 ***
## petrillo_validation$soy_gwf_rock <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 383.6 on 1017 degrees of freedom
## (1256 observations deleted due to missingness)
## Multiple R-squared: 0.08635, Adjusted R-squared: 0.08545
## F-statistic: 96.12 on 1 and 1017 DF, p-value: < 0.00000000000000022
Although we are comparing different years (2017 vs 2018), our results
are overall greater than those of de Petrillo et al (2023).
##
## Call:
## lm(formula = mialyk_validation$WF_2010_2019 ~ mialyk_validation$soy_wf_state)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1075.5 -316.6 73.6 209.0 1352.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 445.4269 185.5576 2.400 0.0241 *
## mialyk_validation$soy_wf_state 0.2829 0.1071 2.643 0.0140 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 496 on 25 degrees of freedom
## Multiple R-squared: 0.2183, Adjusted R-squared: 0.1871
## F-statistic: 6.983 on 1 and 25 DF, p-value: 0.01399
## # A tibble: 7 × 16
## year trase_id cod_uf sigla_uf mun_name_c tonnes planted_ha harvest_ha yield
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 2016 BR-2210623 22 PI SEBASTIAO… 55 315 314 0.175
## 2 2017 BR-2922250 29 BA MUQUEM DO… 110 433 433 0.254
## 3 2017 BR-5220686 52 GO SIMOLANDIA 200 665 664 0.301
## 4 2014 BR-2209757 22 PI SAO GONCA… 69 230 230 0.3
## 5 2015 BR-3515350 35 SP EUCLIDES … 135 450 450 0.3
## 6 2016 BR-2209757 22 PI SAO GONCA… 532 1430 1430 0.372
## 7 2014 BR-3129202 31 MG HELIODORA 18 45 45 0.4
## # ℹ 7 more variables: area_soy_pivot_mapb_ha <dbl>, ana_pivot_ha <dbl>,
## # soy_wf_rock <dbl>, soy_gwf_rock <dbl>, soy_bwf_rock <dbl>,
## # soy_gwf_ana <dbl>, soy_bwf_ana <dbl>
There are 20 estimates that exceed 5000 m3 tonne-1, and 7 entries with a WF > 10,000 m3 tonne-1
In all cases, yield < 1, which explains why the WF are so high, the highest WF was 23,478 m3 tonne-1 for a yield of 0.175 for 55 tonnes produced.
If we remove the WF > 10000 m3 tonne-1, then we remove 1119 tonnes of soy produced on 3568 ha in PI, BA, GO, SP, MG.
We can probably remove these outliers for graphs, making a note of it in the captions, but keeping the values in the calculations.
What most of these comparisons have in common:
The big issue with the ANA data is that there are key municipalities missing water use estimates and so the data is incomplete when comparing to IBGE (e.g. Formosa do Rio Preto in 2017).
Proposal: use the WF data obtained from the Rockström (2003) curve and remove outliers (WF > 10,000 m3 tonne-1 in the paper figures).
ANA. (2020). Uso da água na agricultural de sequeiro no Brasil (2013-2017).
De Petrillo et al (2023) International corporations trading Brazilian soy are keystone actors for water stewardship Communications Earth and Environment 87, 4(1), doi: 10.1038/s43247-023-00742-4.
Figueiredo et al (2021) Impact assessment of soybean yield and water productivity in Brazil due to climate change European Journal of Agronomy 129: 126329, doi: 10.1016/j.eja.2021.126329.
Mialyk et al (2024) Water footprints and crop water use of 175 individual crops for 1990-2019 simulated with a global crop model Scientific data, doi: 10.1038/s41597-024-03051-3.
Rockström (2003) Water for food and nature in drought–prone tropics: vapour shift in rain–fed agriculture Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 358(1440): 1997-2009, doi: 10.1098/rstb.2003.1400
Zhang, Y. (2020). PML_V2 global evapotranspiration and gross primary production (2002.07-2019.08). National Tibetan Plateau Data Center, 10.11888/Geogra.tpdc.270251