We present the first convection-permitting regional climate model (CPRCM) simulations at 4.5 km horizontal resolution for South America at near-continental scale, including full details of the experimental setup and results from the reanalysis-driven hindcast and climate model-driven present-day simulations. We use a range of satellite and ground-based observations to evaluate the CPRCM simulations covering the period 1998–2007, comparing the CPRCM output with lower-resolution regional and global climate model configurations for key regions of Brazil. We find that using the convection-permitting model at high resolution leads to large improvements in the representation of precipitation, specifically in simulating its diurnal cycle, frequency, and sub-daily intensity distribution (i.e. the proportion of heavy and light precipitation). We tentatively conclude that there are also improvements in the spatial structure of precipitation. We see higher precipitation intensity and extremes over Amazonia in the CPRCMs compared with observations, though more sub-daily observational data from meteorological stations are required to conclusively determine whether the CPRCMs add value in this regard. For annual mean precipitation and mean, maximum and minimum near-surface temperatures, it is not clear that the CPRCMs add value compared with coarser-resolution models with parameterised convection. We also find large changes in the contribution to evapotranspiration from canopy evaporation, relative to soil evaporation and transpiration, in the CPRCMs compared with the RCM. This is likely to be related to the shift in precipitation intensity distribution of the CPRCMs compared to the RCM, and its impact on the hydrological cycle requires further investigation.
South America is a continent of great environmental and ecological importance. The Amazon Forest accounts for 40% of global tropical forest area (Marengo et al. 2018), plays a vital role in water, energy and carbon cycles, and hosts 10–15% of land biodiversity (Nobre et al. 2016). The annual cycle of precipitation is dominated by the South American Monsoon System, which brings rainfall to the southeast and across Amazonia in austral summer. In austral winter, rainfall is largely confined to the northwest whilst southern and eastern areas experience a dry season. Precipitation is the main water source for agriculture, energy production (especially in the highly populated region of southeast Brazil; Coelho et al. 2016) and water supply; therefore, considerable changes in the hydrological cycle have the potential for dramatic societal impacts, particularly because daily rainfall extremes have increased, resulting in more flooding and landslides (Marengo et al. 2020). Furthermore, rainfall variability is projected to increase, meaning there will be drier or more frequent dry periods and wetter wet periods on daily, weekly, monthly, and intra-seasonal timescales, even in sub-regions where future changes in mean rainfall are currently uncertain (Alves et al. 2021; Chadwick et al. 2022). If we are to reduce that uncertainty, it is important to improve the representation of precipitation and the hydrological cycle in climate models, which is the central aim of this work. It forms part of the climate modelling work package of the Climate Science for Service Partnership (CSSP) Brazil project (Jones 2022), which aims to inform decisions around climate mitigation and adaptation.
The South American climate has been widely discussed in observational and modelling studies with Global and Regional Climate Models (GCMs and RCMs, e.g. Fernandez et al. 2006; Solman et al. 2008, 2013; Alves and Marengo 2010; Solman and Blázquez 2019). Whilst GCMs and RCMs can reproduce the seasonal and annual mean climatology (Falco et al. 2019), there are some notable biases such as a dry bias over Amazonia, cold and wet biases along the Andes and warm, dry biases in southeastern South America (Solman et al. 2013; Falco et al. 2019; Gutierrez et al. 2021). Some biases may be related to low grid resolution (100 to 200 km in the case of GCMs, typically around 50 km for RCMs), which cannot represent the fine-scale details in regions with heterogeneous land surface cover (e.g. as a result of deforestation) or complex topography (e.g. Ambrizzi et al. 2019). For example, Barros & Doyle (2018) linked dry biases in southeastern South America in GCMs to weak southward flow to the east of the Andes. This may be related to the maximum height of the topography in the model, which is significantly lower than the true height because of the coarse grid resolution. Moreover, RCMs and GCMs tend not to capture extreme events (Solman & Blázquez 2019), which contribute significantly to the total precipitation, particularly in the subtropics during summer. In general, GCMs and RCMs are often unable to realistically represent the diurnal cycle, frequency and intensity of precipitation (e.g. Prein et al. 2015; Bettolli et al. 2021), which can be related to the convection parametrisation (Prein et al. 2015).
Convection-permitting regional climate models (CPRCMs) have been used in weather forecasting for some time, but for approximately the last ten years they have also been applied at climate timescales. CPRCMs are run at high grid resolution (typically less than 5 km) and the parametrisation of convection is switched off to allow the model to explicitly resolve convection, though small-scale convection, especially in terms of sub-grid updrafts, is still under-resolved (e.g. Kendon et al. 2012, 2021). They have demonstrated a more realistic representation of precipitation statistics, particularly for high-intensity precipitation events and sub-daily extremes (Kendon et al. 2019; Belušić et al. 2020), diurnal cycle (e.g. Fosser et al. 2015; Scaff et al. 2019; Lind et al. 2020), spatial structures, storms and particularly mesoscale convective systems (Prein et al. 2017), diurnal temperature range (e.g. Ban et al. 2014; Stratton et al. 2018), tropical-extratropical cloud bands (Hart et al. 2018) and land–atmosphere interactions (Taylor et al. 2013). CPRCM studies at climate timescales have been undertaken for regions such as the UK (Kendon et al. 2012), central Europe (Ban et al. 2014; Fosser et al. 2015), North America (Liu et al. 2017), Africa (Stratton et al. 2018) and the Tibetan Plateau (Li et al. 2021). Due to the large computational resource and the time required to produce these simulations, global convection-permitting simulations are not yet running on climate timescales.
In terms of high-resolution and/or CPRCM simulations that include South America, Birch et al. (2015) performed simulations with the Met Office Unified Model (UM) at a resolution that would not normally be considered high enough to be convection-permitting (17 km); nonetheless, improvements in the diurnal cycle of rainfall over South America were found. High-resolution simulations (5 km resolution) in southeast Brazil have been performed (Lyra et al. 2018), demonstrating improved representation of precipitation extremes and frequency, though these were not convection-permitting. There have been few studies with convection-permitting models, e.g. the CORDEX-FPS for southeastern South America (Bettolli et al. 2021). This demonstrated improved timing and intensity of precipitation events, though the simulations cover only a limited area and time period. Convection-permitting models have demonstrated an improved representation of precipitation in particular for a number of different regions. Therefore, it is likely that their application will be beneficial for Brazil and other regions of South America, especially as convection is a key process controlling heavy precipitation at sub-daily timescales over large areas of South America (Solman & Blázquez 2019).
Here we present results from the first climate-timescale, near-continental convection-permitting simulations over South America. The overarching aim of the South America CPRCM experiment (SA-CPRCM), which includes present-day and future simulations, is to improve the representation of precipitation in climate models, ultimately to help constrain uncertainty in future projections and provide decision-makers with more specific, usable information, particularly as precipitation is a key driver of natural disasters. The specific aims of this manuscript are:
(1) to describe the details of the present-day CPRCM experiments (boundary conditions, parameterisations and configuration) as a reference for potential users of the output data. This information is found in Sect. 2, including the domains, model physics, parametrisations, boundary conditions and forcing,
(2) to highlight differences, improvements and potential degradation in performance compared with observations in the CPRCM present-day simulations relative to their RCM/GCM counterparts with parameterised convection. This is intended to guide users towards appropriate applications for the CPRCM output. We focus on key regions of Amazonia (as defined in Alves et al. 2021) and southeast Brazil (as defined in Coelho et al. 2016) that are of greatest interest to the CSSP Brazil project owing to ecological/hydrological significance and population density.
In order to achieve the second aim, we present annual precipitation and temperature biases for the whole domain using monthly data in Sect. 3. At this timescale we would not necessarily see an advantage in using a CPRCM. We also show diurnal cycles of precipitation and sub-daily precipitation frequency and intensity, where we would expect added value from the CPRCM. In the final part of Sect. 3, we highlight an emerging characteristic in relation to the partitioning of evapotranspiration in CPRCMs compared with RCMs. Conclusions follow in Sect. 4.
For the SA-CPRCM experiment, we perform three convection-permitting simulations covering the same 10-year period: 1998–2007. The hindcast simulation (CPRCM-ERA), nested in the UK Met Office RCM (MOHC-HadREM3-GA71-25 km), is driven by ERA-Interim reanalysis data (Dee et al. 2011), while the so-called present-day simulation (CPRCM-PD) and a future simulation (CPRCM-2100) are directly nested in UK Met Office GCM (MOHC-HadGEM3-GA7GL7-N512) simulations for present-day and future. We use a one-way nesting strategy with no spectral nudging. Both RCM and GCM are run at 25 km resolution with parameterised convection using the Gregory and Rowntree (1990) scheme. Given the grid resolution difference between ERA-Interim (~80 km) and the CPRCM (4.5 km), the intermediate nest provided by the 25 km RCM was needed in order to avoid a very large boundary relaxation zone at the edge of the CPRCM domain. A 25 km grid resolution for the RCM was found to be more stable than 12 km and required less simulation time. Another advantage is that it can be more easily compared against the GCM, as they have the same grid resolution. The CPRCM simulations use 4.5 km horizontal resolution since it provided favourable results over Africa (Stratton et al. 2018) and over the USA (Prein et al. 2020). Details of the CPRCM, RCM and GCM setup are shown in Table 1. By comparing the results from CPRCM-ERA and CPRCM-PD we are able to identify any biases that are introduced by the use of the GCM for the lateral boundary conditions (LBCs) as opposed to the reanalysis, whilst comparing CPRCM-PD and CPRCM-2100 simulations will show the effect of climate change equivalent to RCP8.5 at 2100. The experimental setup is similar to that of Stratton et al. (2018) in that land use change and aerosol forcing changes are excluded from the future simulation (CPRCM-2100); only the greenhouse gases and sea surface temperatures (SSTs) are modified.
However, in the SA-CPRCM experiment outlined in this study, we have the benefit of a hindcast to better evaluate the simulation of present-day climate. In addition, an updated version of the UM is used, i.e. UM10.6, further details of which are described in Sect. 2.1.
Table 1 Differences between CPRCM, RCM and GCM setups
The CPRCM domain (Fig. 1) was chosen to include Brazil, tropical and sub-tropical South America with sufficient space between the areas of interest (Fig. 1, solid white lines) and the boundary of the domain to allow synoptic-scale features to develop independently within the CPRCM domain whilst minimising the simulation time. The southernmost boundary crosses the Andes at a point of relatively low elevation to minimise the potential for orographically-generated instabilities.
Fig. 1
Terrain elevation (in metres) in the RCM domain (indicated by the outer boundary of the map) and the CPRCM domain (indicated by the white dashed line). White boxes (solid lines) indicate locations of regions used in subsequent analyses: North Amazonia (NAMZ, 5°S to 5°N, 70–45°W), South Amazonia (SAMZ, 12.5–5°S, 70–45°W) and southeast Brazil (SEB, 25–15°S, 55–38°W). Yellow crosses indicate locations of station data for precipitation from the Large-Scale Biosphere Atmosphere (LBA) flux tower data (Harper et al. 2021)
To reduce the time taken to complete the experiments, all CPRCM simulations were run in two 5-year segments (1998–2002, 2003–2007). Each segment was preceded by 1 year of spin-up, which was excluded from the analyses, so 10 years of data are available for each of the three experiments. The spin-up periods were initialised from the RCM or GCM atmosphere/land state for the relevant date. We found that 1 year was sufficient for the soil moisture (initialised from the RCM/GCM values) to adjust to the LBCs (see supplementary information), and we checked time series of key variables (e.g. near-surface air temperature and surface sensible heat flux) to make sure that there were no step changes between the two segments. The RCM simulation starts in 1992 with the soil moisture initialisation state from an offline land surface simulation with JULES (Best et al. 2011) forced with data from the Global Soil Wetness Programme (GSWP). This allowed sufficient time for soil moisture in the RCM to spin up at lower resolution before beginning the 1-year CPRCM spin-up in 1997. This is not an issue for the GCM simulation, which starts in 1988, giving enough time for the soil to spin up.
The Unified Model configurations used for the GCM and the RCM, which use parametrised convection, are described in Walters et al. (2019). The CPRCM uses UM version 10.6, and is based on the recent UK configuration produced for UKCP18 (Kendon et al. 2019) with some modifications to match the tropical set-up of the regional atmosphere and land configuration: RAL1-T. The key differences between tropical (RAL1-T) and mid-latitude (RAL1-M) configurations are in the representation of turbulence (i.e. the form of stability functions and the free-atmospheric mixing length), which result in enhanced turbulent mixing in RAL1-T (Bush et al. 2020). In addition, there are no stochastic boundary layer perturbations of temperature and moisture in RAL1-T, there is additional vertical resolution in the tropical upper troposphere as opposed to the lower boundary layer, and the PC2 scheme replaces the Smith scheme. PC2 includes additional prognostic fields that add memory of cloud fields to the system. Tests in tropical and sub-tropical areas show that, compared with RAL1-M, the changes in RAL1-T tend to result in later initiation of convection, larger and fewer showers, improved location of showers and better representation of stratiform cloud and rain, though total precipitation may be overestimated (Bush et al. 2020). The key differences between the CPRCM, RCM and GCM are shown in Table 1 and described in the following sections.
2.1.1 Atmosphere
The CPRCMs, RCM and GCM use the same dynamical core and radiative transfer scheme (SOCRATES) and a modified Wilson and Ballard (1999) microphysics scheme (see Walters et al. 2019; Bush et al. 2020 for more details). The CPRCMs include prognostic graupel, which is not available at coarser resolution. It allows a lightning flash rate prediction scheme (McCaul et al. 2009) to be used. The CPRCMs use a blended boundary layer parametrisation (Boutle et al. 2014), which transitions between the Lock et al. (2000) scheme (suitable for lower resolution) and the 3D turbulent mixing scheme based on Smagorinsky (1963) (suitable for high resolution). The RCM and GCM use the Lock et al. (2000) scheme with modifications described in Walters et al. (2019).
The CPRCM experiments use the PC2 (prognostic cloud fraction and prognostic condensate) large-scale cloud scheme (Gregory et al. 2002) as opposed to the Smith (1990) scheme used in Stratton et al. (2018) and Kendon et al. (2019). The PC2 scheme outperforms the Smith scheme in climate simulations (Bush et al. 2020).
In the CPRCMs, the convection parametrisation is switched off to allow convection to be explicitly resolved, although at a horizontal resolution of 4.5 km small-scale convection will still not be resolved. In the RCM and GCM the convection scheme is based on the Gregory and Rowntree (1990) mass flux scheme with some modifications (Walters et al. 2019). It includes a diagnosis step, then calls to either the deep or shallow convection schemes and finally a call to the mid-level convection scheme.
2.1.2 Land surface
The CPRCM experiments use nine land cover types, including five plant functional types (broadleaf tree, needleleaf tree, C3 grass, C4 grass and shrub) and four non-vegetation tiles (urban areas, inland water, bare soil and land ice), four soil layers with thicknesses of 0.1, 0.25, 0.65, and 2 m reaching a maximum depth of 3 m, a multilayer snow scheme, a multilayer canopy scheme in which light and nitrogen levels vary through the canopy, and TOPMODEL hydrology (see Walters et al. 2019). TOPMODEL has been used, as opposed to PDM (Probability Distributed Model; Moore 1985), as it can more accurately represent wetlands, which are an important element in South American climate, and has been reported to produce more representative soil moisture in simulations over Europe (Halladay et al. in review). A difference between the land surface in the GCM compared to the RCM and CPRCM is the use of van Genuchten (1980) as opposed to the simpler Brooks and Corey (1964) soil hydraulic scheme. Tests over Europe have shown that although the different schemes lead to differences in soil moisture, the effects on temperature and precipitation are limited (Berthou et al. 2020).
For the majority of boundary data we have used the default datasets used as standard in the UM configuration described in Table 3 of Walters et al. (2019) with references therein, i.e. leaf area index and canopy height (monthly climatology), ozone (monthly, time-varying), orography (non-time varying), soil parameters (non-time varying). In the cases listed in the following section, we have created ancillary files that are more representative for the SA-CPRCM experiment and its domain.
2.2.1 CO2 and other gases
The hindcast simulation (CPRCM-ERA) and the present-day simulation (CPRCM-PD) are forced by a time-varying array of annual global values of Greenhouse Gas (GHG) concentrations for the simulation period (1998–2007). This implies a uniform spatial distribution of the gases. The level of CO2 increases from 364 ppm in 1998 to 382 ppm in 2007 (Table 3, Appendix). In the future simulation, GHG levels do not vary with time, and values are taken from the Coupled Model Intercomparison Project 5 Representative Concentration Pathway 8.5 protocol for the year 2100 (Table 4, Appendix).
2.2.2 Aerosols
The GCM uses the CMIP5 AMIP aerosol emissions dataset and the fully interactive GLOMAP-mode aerosol scheme, which represents the prognostic aerosol species: sulphate, black carbon, organic carbon and sea salt in five variable size modes (Walters et al. 2019). However, this scheme was designed for global models and would have been too computationally expensive to be included in the CPRCM and RCM simulations. Therefore, we opted to prescribe the aerosol optical properties in the CPRCM and the RCM using the so-called “EasyAerosol” approach, which has been widely used in other modelling studies with reduced aerosol complexity (e.g. Fiedler et al. 2019). In order to represent the aerosol forcing, the EasyAerosol scheme requires the following aerosol property fields as inputs at monthly resolution: absorption, asymmetry and extinction for shortwave and longwave and cloud droplet number concentration (CDNC).
Whilst previous UM CPRCM climate simulations (e.g. Kendon et al. 2019) have used the MACv2-SP parameterization (Stevens et al. 2017) to produce the aerosol properties for EasyAerosol, this was not ideally suited to our simulations because (1) the MACv2-SP plumes are highly idealised for our region of interest and (2) it would not be consistent with the GCM driving data. We therefore created the EasyAerosol inputs from the GCM. However, since the high-resolution GCM (N512, approx. 25 km spatial resolution) simulation that we used to provide the LBCs for CPRCM-PD and CPRCM-2100 had not output the required aerosol fields, we obtained these from an additional lower-resolution GCM (N96, approx. 130 km spatial resolution) simulation with the same configuration and physics as the N512 GCM used here. This procedure provided a 12-month repeating climatology of the required fields based on 10-year averages. The annual mean aerosol optical depth climatology has a more realistic spatial distribution compared to the discrete locations of MACv2-SP aerosol plumes from Fig. 1 in Stevens et al. (2017).
One disadvantage of using monthly mean aerosol climatologies instead of interactive aerosol is that the aerosol fields do not respond to the meteorology or vary on daily-hourly timescales. Another consequence of the temporal averaging is that it tends to strengthen cloud radiative effects because the relationship between them and the CDNC is non-linear. This results in net radiation biases at top of atmosphere and at the surface of approximately 1.0 W m⁻². To account for this bias, the CDNC values were scaled by a factor of 0.875 in the RCM and CPRCM simulations.
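As an illustration of the two processing steps described above (averaging ten years of monthly GCM output into a 12-month repeating climatology, then scaling CDNC by 0.875), a minimal sketch for a single grid point might look as follows. The function names and the flat list layout are assumptions made for this example only; they are not taken from the UM or EasyAerosol code, and the real ancillary files hold full spatial fields.

```python
CDNC_SCALING = 0.875  # scaling factor quoted in the text


def monthly_climatology(monthly_values):
    """Average a multi-year monthly series into 12 climatological values.

    monthly_values is a flat list ordered Jan..Dec for each year in turn
    (hypothetical layout chosen for this sketch).
    """
    if len(monthly_values) % 12 != 0:
        raise ValueError("expected whole years of monthly data")
    n_years = len(monthly_values) // 12
    return [
        sum(monthly_values[year * 12 + month] for year in range(n_years)) / n_years
        for month in range(12)
    ]


def scale_cdnc(cdnc_climatology, factor=CDNC_SCALING):
    """Apply a uniform scaling to a CDNC climatology."""
    return [value * factor for value in cdnc_climatology]


def value_for_month(climatology, month):
    """Look up the repeating climatology for a calendar month (1-12)."""
    return climatology[(month - 1) % 12]
```

Because the climatology repeats, the same twelve values are reused for every simulated year, which is why the fields cannot respond to the simulated meteorology.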
2.2.3 Land cover data
We use land cover data from the European Space Agency Climate Change Initiative (ESA-CCI) for both RCM and CPRCM simulations. It is part of the Met Office UM Regional-Atmosphere configuration (RAL1) (Bush et al. 2020) and subsequent RAL versions and was also used in the CP4-Africa simulations (Stratton et al. 2018). The current default land cover dataset for the UM configuration used for the GCM is based on the International Geosphere Biosphere Project (IGBP).
The ESA-CCI dataset does not differentiate between grass with the C3 and C4 photosynthetic pathways; however, the UM requires the land cover ancillary file to separate the two grass types. In Stratton et al. (2018), the Still et al. (2003) dataset was used to separate C3 and C4 grass. These data have a lower spatial resolution of 1 degree and are based on an older AVHRR satellite-based product. Therefore, we used the Powell et al. (2012) dataset (hereafter, P12) as it provides C4 grass fraction at 10 km resolution for South America using updated crop type datasets and MODIS vegetation cover fraction products.
To merge the P12 data with the ESA-CCI grass fraction we applied the following procedure: if the P12 data contain C3 or C4 grass in a pixel from CCI with a non-zero grass fraction, then the ratio of C3 to C4 grass from P12 is applied to the total grass fraction (i.e., C3-grass + C4-grass) from the ESA-CCI dataset. However, if the corresponding P12 pixel does not contain any grass in a CCI grass pixel, then the mean C3:C4 ratio in the surrounding pixels from P12 is applied. To find the ratio in surrounding pixels, the algorithm searches 1 pixel away, then 2 pixels away etc. until a non-masked value is found. This method conserves the total grass fraction from the ESA-CCI data, and therefore does not adjust other land-cover tile fractions to accommodate changes in grass fraction. Adjusted C3 and C4 fractions are shown in Fig. 2.
Fig. 2
Land cover fractions for each surface type in CPRCM domain including adjusted C3 and C4 grass derived from ESA-CCI total grass fraction and C3 to C4 ratios from Powell et al. (2012)
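The merging of P12 ratios with the ESA-CCI grass fraction described above can be sketched as follows. This is an illustrative reading of the procedure, not the code used to build the ancillary files: it assumes both datasets have already been placed on the same grid, that "n pixels away" means a square ring around the target pixel, and that a P12 pixel with no grass counts as masked. All function names are hypothetical.

```python
def c3_share(p12_c3, p12_c4, row, col):
    """C3 share of P12 grass at one pixel, or None if P12 has no grass there."""
    total = p12_c3[row][col] + p12_c4[row][col]
    return p12_c3[row][col] / total if total > 0 else None


def neighbour_c3_share(p12_c3, p12_c4, row, col):
    """Mean C3 share over the nearest square ring containing any P12 grass."""
    n_rows, n_cols = len(p12_c3), len(p12_c3[0])
    for radius in range(1, max(n_rows, n_cols)):
        shares = []
        for r in range(row - radius, row + radius + 1):
            for c in range(col - radius, col + radius + 1):
                on_ring = max(abs(r - row), abs(c - col)) == radius
                if on_ring and 0 <= r < n_rows and 0 <= c < n_cols:
                    share = c3_share(p12_c3, p12_c4, r, c)
                    if share is not None:
                        shares.append(share)
        if shares:
            return sum(shares) / len(shares)
    return None  # no P12 grass anywhere on the grid


def split_grass(cci_grass, p12_c3, p12_c4):
    """Split ESA-CCI total grass into C3 and C4, conserving the total."""
    n_rows, n_cols = len(cci_grass), len(cci_grass[0])
    c3 = [[0.0] * n_cols for _ in range(n_rows)]
    c4 = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            grass = cci_grass[r][c]
            if grass <= 0:
                continue  # nothing to split in non-grass pixels
            share = c3_share(p12_c3, p12_c4, r, c)
            if share is None:
                share = neighbour_c3_share(p12_c3, p12_c4, r, c)
            if share is None:
                raise ValueError("P12 contains no grass to borrow a ratio from")
            c3[r][c] = grass * share
            c4[r][c] = grass - c3[r][c]  # remainder keeps the CCI total intact
    return c3, c4
```

Computing C4 as the remainder, rather than from a second multiplication, is one simple way to guarantee that the two output fractions sum exactly to the ESA-CCI grass fraction in every pixel.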
We use prescribed time-varying daily SSTs obtained from the Reynolds et al. (2007) data set at 0.25° resolution to be consistent with the GCM and in common with Stratton et al. (2018) and Kendon et al. (2019). This is a blended dataset combining information from satellites, ships and buoys. For the future climate simulation, we added the monthly climatological SST change between the present (1975–2005) and future (2085–2115) in HadGEM2-ES RCP 8.5 simulations to the daily Reynolds forcing, which is similar to the method described in Mizielinski et al. (2014). For consistency, the same procedure was applied to sea ice, although there is no sea ice within the model domain.
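The construction of the future SST forcing can be illustrated with a short sketch of the delta approach for a single grid point. It assumes that the monthly climatological change is applied as a constant offset to every day in that calendar month; whether the actual procedure interpolates between months is not stated here, so this is a simplification, and the function names are hypothetical.

```python
def monthly_delta(present_climatology, future_climatology):
    """Monthly climatological SST change (future minus present), 12 values."""
    if len(present_climatology) != 12 or len(future_climatology) != 12:
        raise ValueError("expected 12 monthly values in each climatology")
    return [f - p for p, f in zip(present_climatology, future_climatology)]


def future_daily_sst(daily_sst, months, delta):
    """Add the monthly change to each daily SST value.

    daily_sst: observed daily SSTs (K) at one grid point.
    months:    calendar month (1-12) of each daily value.
    delta:     output of monthly_delta().
    """
    return [sst + delta[month - 1] for sst, month in zip(daily_sst, months)]
```

This keeps the observed day-to-day variability of the present-day SSTs and shifts only the mean state towards the future climate.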
For the air temperature evaluation, we use two gridded observational datasets, both included in IPCC AR6, in order to provide an indication of observational uncertainty. These are: (1) CRU TS4 (Climate Research Unit gridded Time Series, Harris et al. 2020), which includes monthly weather station observations interpolated to a 0.5 × 0.5 degree grid using angular distance weighting, and (2) Berkeley-Earth (Rohde and Hausfather 2020), which reprocesses Global Historical Climatology Network (GHCN) monthly temperature data using a new framework that also allows short records from alternative sources to be incorporated alongside GHCN. It is interpolated using kriging to a 1 × 1 degree grid.
For the evaluation of the precipitation field, we use three merged satellite-gauge data products: (1) Tropical Rainfall Measuring Mission v7 (TRMM-3B42, Huffman et al. 2007), which covers the whole period (1998–2007) of present-day simulations with a 3-hourly temporal resolution and a spatial resolution of 0.25 degrees, (2) Integrated Multi-satellite Retrievals for Global Precipitation Measurement (GPM-IMERG, Huffman et al. 2019), which has a spatial resolution of 0.1 degrees and a temporal resolution of 30 min but starts in 2000, and (3) Climate Hazards Group InfraRed Precipitation with Station (CHIRPS, Funk et al. 2015) with a daily temporal resolution and a very high spatial resolution of 0.05 degrees. CHIRPS uses cold cloud duration observations which are calibrated using TRMM and blended with station data. Note that for gauge-sparse areas such as western Amazonia, these datasets are more reliant on satellite data than gauge data, which increases the uncertainty in this region. In addition, we used precipitation data from the Large-Scale Biosphere Atmosphere (LBA) experiment as used in Harper et al. (2021). These data are based on flux-tower observations from four sites in Amazonia (Fig. 1) covering a period of between 2 and 5 years depending on location (Table 2, Appendix). The observations have been quality-controlled and gap-filled using the procedure outlined in Harper et al. (2021).
For evapotranspiration (ET) evaluation, we use one reanalysis product, i.e. ERA5 (Hersbach et al. 2020), and the three monthly satellite-based datasets used in Baker et al. (2021). The first is the Global Land Evaporation Amsterdam Model (GLEAMv3.5, Martinez et al. 2016), which uses a set of algorithms based on the Priestley-Taylor framework to estimate evapotranspiration and its components with inputs from reanalysis (for meteorological data) and satellite observations of leaf area index. It has a spatial resolution of 0.25 degrees. The second is the Moderate Resolution Imaging Spectroradiometer (MODIS MOD16, Mu et al. 2011), which has a spatial resolution of 0.05 degrees and is based on the Penman–Monteith equation (Monteith 1965). The dataset uses albedo, fraction of absorbed photosynthetically active radiation (FPAR), land cover and leaf-area index (LAI) from MODIS, combined with daily meteorological reanalysis data from the NASA Global Modelling and Assimilation Office (GMAO v 4.0.0, 2004) to produce an ET estimate. The last dataset is the Process-based Land Surface ET/Heat Fluxes algorithm (P-LSH, Zhang et al. 2010), which is also based on the Penman–Monteith equation but uses different inputs to MODIS and GLEAM, though all are satellite-based. It has a spatial resolution of 0.08 degrees.
In this section we compare the annual mean precipitation and near surface air temperature from the different simulations (CPRCM, RCM and GCM) with the observational products for the evaluation period 1998–2007. The data have been regridded to the resolution of the coarsest dataset in each case, be that model or observations, using a conservative method.
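To make the comparison procedure concrete, the sketch below coarsens a fine-grid field to a coarser grid and computes a root mean squared error against a reference field. It is a simplified stand-in for conservative regridding: it assumes equal-area cells and an integer ratio between the two grids, whereas regridding on a latitude-longitude grid would weight by cell overlap area. The unweighted RMSE is likewise an assumption, as are the function names.

```python
import math


def block_mean(fine, factor):
    """Coarsen a 2-D field by averaging factor x factor blocks.

    With equal-area cells this preserves the domain mean, which is the
    defining property of a conservative method.
    """
    n_rows, n_cols = len(fine), len(fine[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid size must be a multiple of the coarsening factor")
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            block = [
                fine[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            ]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def rmse(model, reference):
    """Unweighted root mean squared error between two same-shape 2-D fields."""
    squared = [
        (m - o) ** 2
        for model_row, ref_row in zip(model, reference)
        for m, o in zip(model_row, ref_row)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

For example, a 4 × 4 model field would be reduced with `block_mean(field, 2)` before being compared with a 2 × 2 observational field using `rmse`.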
The CPRCM-PD shows a larger wet bias than CPRCM-ERA over western Amazonia (Fig. 3d and b), which is likely to originate from the driving data since there is a larger overall wet bias in the GCM (root mean squared error (RMSE) = 2.32 mm/day) relative to the RCM (RMSE = 1.93 mm/day). This can also be seen in Fig. 3g, which shows the impact of the LBCs from reanalysis data versus the GCM; CPRCM-ERA is drier than CPRCM-PD over most areas. Note that there is some uncertainty in the observational estimates if we compare the maps showing CHIRPS and TRMM (Fig. 3f), although the uncertainty is less than 1 mm/day in most areas and less than the model biases.
Fig. 3
Top row: Annual mean precipitation climatology (mm/day) (1998–2007) for TRMMv7, bias compared with CPRCM-ERA, RCM, CPRCM-PD and GCM including root mean squared error for the whole domain. Bottom row: difference between CHIRPS and TRMMv7, difference between CPRCM-ERA and CPRCM-PD (showing the impact of LBCs), CPRCM-ERA minus RCM and CPRCM-PD minus GCM (showing the impact of downscaling). Key regions from Fig. 1 are marked in black
Both CPRCM-ERA and CPRCM-PD have a larger dry bias on the northeast coast of South America compared to the RCM and GCM (Fig. 3b–d), which is also seen in preliminary results from similar simulations with the Weather Research and Forecasting (WRF) model (SAAG 2022) and in RCM simulations (e.g. Solman et al. 2013; Solman and Blázquez 2019). This can also be seen in Fig. 3h, i as the difference between CPRCMs and their driving models; both CPRCMs are drier along the northeast coast by approximately 4 mm/day. This could be linked to the limited area of the CPRCM domain, which may prevent full development of weather systems between the northern and eastern boundaries and the coast of South America, although further investigation would be required to confirm this hypothesis. RMSEs for individual regions (Appendix, Table 5) show decreased RMSE in the CPRCMs compared to their driving models for SAMZ and SEB (e.g. from 1.74 to 1.23 mm/day from GCM to CPRCM-PD in SEB); however, for NAMZ the dry bias in the northeast and wet bias over the Guyana Highlands in the CPRCMs contribute to a large increase in RMSE.
All models in Fig. 3 have a wet bias along the Andes. A wet bias in the Andes is also seen in the WRF simulations (SAAG 2022) and earlier RCM simulations (e.g. Fernandez et al. 2006; Solman et al. 2008; Alves and Marengo 2010; Solman et al. 2013; Falco et al. 2019). Beck et al. (2020) show that many gridded precipitation products underestimate precipitation over mountainous areas for a variety of reasons, including gauge undercatch, presence of snow and ice (in relation to satellite retrievals) and inadequate spatial resolution for capturing orographic precipitation and the associated circulation.
Compared to their driving models, CPRCMs show a reduction of the cold bias for daily maximum temperature (Tmax) (Fig. 4c–f) in Amazonia and southeast Brazil but an increase in the warm bias in subtropical South America and over the northeast coast. Regional average RMSEs (Appendix, Table 6) show that daily Tmax biases are generally worse in the CPRCMs except for SAMZ and SEB, where RMSE is substantially reduced in CPRCM-PD compared to the GCM. The warm bias in the northeast (of approximately 3 K) is likely to be related to the dry precipitation bias over the same area or to the representation of wind patterns and their role in moisture transport. Lucas-Picher et al. (2021) note that if the representation of precipitation in a CPRCM is improved, temperature biases can be reduced as a result. Cold biases persist in all models in regions of high topography. However, there is some observational uncertainty in Tmax shown by the difference between Berkeley-Earth and CRU, particularly in the central Andes where the quality of observations can be less reliable (Solman et al. 2013).
Fig. 4
Annual mean daily near surface air temperature, daily maximum (top row), daily minimum (middle row) and daily mean climatology (lower row) (K), for 1998 to 2007 in CRUTS4 and difference plots for Berkeley Earth, CPRCM-ERA, RCM, CPRCM-PD and GCM minus CRUTS4 including root mean squared error for the whole domain. Key regions from Fig. 1 are marked in black
The widespread warm bias in daily minimum temperature (Tmin), which can be as much as 4 K, is a striking feature across all models (Fig. 4i–l), suggesting that it is largely unaffected by the change from a parametrised to an explicit representation of deep convection. The pattern is not greatly affected by the choice of observational data or by the LBCs, though RMSE is lower in the CPRCMs (Appendix—Table 6). Overestimation of minimum temperatures has been linked to overestimation of wet day frequency (Solman et al. 2008); however, precipitation frequency in the CPRCM tends to be underestimated (Sect. 3.3). A possible explanation could be too much cloud cover at night, limiting outgoing longwave radiation and increasing minimum temperatures, though this would require further investigation. We note there is a reduced cold bias in Tmin in CPRCM-ERA and CPRCM-PD in the arid region close to the Pacific coast, compared to the RCM and GCM (Fig. 4i, k).
For mean near surface temperature (Tmean), the biases in the CPRCMs appear larger than in the GCM (Fig. 4o–r); however, RMSE values (Appendix—Table 6) show that the CPRCMs reduce Tmean biases in SEB compared to their driving models. There is some observational uncertainty in Tmean between Berkeley-Earth and CRUTS4, though the differences tend not to be in the regions with the greatest biases, such as the northeast and subtropical South America. The warm bias of approximately 3 or 4 K in CPRCM-ERA in the subtropics east of the Andes, also seen in Tmax, was shown by Solman (2016) to be systematic across many models and to increase at higher temperatures. Solman (2016) linked the bias to a poor representation of land surface processes. It may potentially be reduced by the use of a groundwater scheme as in Martinez et al. (2016) or Barlage et al. (2021).
Contrary to what would be expected of a hindcast compared with a GCM-driven simulation, CPRCM-ERA does not perform better than CPRCM-PD for temperature. RMSE values (Appendix—Table 6) are higher in CPRCM-ERA, which may be inherited from the higher RMSEs in the driving model. This may not have been the case if the CPRCM had not been nested in the RCM and CPRCM-ERA had been directly downscaled from reanalysis.
In line with earlier CPRCM studies (e.g. Birch et al. 2015; Prein et al. 2015; Stratton et al. 2018), we find that the CPRCM improves the representation of the diurnal cycle of rainfall (Fig. 5) in that the peak occurs several hours later (approx. 18–21 UTC) across many areas of South America. In the RCM, it occurs at around 12–15 UTC, whereas in TRMM it occurs at around 21 UTC (1600–1800 local time, depending on region). The earlier peak in the RCM with parametrised convection is caused by the scheme responding too quickly to instabilities and not having memory of instabilities from one timestep to the next (Kendon et al. 2012). In the CPRCM, instability can accumulate during the day and is then released later in the afternoon (Lucas-Picher et al. 2021), which leads to a much better simulation of the timing of the most intense convection when compared to the observations. In some areas, both CPRCM-ERA and the RCM are able to reproduce the peak at a similar time to TRMM when it occurs between 0 and 9 UTC, e.g. in the south of the domain east of the Andes and near the northeast coast. This may be because these rainfall peaks are generated by different processes and are therefore less affected by the representation of convection in the model.
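As a minimal sketch of how a peak-time map such as Fig. 5 could be derived, the following Python snippet composites 3-hourly precipitation by time of day and returns the UTC slot with the highest mean rate. The function name `diurnal_peak_hour`, the array layout and the synthetic data are illustrative assumptions and are not taken from the analysis code used for this study.

```python
import numpy as np


def diurnal_peak_hour(precip, hours_utc):
    """Return the UTC hour at which the mean diurnal cycle peaks.

    precip    : array (time, ...) of 3-hourly mean precipitation rates
    hours_utc : array (time,) giving the UTC hour label of each time step
    """
    precip = np.asarray(precip, dtype=float)
    hours_utc = np.asarray(hours_utc)
    slots = np.unique(hours_utc)  # e.g. 0, 3, ..., 21
    # Mean rainfall for each time-of-day slot gives the composite diurnal cycle
    cycle = np.stack([precip[hours_utc == h].mean(axis=0) for h in slots])
    # Hour label of the slot with the largest mean rate at each grid point
    return slots[np.argmax(cycle, axis=0)]


# Synthetic example (hypothetical data): 10 days of 3-hourly values at two
# grid points, one peaking at 15 UTC and one at 21 UTC.
hours = np.tile(np.arange(0, 24, 3), 10)
early = np.where(hours == 15, 2.0, 0.1)
late = np.where(hours == 21, 2.0, 0.1)
peaks = diurnal_peak_hour(np.column_stack([early, late]), hours)
print(peaks)
```

With 3-hourly data the peak time can only be resolved to the nearest 3-hour slot, which is why the timings in the text are quoted as ranges.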
Fig. 5
Time of day coinciding with peak in diurnal cycle of rainfall in TRMMv7, 25 km RCM and 4.5 km CPRCM-ERA (1998–2007) for Dec-Jan-Feb using 3-hourly data. SAMZ region used in Fig. 6 is marked in black
The precipitation peak around the middle of the day in the RCM is visible as regular horizontal green-yellow stripes in Fig. 6b. In Fig. 6, rainfall is averaged across a latitude band in southern Amazonia at the peak of the wet season. It shows that the CPRCM (Fig. 6c) produces more realistic spatial structures, such as squall lines that generally propagate from east to west with time. Arguably there are too many dry spells in the CPRCM (dark blue areas), and the RCM intensity tends to be too low compared with GPM-IMERG (Fig. 6a). The more frequent dry spells and smaller areas of precipitation in the CPRCM compared to the RCM are characteristic of the switch from parametrised to explicit convection (e.g. Stratton et al. 2018; Kendon et al. 2019; Berthou et al. 2020). Although this is a single month in the wet season, chosen as it is the first January for which GPM-IMERG is available, it is a good illustration of the differences in spatial structure of precipitation between the CPRCM and RCM. Moving southwards to other latitude bands and towards the subtropics (15–25 °S, 25–35 °S) (not shown), we find little difference in spatial structure between the RCM and CPRCM, which may be related to a dominance of larger-scale rainfall-generating processes in this region that are less affected by the model’s representation of convection.
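The latitude-band averaging behind a longitude-time (Hovmöller) plot such as Fig. 6 can be sketched as follows. This is an illustrative example only: the function name `hovmoller`, the cosine-latitude weighting and the synthetic westward-moving rain band are assumptions, not details of the processing used in this study.

```python
import numpy as np


def hovmoller(precip, lats, lat_min=-12.5, lat_max=-5.0):
    """Average precip (time, lat, lon) over a latitude band, giving (time, lon)."""
    precip = np.asarray(precip, dtype=float)
    lats = np.asarray(lats, dtype=float)
    band = (lats >= lat_min) & (lats <= lat_max)
    # Assumption: regular lat-lon grid, so grid-box area scales with cos(latitude)
    weights = np.cos(np.deg2rad(lats[band]))
    return np.average(precip[:, band, :], axis=1, weights=weights)


# Synthetic example (hypothetical data): a rain band that moves one grid
# column westward at each time step.
lats = np.arange(-15.0, 0.0, 2.5)
lons = np.arange(-70.0, -50.0, 2.5)
n_times = 6
field = np.zeros((n_times, lats.size, lons.size))
for t in range(n_times):
    field[t, :, lons.size - 1 - t] = 1.0

hov = hovmoller(field, lats)
print(hov.shape)
print(hov.argmax(axis=1))  # column index of the rain band at each time
```

In the resulting array, a westward-propagating feature appears as a diagonal band, whereas rainfall locked to a fixed time of day appears as horizontal stripes.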
Fig. 6
Hovmöller plot (longitude-time) for January 2001 comparing hourly rainfall from GPM-IMERG, RCM and CPRCM-ERA averaged across a latitude band from 12.5 to 5 °S (SAMZ region)
We have calculated mean precipitation frequency in CPRCM-ERA, RCM and TRMM, that is, the proportion of 3-hourly periods that are wet (> 0.1 mm/h), for DJF (peak wet season for southern Amazonia and SE Brazil). CPRCM-ERA underestimates frequency in northwestern Amazonia by around 0.1, but in the Andes and on the coast of SE Brazil there is a larger overestimation (Fig. 7c). By contrast, the RCM overestimates by over 0.2 in much of the domain, including the Andes (Fig. 7b). Other seasons show similar results in terms of the magnitude and patterns of over- and underestimation in the RCM and CPRCM-ERA, and LBA observations from Amazonia at 3-hourly resolution are in agreement with TRMM in terms of precipitation frequency (Table 8—Appendix). We find that the CPRCM has improved the frequency and intensity of precipitation compared with the RCM and relative to TRMM. The improved representation of precipitation frequency in CPRCM-ERA is consistent with other studies, e.g. Kendon et al. (2019), Berthou et al. (2019) and Lucas-Picher et al. (2021).
Fig. 7
Mean DJF frequency (1998–2007) of precipitation > 0.1 mm/h in a 3-hourly period in a TRMM, b RCM minus TRMM and c CPRCM-ERA minus TRMM, mean intensity of precipitation when > 0.1 mm/h in a 3-hourly period in d TRMM, e RCM minus TRMM and f CPRCM-ERA minus TRMM, mean intensity of 99th percentile of precipitation for each gridbox when > 0.1 mm/h in a 3-hourly period in g TRMM, h TRMM minus RCM and i TRMM minus CPRCM-ERA. All data are regridded to the same resolution as the TRMM data using an area-weighted conservative method. Locations of subregions used in subsequent analysis are marked by black boxes and LBA observations sites are marked with crosses
For precipitation intensity, we calculated the mean intensity for wet 3-hourly periods, i.e. periods with a mean precipitation rate of > 0.1 mm/h (Fig. 7d–f). We then calculated the 99th percentile intensity for the same wet 3-hourly periods (Fig. 7g–i). As for precipitation frequency, only DJF has been included here, as the results were similar for other seasons. This shows that, compared with TRMM, CPRCM-ERA overestimates mean intensity by up to 1 mm/h over Amazonia (Fig. 7f), but underestimates by a similar amount in the subtropics and northern coastal areas. The RCM, however, underestimates mean intensity in almost all areas, and the magnitude of underestimation is greater than in CPRCM-ERA in the subtropics and northern coasts. The results for the 99th percentile of wet 3-hourly mean intensity show a similar pattern to that of mean intensity. As found with frequency, the LBA observations at 3-hourly resolution broadly agree with TRMM in terms of mean intensity and 99th percentile intensity (Table 8—Appendix). However, the number of stations and length of observational record are limited, so these conclusions regarding the representativeness of TRMM in Amazonia are tentative. In CPRCM simulations for Africa (Kendon et al. 2019), there were clear improvements in the mean intensity and 99th percentile intensity of rainfall compared with TRMM at 3-hourly resolution. In the CPRCM-SA there appears to be less benefit over the tropical forests of Amazonia compared to over Africa, which is an area for further investigation. However, for the more populated southeast Brazil, biases are decreased, particularly for mean intensity.
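The three statistics used here (wet-period frequency, mean wet-period intensity and 99th percentile wet-period intensity) can be illustrated for a single grid box with the short Python sketch below. The 0.1 mm/h threshold is the one quoted in the text; the function name `wet_period_stats` and the example values are hypothetical.

```python
import numpy as np

WET_THRESHOLD = 0.1  # mm/h, wet-period threshold quoted in the text


def wet_period_stats(rate, threshold=WET_THRESHOLD):
    """Frequency, mean intensity and 99th percentile of wet periods.

    rate : 1-D array of 3-hourly mean precipitation rates (mm/h), one grid box
    """
    rate = np.asarray(rate, dtype=float)
    wet = rate[rate > threshold]
    frequency = wet.size / rate.size  # proportion of periods that are wet
    if wet.size == 0:
        return frequency, np.nan, np.nan
    # Intensity statistics are computed over wet periods only
    return frequency, wet.mean(), np.percentile(wet, 99)


# Hypothetical series of ten 3-hourly rates (mm/h)
rates = [0.0, 0.0, 0.05, 0.5, 1.5, 0.0, 4.0, 0.0, 0.0, 0.0]
frequency, mean_intensity, p99 = wet_period_stats(rates)
print(frequency, mean_intensity, p99)
```

Because the intensity statistics are conditional on a period being wet, a model can match the observed mean rainfall while still having compensating errors in frequency and intensity.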
Both the RCM and CPRCM-ERA underestimate mean intensity and 99th percentile intensity in the subtropics to the east of the Andes. This area benefitted from the addition of a groundwater scheme in Martinez et al. (2016), which increased latent heat flux and precipitation. This scheme includes a representation of a shallow aquifer that can interact with the soil column, allowing additional upward moisture transport during dry periods. Other modelling studies such as Christoffersen et al. (2014) and Fan & Miguez-Macho (2010) have highlighted the role of groundwater in maintaining ET during dry periods in parts of South America and noted that models that do not represent this process may not realistically simulate dry season ET. Furthermore, high-resolution CPRCMs have been shown to be more sensitive than lower resolution models to the inclusion of lateral flows and interactions with rivers (Barlage et al. 2021). Therefore, we suggest that a groundwater scheme may to some extent address biases that are found in the mean precipitation (Fig. 3), mean temperature (Fig. 4) and mean precipitation intensity in CPRCM-ERA.
3.4.1 3-hourly regional distributions
We have compared the rainfall intensity distributions using 3-hourly data from the GCM, RCM, CPRCM-ERA and CPRCM-PD with that of TRMM for southeast Brazil (Fig. 8) and Amazonia (Fig. 9). The Amazonia region used here is an aggregation of NAMZ and SAMZ (Fig. 1). The model data has been regridded to the same grid as TRMM (0.25 × 0.25 degree) and distributions are calculated using the Analysing Scales of Precipitation (ASoP) method (Klingaman et al. 2017), which is explained in detail in Berthou et al. (2020). The rainfall from each grid point in the specified region and 3-hourly time period in a given season over the 10-year simulation period is binned by intensity, with the bin rate indicated on the x-axis. As in Berthou et al. (2020), the bins are unequally distributed in logarithmic space (see Eq. 1 in Berthou et al. 2020) so there are similar numbers of events in each bin, making the resulting plot more readable. The y-axis measures the contribution to the mean rainfall rate from each of these bins, and the area under the curve is equal to the mean precipitation.
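The idea of plotting each bin's contribution to the mean rainfall can be sketched in Python as follows. This is a simplified illustration and not the ASoP implementation: the function name `intensity_contributions`, the plain log-spaced bin edges (which are not the bin definition of Eq. 1 in Berthou et al. 2020) and the synthetic gamma-distributed sample are all assumptions.

```python
import numpy as np


def intensity_contributions(rain, bin_edges):
    """Contribution of each intensity bin to the mean rainfall.

    rain      : 3-hourly accumulations (mm/3 h), all grid points and times pooled
    bin_edges : increasing bin edges; values outside the edges are not counted
    """
    rain = np.asarray(rain, dtype=float).ravel()
    # Rainfall total in each bin, divided by the number of samples (wet and
    # dry), so that the contributions sum to the mean rainfall.
    totals, _ = np.histogram(rain, bins=bin_edges, weights=rain)
    return totals / rain.size


# Hypothetical log-spaced edges (60 bins); an assumption for illustration only.
edges = np.logspace(np.log10(0.005), np.log10(500.0), 61)

# Synthetic rainfall sample, skewed towards light values
rng = np.random.default_rng(0)
sample = rng.gamma(shape=0.3, scale=5.0, size=50_000)

contrib = intensity_contributions(sample, edges)
# The two numbers are close; amounts below the first edge are not counted.
print(contrib.sum(), sample.mean())
```

Weighting each bin by the rainfall it contains, rather than by the number of events, is what allows the sum over bins to be read as the mean precipitation.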
Fig. 8
Seasonal distribution of 3-hourly precipitation 1998–2007 for the southeast Brazil region (SEB in Fig. 1, land only) in CPRCM-ERA, CPRCM-PD, RCM and GCM compared with observations from TRMMv7 (Huffman et al. 2007)
Fig. 9
As Fig. 8 but for Amazonia region (NAMZ + SAMZ in Fig. 1)
For southeast Brazil (Fig. 8), we can clearly see, first, that the seasonal cycle from wet (DJF) to dry (JJA) and transition seasons (MAM and SON) is well simulated in all models. Secondly, the distributions from the two CPRCMs are clearly a better match for the TRMM data compared with the GCM and RCM, at least in DJF, MAM and SON. In DJF there is a general tendency for the CPRCMs to overestimate the contribution from heavy rainfall (approx. 10–80 mm/3 h), as noted by Kendon et al. (2021). The RCM and GCM both show excess contribution from light rainfall (from 0.1 to 10 mm/3 h), especially in DJF, MAM and SON, which is consistent with the negative intensity bias in the RCM in Fig. 7. This is commonly reported in models with parametrised convection (e.g. Kendon et al. 2012; Solman and Blázquez 2019; Berthou et al. 2020; Lucas-Picher et al. 2021). There is little difference in intensity characteristics (shown by the similar distributions) between CPRCM-ERA and CPRCM-PD, but they differ more in amplitude, especially in JJA and SON, which suggests a greater influence from the LBCs in those seasons.
The intensity distributions for Amazonia (Fig. 9) show less agreement between TRMM and the CPRCMs than those for southeast Brazil. The CPRCMs tend to underestimate the contribution from rainfall intensities between 1 and 10 mm/3 h and overestimate the contribution from heavy rainfall, and the opposite is true for the RCM and GCM. This has been seen with other convection-permitting models (Prein et al. 2015; Kendon et al. 2021 and references therein) and may indicate that the spatial resolution needs to be higher to better resolve convection in this region.
3.4.2 Hourly local distributions
It has been suggested that gridded observations often underestimate rainfall intensity (Freitas et al. 2020; Lucas-Picher et al. 2021). Therefore, we plotted the intensity distributions of hourly data for Amazonia from the RCM and CPRCM-ERA alongside GPM-IMERG data, available at 30-min temporal resolution, and point observations from LBA. We chose locations with the longest time series of data for southern and northern Amazonia (Figs. 10, 11). GPM-IMERG data were aggregated to hourly resolution to enable comparison with hourly model output and point observations. The exact periods covered by the point data are shown in Table 2—Appendix. We compared the point observations with: the nearest point from GPM-IMERG at its native resolution (approx. 10 km); the nearest point and surrounding nine points from GPM-IMERG after regridding to RCM resolution (25 km); the nearest point from CPRCM-ERA at its native resolution (4.5 km), with and without the surrounding points; the nearest point and surrounding points from CPRCM-ERA after regridding to 25 km; and the nearest point and surrounding points from the RCM at native resolution. All rainfall distributions in Figs. 10 and 11 are smoothed using a 25-point moving window because, without smoothing, distributions for a single point or a small number of points are very noisy, which obscures the shape of the distribution. Smoothing was not necessary for Figs. 8 and 9 as the data are derived from a large number of points so that the spatial variability is averaged out.
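A 25-point moving-window smoother of the kind described above can be sketched in Python as follows. The text does not specify how the window is centred or how the ends of the distribution are handled, so the centred window and the shrinking-window treatment of the edges used here are assumptions, as is the function name `moving_average`.

```python
import numpy as np


def moving_average(values, window=25):
    """Centred moving average; near the ends only the available points are used."""
    values = np.asarray(values, dtype=float)
    if values.size < window:
        raise ValueError("series must be at least as long as the window")
    kernel = np.ones(window)
    # Sum of the values inside the window at each position
    summed = np.convolve(values, kernel, mode="same")
    # Number of real data points inside the window (fewer near the ends)
    counts = np.convolve(np.ones_like(values), kernel, mode="same")
    return summed / counts


# Hypothetical noisy distribution: a smooth bump plus random noise
rng = np.random.default_rng(1)
x = np.linspace(-3.0, 3.0, 200)
noisy = np.exp(-x**2) + rng.normal(scale=0.1, size=x.size)
smooth = moving_average(noisy)
print(noisy.std(), smooth.std())
```

Dividing by the number of points actually inside the window avoids the artificial drop towards zero at the ends that a zero-padded average would produce.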
Fig. 10
Seasonal distribution of hourly rainfall (2001–2005; see Table 2—Appendix for exact dates) for LBA-K77 site in northern Amazonia (see Fig. 1) compared with CPRCM-ERA (nearest point and nearest 9 points at native resolution, and nearest 9 points after regridding to RCM resolution), the nearest point from GPM-IMERG at native resolution and the nearest 9 points after regridding to RCM resolution
Fig. 11
As Fig. 10 but for LBA-RJA site shown in Fig. 1 (2000–2002; see Table 2—Appendix for exact dates)
For the LBA-K77 site in northern Amazonia (Fig. 10) there is a large difference between GPM-IMERG (nearest point and nearest nine points at RCM resolution) and the point observations. We find that CPRCM-ERA at native resolution (blue dashed line) and regridded to the RCM resolution (solid blue line) both show a greater contribution from heavy rainfall (> 50 mm/h) and a smaller contribution from lighter rainfall (< 50 mm/h) in all seasons compared to GPM-IMERG (yellow lines). The point observations (black dashed lines) are most similar in distribution to the CPRCM regridded to the RCM resolution, especially in MAM and SON. CPRCM-ERA at native resolution (dashed and dotted blue lines) has a distribution that is shifted to the right, indicating more heavy and less light rainfall, because extreme values are not averaged out across a larger area.
For the LBA-RJA site in southern Amazonia (Fig. 11), we again see that there are large differences between point observations and GPM-IMERG data. CPRCM-ERA (regridded to RCM) is more similar to point observations than the GPM-IMERG data, which has an intensity distribution that is intermediate between the RCM and CPRCM-ERA. However, in contrast with the LBA-K77 site, we find that at higher intensities (around 100 mm/h) the point data falls between CPRCM-ERA at native resolution and the CPRCM-ERA at RCM resolution. Including point data in the analysis has shown that the CPRCM should not only be evaluated against a coarser resolution gridded dataset that would not be expected to capture the more extreme events; point data with sub-daily and ideally hourly temporal resolution should be included where possible as it can capture the more localised, higher intensity events.
An effect of the shift in rainfall intensity distribution to greater contributions from higher intensities when changing from convection-parametrised to convection-permitting models (Sect. 3.4) is to alter the partitioning of ET between canopy (Ec) and soil evaporation (Es) (Folwell et al. 2022; Halladay et al. in review). The heavier rainfall is associated with lower total ET and can have an impact on other parts of the hydrological cycle, e.g. runoff (Folwell et al. 2022). In the UM, soil evaporation consists of transpiration and bare soil evaporation, which both source moisture from soil, whereas canopy evaporation is the evaporation of water droplets from the surface of vegetation. Canopy evaporation is controlled by the amount of water in the canopy store (canopy water), which is mainly replenished by rainfall. However, heavy rainfall will tend to quickly exceed the canopy storage capacity and reach the ground as throughfall, so it is no longer available for canopy evaporation. So, even if the mean annual rainfall in two models is similar, the amount reaching the ground versus the amount stored in vegetation will vary according to the intensity distribution. This is reflected in the amount of Ec compared to Es and canopy water content in CPRCM-ERA and RCM (Fig. 12).
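The sensitivity of canopy evaporation to rainfall intensity can be illustrated with a toy single-bucket canopy store. The Python sketch below is purely conceptual and is not the canopy scheme used in the UM: the function name `canopy_partition`, the storage capacity, the evaporative demand and the two rainfall series are all hypothetical values chosen for illustration.

```python
def canopy_partition(rain_mm, capacity_mm=0.5, evap_mm_per_step=0.2):
    """Toy canopy bucket: returns (canopy evaporation, throughfall) in mm.

    rain_mm          : rainfall in each time step (mm)
    capacity_mm      : hypothetical canopy storage capacity (mm)
    evap_mm_per_step : hypothetical evaporative demand per time step (mm)
    """
    store = 0.0
    canopy_evap = 0.0
    throughfall = 0.0
    for rain in rain_mm:
        store += rain
        if store > capacity_mm:  # excess water reaches the ground
            throughfall += store - capacity_mm
            store = capacity_mm
        evap = min(store, evap_mm_per_step)  # cannot evaporate more than is stored
        store -= evap
        canopy_evap += evap
    return canopy_evap, throughfall


# Two hypothetical series with the same total rainfall (12 mm over 24 steps):
# steady light rain versus a single heavy burst.
drizzle = [0.5] * 24
burst = [12.0] + [0.0] * 23

ec_drizzle, tf_drizzle = canopy_partition(drizzle)
ec_burst, tf_burst = canopy_partition(burst)
print(ec_drizzle, tf_drizzle)
print(ec_burst, tf_burst)
```

In this toy setting the burst fills the store once and sends almost all of the rain to the ground, whereas the drizzle refills the store at every step, giving much more canopy evaporation for the same total rainfall.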
Fig. 12
Annual mean ET in CPRCM-ERA hindcast and RCM (top row), soil evaporation (second row), canopy evaporation (third row), canopy water content (fourth row) and soil evaporation to ET ratio, compared with observational estimates from GLEAM, ERA5, MODIS and Priestley-Taylor datasets where relevant variables are available
There is a question as to whether the lower ET and Ec values in CPRCMs are more realistic, as there is considerable uncertainty in ET between the various gridded datasets that are available (Fig. 12; Sorensen and Ruscica 2018; Baker et al. 2021). The same is true for the different estimates of Es, Ec and Es/ET which could be considered to agree with either the CPRCM or the RCM.
The differences in partitioning also affect the soil evaporation/ET ratio, which in turn affects the soil-atmosphere coupling. Lower Es/ET ratios, as in the RCM (Fig. 12) (i.e. higher canopy evaporation), mean that more evaporation comes from the smaller canopy store, which is quickly filled and emptied, and less comes from the larger soil moisture store, which fluctuates on longer timescales, as described in Dong et al. (2022). A higher Es/ET ratio, as in the CPRCM, implies greater soil moisture-atmosphere coupling, which affects other components of the hydrological cycle. For example, in convection-permitting simulations for Africa, Folwell et al. (2022) showed that if more water reaches the soil moisture store, ET may be maintained for longer through periods with low rainfall.