Crop yield estimation
UCL Geography’s Sentinels of Wheat project combines field surveys, satellite data, and advanced modelling to monitor crop yields in China’s North China Plain, supporting food security research.
Collaborators
This work builds on core support from NERC NCEO, ESA and the EU framework 2020 MULTIPLY programme, and is directly funded by the Newton Fund through STFC and NSFC. The work arises from a collaboration between UCL/NCEO, Assimila, CAAS, CAU, BNU and Peking University.
Earth Observation data
We have designed the monitoring system to be robust to sampling opportunities. This means that it can make use of any satellite data at suitable wavelengths and spatial resolutions. This includes optical data from the US Landsat missions and data from the Chinese Gaofen satellites, but the backbone is provided by data from the operational EU/ESA Sentinel-2 satellites. These are part of the EU-funded Copernicus programme, and use multiple satellites to provide global observations at better than weekly frequency and at a spatial resolution of 10 m or coarser. Such optical data (i.e. from around 400-2500 nm wavelength) are ideal for agricultural monitoring over much of the world, including both large farms and smallholders.
Radiative transfer models
Optical sensors on these satellites do not directly measure information that tells us about crop state. Instead, they measure sunlight reflected from the Earth’s surface and transmitted through the atmosphere, except when clouds ‘get in the way’, in which case they measure scattering from the cloud rather than from the land surface.
To monitor the land surface, we must clear imagery of artefacts such as clouds, cloud shadows, dead pixels, etc., and then try to estimate the land surface reflectance from our measurements of ‘top of atmosphere’ radiance. We then want to make sure that we are looking at the image pixels we are interested in (crop pixels, in this case).
We next wish to interpret the land surface reflectance in terms of the biophysical (structural and biochemical) properties that control crop reflectance. These can be broadly characterised as the amount of vegetation (given as the Leaf Area Index, LAI), the leaf properties (typically chlorophyll, dry matter and water content) and the soil properties.
To deal with atmospheric effects and interpret the surface reflectance, we can use specialised radiative transfer models that tell us how varying the properties leads to variation in spectral reflectance. That is tricky enough, but harder still, in many ways, is ‘inverting’ such a model to give the mapping from spectral reflectance back to, e.g., vegetation properties. In recent years, fast approaches that estimate this inverse mapping have been developed using machine learning.
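The idea of ‘inverting’ a model can be illustrated with a deliberately simple sketch. The forward model below is a toy two-band canopy reflectance model (soil and dense-canopy reflectance mixed by an exponential gap fraction), not one of the radiative transfer models used in the project, and the inversion is a brute-force lookup over candidate LAI values rather than the machine learning approach mentioned above. The band values, the extinction coefficient `K` and the function names are all assumptions made for illustration.

```python
import math

# Hypothetical end-members per band: (bare soil reflectance, dense canopy reflectance)
BANDS = {"red": (0.20, 0.03), "nir": (0.25, 0.50)}
K = 0.5  # assumed extinction coefficient of the toy canopy


def forward(lai):
    """Toy forward model: map LAI to reflectance in each band."""
    gap = math.exp(-K * lai)  # fraction of soil still visible through the canopy
    return {band: soil * gap + veg * (1.0 - gap)
            for band, (soil, veg) in BANDS.items()}


def invert(observed, lai_grid=None):
    """Lookup-style inversion: return the candidate LAI whose simulated
    reflectance is closest (least squares) to the observed reflectance."""
    if lai_grid is None:
        lai_grid = [i * 0.05 for i in range(161)]  # candidate LAI from 0 to 8

    def cost(lai):
        simulated = forward(lai)
        return sum((simulated[b] - observed[b]) ** 2 for b in BANDS)

    return min(lai_grid, key=cost)


# Synthetic check: simulate a canopy with LAI = 3, then recover it.
observed = forward(3.0)
estimate = invert(observed)
```

With real data the observation would carry noise and the model would have many more parameters, so the inversion would return a range of plausible values rather than a single exact match.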
Microwave data
Optical data sources such as those above allow estimates of crop biophysical state to be inferred from the spectral reflectance, as we have seen. But clouds often get in the way of seeing the land surface from space, so we also make use of observations from the microwave region, here Sentinel-1 (S-1) data. The C-band instrument on S-1 sends down a radar pulse and measures how much radiation is scattered back to the sensor. Special processing synthesises a long antenna, allowing high-resolution observations to be made. The backscatter data are sensitive to vegetation amount, moisture and soil properties, but not to (most) cloud. A backscatter ratio from two polarisations is found to provide quite a stable signal for tracking vegetation amount. So, even though the data are noisier and of lower information content than optical data, they fulfil the vital role of observing the surface in the presence of cloud.
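As a rough sketch of how a polarisation ratio might be computed, the snippet below assumes the two channels are VH and VV backscatter in linear power units; the text above does not say which pair is used, and the sample values are invented. A ratio in linear units becomes a simple difference once both channels are expressed in decibels.

```python
import math


def to_db(linear_power):
    """Convert linear backscatter power to decibels."""
    return 10.0 * math.log10(linear_power)


def cross_ratio_db(vh_linear, vv_linear):
    """VH/VV ratio in dB (a ratio in linear units is a difference in dB)."""
    return to_db(vh_linear) - to_db(vv_linear)


# Hypothetical (VH, VV) backscatter for one pixel on three dates.
series = [(0.010, 0.100), (0.020, 0.100), (0.030, 0.090)]
ratios = [cross_ratio_db(vh, vv) for vh, vv in series]
```

Because both channels respond in a similar way to some disturbances, such as changes in soil moisture, taking the ratio can cancel part of that variation, which is one plausible reason for the stability noted above.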
Crop models
Mechanistic models of crop development, such as WOFOST, allow us to link relatively simple parameterisations to crop growth, phenology and yield, driven by weather data.
If we had a well-calibrated model and weather data, expressing the conditions local to a particular part of a particular field, we might expect the model to perform well, and allow for good tracking of crop status and any stresses, and good predictions of crop yield. Model calibration, however, requires quite a large set of agronomic data, so these models are only ever calibrated in a broad sense, for a particular crop with a particular set of practices.
A regionally-calibrated agronomic model with somewhat localised weather data is of great use in management and planning for farmers, regional and national authorities, as well as insurers, etc. But it doesn’t directly relate to what is happening in a particular area of a particular field. Even though they will be tied to the weather in the year simulated and so show the right broad seasonal effects, different local conditions could give rise to a range of different scenarios of LAI development. This is illustrated in the figure below, which shows an ensemble of plausible LAI trajectories, each of which corresponds to a slightly different set of model parameters and/or weather conditions that are likely representative of those found on the ground.
Most often in the past, crop model calibration has been used to produce a single set of model parameters, i.e. the parameters that ‘best’ represent the range of agronomy data used in the calibration (typically with variations over space and time). That concept of calibration tends to ignore the fact that, taking into account the uncertainties in the data and model, we should more properly represent the calibration result by a set of statistical distributions of model parameters: some representation of their probability distribution functions (PDFs). By taking uncertainty into account in this way, we can provide better calibrations of the crop models, which in turn gives us the ability to estimate output uncertainty.
Data Assimilation
We can provide some improvement on using regional models, with associated calibrated PDFs, by using coarse spatial resolution data from Earth Observation, and we have learned how to use data assimilation techniques to combine model and observational information. But the agricultural landscape in many countries varies at a higher resolution than this, so complicated ‘scaling effects’ must be taken into account in trying to use, e.g. 1 km observations over fields that are smaller than this. The advantage of the coarse resolution data has been that it has a high temporal frequency, typically giving observations every day or so.
We have used this combination of data and models to improve regional yield estimates in China by providing information through the Government CHARMS system. In addition, when we have only a partial set of observations and some predictions of the likely weather, we can make predictions of crop yield while the crop is still in the ground. As time advances and we get more information, we can refine and improve these predictions. We have also provided this sort of information for the regional government in China.
Thanks to Copernicus, we now have satellite observations at the right spatial resolution and temporal frequency for agricultural monitoring.
The figure below illustrates how the regionally-calibrated crop models, with somewhat coarse resolution weather data, provide a set of probable crop states (LAI in this case). The satellite data then provide spatially-localised ‘measurements’ of the actual crop state (LAI here). The combination of the data and model information then refines the model calibration and gives spatially localised estimates of yield, along with a probability distribution function (PDF) expressing the uncertainty of the yield estimate. The data-model ‘merging’ is done using data assimilation.
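One simple way to picture this merging is to weight each member of a model ensemble by how well it agrees with the observations. The sketch below is an assumption-laden toy, not the project's assimilation scheme: a logistic curve stands in for a crop model such as WOFOST, yield is taken to be a hypothetical linear function of peak LAI, and the ‘observations’ are synthetic and noise-free. All function names and numbers are invented for illustration.

```python
import math
import random


def lai_curve(day, peak_lai, midpoint=100.0, rate=0.08):
    """Toy logistic green-up curve standing in for a crop model."""
    return peak_lai / (1.0 + math.exp(-rate * (day - midpoint)))


def toy_yield(peak_lai):
    """Hypothetical linear link from peak LAI to yield (kg/ha)."""
    return 1200.0 * peak_lai


def likelihood_weights(peaks, observations, obs_sigma):
    """Weight each ensemble member by the Gaussian likelihood of the
    LAI observations given that member's trajectory."""
    weights = []
    for peak in peaks:
        log_like = sum(-0.5 * ((lai_curve(day, peak) - lai) / obs_sigma) ** 2
                       for day, lai in observations)
        weights.append(math.exp(log_like))
    return weights


def weighted_mean_std(values, weights):
    """Mean and standard deviation of values under the given weights."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)


# Prior ensemble: plausible peak LAI values from a broad regional calibration.
rng = random.Random(0)
peaks = [rng.uniform(2.0, 7.0) for _ in range(500)]
yields = [toy_yield(p) for p in peaks]

# Synthetic LAI 'observations' on three days for a field whose peak LAI is 5.
observations = [(day, lai_curve(day, 5.0)) for day in (80, 100, 120)]
weights = likelihood_weights(peaks, observations, obs_sigma=0.3)

prior_mean, prior_std = weighted_mean_std(yields, [1.0] * len(yields))
post_mean, post_std = weighted_mean_std(yields, weights)
```

In this toy setting the weighted (posterior) yield distribution is centred near the value implied by the observed field and is much narrower than the unweighted (prior) ensemble, which mirrors the narrowing of the yield PDF described above.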
Results
The use of the system in forecasting is shown below. The left panel shows plausible LAI trajectories over time. The red dots show Sentinel-2 LAI estimates (with uncertainty bars). The right panel shows the PDF of the yield. Early in the season, the yield could take on a wide range of values. As we get more observations, the ensemble of possible states shrinks, and the uncertainty in yield prediction also shrinks. This would be further improved if observed weather were used to replace expected weather. Around a month before harvest, the yield estimate is quite stable, but even before then, we are limiting the range of expected yield considerably.
There are various outputs of the system, but the one of most general interest to users is a spatial map of yield (at 10 m or so spatial resolution).
We can visualise this at a range of spatial scales to show the level of detail obtained. The images below show the estimated yield (kg/ha) of winter wheat in 2017 over Hengshui in the North China Plain. At the broad scale, we see a North-West to South-East gradient in crop productivity, driven by weather and soil. This regional information is needed mainly for reporting and planning.
As we zoom in, we can see the full resolution of the Sentinel data coming into play. We can see individual fields, some performing better than others due to very local conditions and farmer decisions on variety, fertiliser inputs, irrigation, etc. We can also see within-field variations in these data, showing areas of higher or lower productivity. This is the sort of detailed information needed for local management and insurance.
We have developed a capability to produce similar data in almost real-time using our models on the Google Earth Engine.