Enhancing smallholder wheat yield prediction through sensor fusion and phenology with machine learning and deep learning methods
Field-scale yield prediction methods that use remote sensing are important to many global projects; however, existing methods have several limitations. In particular, the characteristics of smallholder systems pose a unique challenge to the development of reliable prediction methods. Therefore, this study develops a fast and reproducible approach to wheat yield prediction that combines predictors derived from optical (Sentinel-2) and radar (Sentinel-1) sensors, using a diverse set of machine learning and deep learning methods in a small-dataset setting. The study is set in the wheat belt region of Ethiopia and evaluates forty-two predictors representing the major vegetation index categories of green, water, chlorophyll, and dry biomass, as well as VH-polarization SAR indices. Field-collected agronomic data from 165 farm fields are used for training and validation. The results show that an automated machine learning (AutoML) approach combined with a generalized linear model (GLM) outperformed the other methods. AutoML, which reduces training time, identified ten influential predictors. For the combined approach, the mean RMSE of wheat yield on the test dataset ranged from 0.84 to 0.98 ton/ha (99% confidence interval) using these ten predictors. The correlation coefficient between estimated and measured yield was as high as 0.69, and the approach was less sensitive to the small datasets used for model training and validation. A deep neural network with three hidden layers, trained on the same ten influential predictors, was the second-best model. For this model, the mean RMSE of wheat yield on the test dataset was between 1.31 and 1.36 ton/ha (99% confidence interval). This model used 55 neurons, with values of 0.1, 0.5, and 1 × 10⁻⁴ for the hidden dropout ratio, input dropout ratio, and L2 regularization, respectively. The approaches implemented in this study are fast and reproducible, and are useful for predicting yield at scale.
These approaches could be adapted to predict grain yields of other cereal crops grown in similar smallholder production systems elsewhere.
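
The two evaluation metrics reported above, RMSE and the correlation coefficient between estimated and measured yield, can be illustrated with a minimal sketch. The function names `rmse` and `pearson_r` and the sample yield values are hypothetical and chosen only for illustration; they are not the study's code or data, and the correlation is assumed here to be Pearson's r.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted yields."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(measured, predicted):
    """Pearson correlation coefficient (assumed form of the reported correlation)."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_m * var_p)


if __name__ == "__main__":
    # Hypothetical field yields in ton/ha, invented for illustration only.
    measured_yield = [2.1, 3.4, 2.8, 4.0, 3.1]
    predicted_yield = [2.5, 3.0, 3.1, 3.6, 2.9]
    print(f"RMSE: {rmse(measured_yield, predicted_yield):.2f} ton/ha")
    print(f"r:    {pearson_r(measured_yield, predicted_yield):.2f}")
```

Reporting both metrics is useful because RMSE expresses the typical error in yield units (ton/ha), while the correlation coefficient indicates how well the ranking of fields by yield is preserved.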