Combining legacy data from heterogeneous crop trials to identify genotype by environment interactions using model-based recursive partitioning
Crop variety trials generate important insights into the environmental adaptation of varieties, but only if varieties are tested in a wide range of environments that captures the complexity of genotype by environment interactions. Given the substantial costs of collecting trial data, agricultural science needs to maximize the insights extracted from existing data. One way to do so is to combine data from different trials performed in different environments using a data synthesis approach. Analyzing aggregated data from different trials can be challenging because the datasets are often heterogeneous. Previous research has shown that ranking-based methods can deal with heterogeneous data from different trials to gain insights into the average performance of genotypes, but not into their responses to different environmental conditions. We show that such insights can be obtained from heterogeneous legacy field trial data by means of model-based recursive partitioning, using climatic covariates from open access databases. We applied this strategy to analyze the reaction of different banana cultivars to black leaf streak disease across several environments. This data-driven approach allowed us to integrate heterogeneous datasets that differ in measurement scales, experimental design, and testing environments. Our preliminary results indicate that cultivar reaction to black leaf streak disease is driven by both genotypic and climatic factors. The main agroclimatic variables identified by our model are the diurnal temperature range (DTR) and the maximum length of consecutive days with rain >= 1 mm (MLWS). We show the potential of this method, which allows cumulative insights into genotype by environment interactions to be gained as more trial data become available.
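As an illustration of the two agroclimatic covariates named above, the following minimal Python sketch computes them from daily weather series. It is not the authors' code. The function names are hypothetical, and DTR is assumed here to be the mean daily difference between maximum and minimum temperature over the period of interest. MLWS follows the definition in the abstract: the longest run of consecutive days with rain >= 1 mm.

```python
from typing import Sequence


def diurnal_temperature_range(tmax: Sequence[float], tmin: Sequence[float]) -> float:
    """Mean of daily (tmax - tmin); averaging over the period is an assumption."""
    if len(tmax) != len(tmin) or len(tmax) == 0:
        raise ValueError("tmax and tmin must be non-empty and of equal length")
    return sum(hi - lo for hi, lo in zip(tmax, tmin)) / len(tmax)


def max_wet_spell(rain: Sequence[float], threshold: float = 1.0) -> int:
    """Longest run of consecutive days with rain >= threshold (in mm)."""
    longest = 0
    current = 0
    for amount in rain:
        if amount >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a dry day ends the current wet spell
    return longest


# Made-up daily values, for illustration only
tmax = [30.0, 32.0, 31.0]
tmin = [20.0, 21.0, 23.0]
rain = [0.0, 1.0, 2.5, 0.4, 3.0, 5.0, 1.0, 0.0]

dtr = diurnal_temperature_range(tmax, tmin)
mlws = max_wet_spell(rain)
```

Indices of this kind, computed per trial environment, would serve as the candidate splitting covariates in the recursive partitioning step.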
Brown, David; Carpentier, Sebastien; de Bruin, Sytze; de Sousa, Kauê; Machida, Lewis; van Etten, Jacob.