The CGIAR Platform for Big Data in Agriculture (BIG DATA), in collaboration with data and information specialists, researchers, and others across all CGIAR Centers, built strong momentum in 2019 toward open and FAIR (Findable, Accessible, Interoperable and Reusable) data in a variety of ways. While most activities to enable the generation and management of open and FAIR research assets are the purview of the Platform’s Organize Module, promotion of and support for the implementation of Organize outputs extend across all three Platform modules.
In 2019, the Big Data Platform further enhanced the Global Agricultural Research Data and Innovation Network (GARDIAN) ecosystem and knowledge base, allowing users to find, visualize and map data assets generated by CGIAR and others, including the United States Agency for International Development (USAID), the Department for International Development (DFID), the United States Department of Agriculture (USDA) and the World Bank.
GARDIAN serves as a one-stop shop for CGIAR publications and data, which can be fed through pipelines now in development to create data products. It also supplies information to other CGIAR sites and dashboards; for example, CGIAR’s Managing Agricultural Research for Learning and Outcomes (MARLO) reporting system uses it to validate and augment reporting on data assets.
Other GARDIAN highlights in 2019 include:
- The development of new GARDIAN capabilities to flag data containing personally identifiable information, minimizing risk to Centers and vulnerable individuals, and to map and spatially query production estimates for 30+ crops and a seven-terabyte climate dataset.
- The release of a “Collaborative GARDIAN (CG) Labs” prototype, with tools and services that enable researchers to collaborate in finding and sharing GARDIAN (or other) data securely and to analyze it by sharing and using R- and Python-based scripts and other approaches.
- The enhancement, via webinars and several data science workshops at Centers, of researcher capacity to use CG Labs along with the open and FAIR data, tools and shared services offered by the Platform. To further build CGIAR capacity to create and manage open, FAIR and responsible data assets, a five-module course providing guidance on best practices was developed in 2019. This course will be used for interactive learning in 2020.
- The provision of a range of tools to support open, FAIR and ethical research outputs. These included semantic data standards, which were collaboratively developed by all Centers with leadership from the Alliance of Bioversity International and the International Center for Tropical Agriculture (CIAT), and v.1.0 of the Agronomy Field Information Management System (AgroFIMS). AgroFIMS is a product of the Organize Module, working together with the International Potato Center (CIP) and the Alliance; it employs these standards to generate FAIR data at the point of collection.
Header photo: Chadrack Kafuti performs dendrometer analysis in Yangambi, Democratic Republic of the Congo. Photo by A. Fassio/CIFOR.