A Shared Data Ecosystem

This post was originally published on the main Persephone Habitat and Soil Management site on December 27th 2016. It was written as a response to GODAN’s (Global Open Data for Agriculture and Nutrition) discussion document ‘A Global Data Ecosystem for Agriculture and Food’. It was, and I believe still is, the only response to the document, and the advice and recommendations within it have similarly not been pursued by the funding body. It is reproduced and updated here as it has relevance to the aims and objectives of this site and Nepal’s agricultural development.

A Global Data Ecosystem for Agriculture and Food

The executive summary of GODAN’s recent discussion document ‘A Global Data Ecosystem for Agriculture and Food’ calls for:

..a common data ecosystem, produced and used by diverse stake-holders, from smallholders to multinational conglomerates, a shared global data space..

The report identified stakeholder engagement, provenance in data sourcing and handling, sharing, and collaborative frameworks as key components in developing a global data ecosystem.

Stakeholder Engagement and Data Integrity

  • However, within the agricultural sector “many groups might not have obvious motivation to participate in data sharing and use…”
  • Thus “..in order to get trust-worthy data, there has to be a direct reward to the data supplier.”
  • The authors further state that “a large part of the motivation for data sharing has to do with how widely it will be shared, with whom and under what conditions.”

There is, justified or otherwise, suspicion that data may be misappropriated to the provider’s disadvantage or provide disproportionate advantage to others. The perceived risk of negative unforeseen consequences can outweigh any potential benefits of sharing data, particularly when those benefits cannot be so readily quantified or realized in the short term.

Stakeholders may develop a ‘big brother’ mentality where they respond by withholding data or deliberately providing inaccurate data in the belief they are better served. This problem is amplified in the provenance of agricultural products, which “undergo a chain of transformations and pass through many hands on their way to the final consumer”. One drop in the veracity of data at any point in the chain potentially undermines all the data in that chain. Sadly, these issues are not confined to small farmers and supply chain operators; they are just as prevalent among, and as strongly held by, many of the big data holders such as trans-national corporations, governments and academic institutions.

Informed Consent

Whilst the integrity of the source and the veracity of the data are important factors in building a global data ecosystem, the authors further identified ‘documentation, support and interaction’ as key to fostering trust. Data providers and users need to interact so as to serve each other’s needs better and ensure that stakeholders feel included, not just sampled. Stakeholders need to be confident that there are no negative consequences or disproportionate benefits from sharing data with the whole ecosystem.

Sharing Frameworks

Where the data is held, who maintains it, its veracity, accessibility and availability to the whole ecosystem, as well as who pays to deliver those services, are issues that also need to be addressed. A global data ecosystem cannot rely on single large repositories to act as data silos or on individual data providers to maintain data crucial for network function. Data needs to be distributed and maintained across the system to prevent bottlenecks and failure points. The concept of the ADS (application database storage) network, which exploits the distributed network concept, could potentially offer resolutions to many, if not all, of these issues.

Data Conformity and Convention

Whilst stakeholders need an environment that is transparent, robust and secure, the data, as does all the documentation and support in that environment, needs to conform to certain conventions. The ‘five star open data maturity model (available, structured, non-proprietary format, referenceable [Sic] and linked)’ lays out a basic checklist. However, these properties themselves need to further conform to taxonomies and naming conventions (controlled vocabularies) that are interdisciplinary and make it easy to relate data from different sources. These conventions must themselves be explained in, and applied to, any documentation and support.
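The checklist quoted above can be sketched as a simple scoring function. This is only an illustration: the field names mirror the five properties listed in the text, while the `DatasetProfile` class, the `star_rating` function and the choice to treat the stars as cumulative (each star requires all the earlier ones) are assumptions made for the example, not part of the GODAN document.

```python
from dataclasses import dataclass

# Order follows the list quoted in the text. Treating the criteria as
# cumulative is an assumption made for this sketch.
CRITERIA = ("available", "structured", "non_proprietary", "referenceable", "linked")


@dataclass
class DatasetProfile:
    """Hypothetical self-assessment of one dataset against the checklist."""
    available: bool = False
    structured: bool = False
    non_proprietary: bool = False
    referenceable: bool = False
    linked: bool = False


def star_rating(profile: DatasetProfile) -> int:
    """Count criteria met in order, stopping at the first one that fails."""
    stars = 0
    for name in CRITERIA:
        if not getattr(profile, name):
            break
        stars += 1
    return stars


# A dataset that is published and structured but locked in a proprietary
# format stops at two stars, even if it is also linked to other data.
farm_survey = DatasetProfile(available=True, structured=True, linked=True)
print(star_rating(farm_survey))
```

A rating of this kind says nothing about vocabularies; whether two datasets use the same names for the same things would need a separate check against the agreed taxonomy.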

Incentivization

In order to get trust-worthy data, there has to be a direct reward to the data supplier

For large stakeholders, governments and corporations, that reward may come from the need to provide proof in meeting sustainable development goals and climate commitments, but with smaller stakeholders the same incentives may not apply. The question needs to be asked: ‘what’s the data worth?’ or more importantly ‘what is the cost of not having the data?’ Can we achieve global sustainability goals and climate objectives without the majority of stakeholders taking part? If we can’t, is it worth weighting benefits in the short term to favor the smaller stakeholders to encourage them? Could that weighting even take the form of payment for engaging and, if so, can technologies such as blockchain be used to verify data and facilitate those payments?

Collaborative Frameworks

The authors draw attention to the fact that sharing data is only the start: “It is one thing to share data, but to achieve the desired gains from a data ecosystem for agriculture, to draw conclusions across the globe to guide decision making, it is necessary to exploit synergy between datasets efficiently.”

Such synergies, however, arise out of a framework that extends beyond purely agricultural data to one that includes all environmental data. It is a framework that similarly needs to be able to seamlessly integrate with the more mundane economic, sociopolitical and legal data and frameworks, an integration that will itself give rise to greater synergies between our economic activities and their environmental consequences. Di-Functional Modeling (DFM) is one such framework tool.

Di-Functional Modeling (DFM)

Designed around the concept of soil fertility, DFM was created to model the processes and resources that contribute to the sustainable management of an environmental project.

In the normal course this would be an agricultural unit, a group of units or a component of a unit such as a field, forest or grassland.

However, DFM is not restricted to modeling soil fertility and can be used to model other mechanisms in the agricultural and wider environment. [Agriculture in a Zero Emissions Society]

DFM is not a database, blockchain or application but a framework or ‘ecosystem’ within which the inter-dependencies of the whole system can be more easily visualized and modeled. DFM can thus assist in the development of databases, blockchains and applications that are inter-operable and can exchange and verify environmental and agricultural data [Data Databases and Distributed Networks].

DFM models the processes and functions of an agricultural system relative to the whole, a whole that further extends to the interactions and exchanges that occur between natural systems and the socioeconomic systems they support. These sociopolitical, economic and legal systems are themselves nested within the model.

These inner systems are connected to the environment by existing supply chain mechanisms, data from which can reveal the true sustainability or carbon footprint of agricultural goods [TRASE]. Further enhancement of these mechanisms with relevant data should make it possible to trace the ingredients of a chocolate bar from field to retail outlet, through every step and any stage within it, to give a grand total of the true cost of the indulgence in terms of carbon, habitat or social impact. Once calculated, the totals could be added to an individual’s personal tally of GHG emissions, habitat loss and social deprivation. [strengthening the food chain with the block chain]
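The field-to-retail tally described here amounts to summing the impact recorded at each step of the chain. A minimal sketch follows, in which the `ChainStep` record, the step names and every figure are hypothetical placeholders rather than real supply chain data. Social impact is left out because the text gives no unit for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainStep:
    """One step in a supply chain; the fields and units are illustrative assumptions."""
    name: str
    carbon_kg: float    # greenhouse gas emissions attributed to this step
    habitat_m2: float   # habitat area affected by this step


def total_footprint(steps):
    """Sum each impact category over every recorded step."""
    return {
        "carbon_kg": sum(step.carbon_kg for step in steps),
        "habitat_m2": sum(step.habitat_m2 for step in steps),
    }


# Placeholder figures only: they show the shape of the calculation,
# not the real footprint of any product.
chocolate_bar = [
    ChainStep("cocoa farm", carbon_kg=0.5, habitat_m2=2.0),
    ChainStep("processing", carbon_kg=0.25, habitat_m2=0.0),
    ChainStep("shipping and retail", carbon_kg=1.25, habitat_m2=0.5),
]
print(total_footprint(chocolate_bar))
```

The weak point is the one raised earlier in the post: a total is only as trustworthy as the least reliable step feeding into it.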

Sustainable Local Land Use

DFM was conceived for and is best used to help determine localized land use, crop choices and management strategies based on the available resources and the soil, habitat and hydrological properties. It is not a top-down tool but a tool to be applied at the farm end: a means both to audit the farm and its resources and to structure that audit in a way that facilitates subsequent integration of wider scientific data.

Building an Integrated Network

By repeating the process on successive farms and linking those farms through a content management system, each audit would contribute to a greater, more robust database; one that would permit each unit to enhance its own data with that from neighboring farms. Extended over a region the framework would help to manage and allocate resources, plan crop choices and integrate food production with the natural environment: A Shared Global Data Ecosystem that mirrors the Shared Global Ecosystem we call home.

Towards The Data Ready Farm

The Sustainable Farm

The sustainable farm, and by extension a sustainable agricultural sector and planet, is one underpinned by knowledge and driven by data: knowledge and data that can contribute to crop and livestock choices, resource management and ultimately reveal the sustainability, or not, of an enterprise.

The data ready farm is thus aware of its own resources, the resources of the surrounding environment and the relationship it has with those resources and the markets it supplies.

Local Knowledge: A Land Use Inventory

Whilst technology has a significant role to play, the data ready farm begins with knowledge of itself: the land use (woodland, cultivated, grassland), the inherent properties (soil and water resources), as well as the livestock and crops that depend on those resources. It is a simple inventory that requires no equipment to perform.

  • Land Use (Woodland, Cultivated, Grassland)
  • Inherent Properties (Soil Texture, Water Resources)
  • Land Dependents (Livestock, Crop Choices, Human dependents)

The inventory should distinguish land use according to basic habitat criteria: woodland, grassland and cultivated. The cultivated habitats further differentiate into arable (short rotation), permanent (orchards, vineyards, etc.) or heterogeneous (covered crops, flowers, etc.). The woodlands and grasslands similarly differentiate, but at this point only grasslands connected with farming need to be differentiated. The boundaries between and within the habitats, along with any hedgerows, fences or banks on those boundaries, and the position of any wells, standing or running water within them should also be recorded and mapped. Even if the farm appears homogeneous, with only one land use, crop or livestock, it is still likely made up of several parcels of land with varying properties; properties that are not easily visible in themselves but can be revealed by the recording and analysis of simple data, such as soil texture.

Soil Texture

Soil texture, a property that arises out of the relative proportions of sand, silt and clay, strongly influences the hydrological and nutrient characteristics of the soil. Variations in soil texture across a field or farm can reveal changes in the hydrology or nutrient status of the soil and provide the foundation upon which to determine crop choices, water requirements and cultivation methods.

hand textural chart by S Nortcliff and JS Lang from Rowell (1994)

Soil texture can be measured by taking a small sample of soil from just below the surface (10 cm). Moistened with water or spit, the sample is then molded with the hands into a ball. The ball is then deformed and its malleability noted and checked against a chart. The sample is usually taken along a ‘W’ transect positioned across the face of a field and the data bulked to provide a single textural class for that field/plot.
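Once the hand-texture classes along the transect have been written down, reducing them to a single class per field can be sketched as below. This assumes that "bulking" the data means taking the class recorded most often along the ‘W’; that reading, the `bulk_texture` function and the sample class names are assumptions made for illustration, and a real survey might instead mix the soil physically and texture the combined sample.

```python
from collections import Counter


def bulk_texture(observations):
    """Return the textural class recorded most often along a transect.

    Ties are resolved in favor of the class encountered first, which is
    how Counter.most_common orders elements with equal counts.
    """
    if not observations:
        raise ValueError("at least one texture observation is required")
    return Counter(observations).most_common(1)[0][0]


# Five sampling points along a 'W' transect (hypothetical field notes).
transect = ["sandy loam", "clay loam", "sandy loam", "silt loam", "sandy loam"]
print(bulk_texture(transect))
```

Keeping the individual observations alongside the bulked class is worthwhile, since the variation within a field is itself useful information.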

All that now remains is to quantify the livestock and crop choices that depend on the land; at this point it is just a list of the type, number and location of stock and crops.

This basic reconnaissance inventory, which needs no equipment to create, can be mapped to identify the land use, crop choices, soil texture, location of water and number of livestock not only of one farm but of an entire region.
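To show how such an inventory might be recorded and then rolled up from farm to region, here is a minimal sketch. The `Parcel` record, its field names and the `regional_summary` function are hypothetical; only the three land use categories and the kinds of item recorded (soil texture, water, crops, livestock) come from the text.

```python
from collections import Counter
from dataclasses import dataclass, field

# The three basic habitat criteria named in the inventory.
LAND_USES = {"woodland", "grassland", "cultivated"}


@dataclass
class Parcel:
    """One parcel of land in a farm inventory (illustrative structure)."""
    parcel_id: str
    land_use: str
    soil_texture: str = "unknown"
    has_water: bool = False
    crops: list = field(default_factory=list)
    livestock: dict = field(default_factory=dict)  # animal type -> head count

    def __post_init__(self):
        if self.land_use not in LAND_USES:
            raise ValueError(f"unknown land use: {self.land_use!r}")


def regional_summary(farms):
    """Aggregate parcels from several farms into regional totals.

    `farms` maps a farm name to its list of Parcel records.
    """
    land_use = Counter()
    livestock = Counter()
    for parcels in farms.values():
        for parcel in parcels:
            land_use[parcel.land_use] += 1
            livestock.update(parcel.livestock)
    return {"parcels_by_land_use": dict(land_use), "livestock": dict(livestock)}


# Two hypothetical farms, recorded with nothing more than a notebook.
farms = {
    "upper farm": [
        Parcel("U1", "cultivated", "sandy loam", crops=["maize"]),
        Parcel("U2", "grassland", "clay loam", has_water=True, livestock={"goats": 12}),
    ],
    "lower farm": [
        Parcel("L1", "woodland"),
        Parcel("L2", "grassland", "silt loam", livestock={"goats": 5, "cattle": 3}),
    ],
}
print(regional_summary(farms))
```

Because every farm uses the same categories, the regional picture is a straightforward count; the same would not be true if each farm chose its own names for land use or livestock.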

A Local Inventory in a Global Context

With remote sensing and mobile technology the inventory and soil data could be annotated directly onto a map from the field. Coupled with geostatistical strategies this could be further developed to create complex contour maps of textural variation across agricultural landscapes. With additional external scientific, environmental and economic data this local inventory could be qualified relative to a global economy.

Scientific Data

Scientific data relevant to the sustainable management of resources and the husbandry of crops and livestock can be appended to this inventory.

  • Meteorology (precipitation figures)
  • Crop Data (nutrition, culture, pest and disease)
  • Livestock Data (nutrition, stocking numbers, general husbandry)
  • Soil Mineral data (345 nutrient model)

Environmental Data

Integration of environmental data can help the farm be sympathetic to the needs of the natural environment and the species that inhabit it. Aware of the environments and species around it the data ready farm can identify synergies and conflicts and then use that data to find resolutions to conservation, pollution and emissions issues.

  • Conserve habitats and species
  • Prevent pollution from soil erosion and nutrient leaching
  • Reduce emissions from livestock and management practices

Market Data

To meet global sustainability goals the data ready farm must link and integrate with the ‘wider’ economic, sociopolitical and legal frameworks. Data from supply chain mechanisms, political policies, and legal and administrative bodies must integrate seamlessly with data from the agricultural and natural environments to meet Sustainable Development Goals (SDGs) and climate action/change objectives.

  • Supply Chain (TRASE, blockchain)
  • Legal Framework (COP22 Objective)
  • Political policy (Paris Agreement)

A Local Data Hub

A farm that is aware of itself, the environment and the markets it supplies has the means to measure its sustainability relative to the environment and the markets. However, a farm integrated with neighboring farms can improve its sustainability. A locally connected farm has greater resilience and can better manage and share resources, integrate crop and livestock choices, and supply markets more efficiently. A local data hub can connect remote farmers, help to build trust and educate them in using and sharing data.

Applications and Databases

To move beyond a simple inventory and into a sustainable data-driven future requires the development of applications and databases that complement the framework. Some, such as TRASE, already exist, but local databases and applications to share data within a comprehensive and structured framework still need development. [Data Databases and Distributed Networks]
