VIEWPOINT article

Front Sci, 21 May 2026

Volume 4 - 2026 | https://doi.org/10.3389/fsci.2026.1860463

Artificial intelligence research agents in soil science: the continuing importance of domain expertise

  • Department of Physical Geography, Utrecht University, Utrecht, Netherlands

Key points

  • Artificial intelligence (AI) research agents may shift some key analytical decisions in soil science from explicit human judgment to algorithmic inference, changing how research workflows are organized.

  • Because soil systems are complex, sparsely observed, and unevenly represented in data and literature, AI-driven research risks reinforcing biases toward easily measurable variables and well-observed processes.

  • The effective use of AI in soil science will depend on human expertise to guide, interpret, and critically evaluate AI-assisted research workflows.

Introduction

The integration of multi-agent artificial intelligence (AI) systems into soil science research, as proposed by Minasny et al. (1) in their Frontiers in Science lead article, represents a significant advance in AI-driven scientific discovery. Such systems go well beyond the task-specific machine learning (ML) tools already established in areas such as digital soil mapping (2), soil spectroscopy (3), and pedotransfer functions (4). By combining specialized AI agents that can synthesize literature and data, generate hypotheses, and propose experimental designs, they could accelerate several stages of the research process ().

However, applying AI research agents to soil science raises important questions due to the nature of soil. Soils are inherently complex, spatially heterogeneous (), and often only sparsely observed, and many key soil processes unfold over timescales that exceed the duration of most available datasets. These characteristics limit how fully soil systems can be captured in data and, therefore, how effectively AI agents can generate meaningful insights from them. In this viewpoint, I argue that, while multi-agent AI systems may substantially improve research efficiency, their successful use in soil science will depend on domain expertise to guide and critically evaluate AI-based workflows.

AI agents in soil science workflows

Current AI applications in soil science are largely based on ML models trained to perform clearly defined analytical tasks. For example, ML models can predict topsoil clay content from environmental geodata () or estimate the laboratory-measured organic carbon content of a soil sample from its spectral reflectance (). In such applications, researchers remain central to the workflow: scientists decide which datasets to use, how to prepare them, how to handle missing data, which modeling approach to select, and how to evaluate model performance. These decisions are based on the study objective, available data, and established good practice in the field. The ML model then performs the specific task for which it was designed, at a pre-defined step within that workflow.

In contrast, AI research agents go beyond this scientist-centered workflow by shifting some decisions from explicit human judgment to algorithmic inference. Rather than operating within a predefined workflow, an AI agent can iteratively analyze a dataset and propose analytical strategies including preprocessing steps, variable selection and transformation, and suitable modeling approaches. In effect, steps that have traditionally formed part of the analytical reasoning of researchers—including generating and evaluating hypotheses ()—are no longer solely human decisions but are instead increasingly inferred in collaboration with AI systems. This shift could greatly increase efficiency and make knowledge and tools more accessible, particularly in interdisciplinary fields such as soil science where research often requires multi-domain expertise.

Limits and risks of AI agents in soil science

While global soil datasets may seem extensive (), they remain sparse relative to the inherent complexity of the soil systems they describe. Soils are formed through multi-layered interactions between the climate, organisms, topography, and parent material over very long time periods, and many relevant processes operate simultaneously across wide spatial and temporal scales (). Yet observations of these processes are limited and indirect. Only an extremely small fraction of the world’s soil volume has ever been sampled, and many measurements capture static properties or simple proxies rather than the processes themselves. Greater spatial or temporal resolution does not necessarily solve this problem. For example, remote sensing imagery measures spectral reflectance only from the top millimeter of bare soil, and this signal must be locally calibrated before it can be linked to soil properties. Likewise, soil moisture sensors measure volumetric water content at a point but do not directly capture processes such as preferential flow or macropore connectivity. Even when collected at high temporal resolution, such measurements provide only a partial representation of unsaturated flow behavior. In this sense, digital soil twins derived from sensor and remote-sensing data offer only a keyhole view of soil systems. As in Plato’s allegory of the cave, interpretations are based on observable projections of the system, while important underlying processes remain only partly visible.

In addition, soil datasets are often highly heterogeneous in how measurements are collected and reported. Harmonization is therefore essential, but it is often difficult to achieve because differences in measurement protocols are frequently confounded with differences in land use or pedoclimatic region. In such cases, observational variation cannot easily be disentangled from the underlying soil processes themselves (). As a result, AI-based knowledge discovery may mistake measurement artifacts for meaningful patterns instead of recognizing them as methodological inconsistencies.
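The confounding described above can be illustrated with a small synthetic sketch in Python. Nothing here comes from a real dataset: the two regions, the two protocol names, the carbon values, and the size of the protocol offset are all hypothetical. Both regions are given identical true organic carbon, but each is measured with a different protocol, one of which is assumed to read systematically higher.

```python
from statistics import mean

# Synthetic, illustrative values only (percent organic carbon).
TRUE_CARBON = [1.8, 2.0, 2.2, 2.0]  # identical true values in both regions

# Hypothetical protocols; the 0.5 offset is an assumed systematic bias.
PROTOCOL_OFFSET = {"protocol_x": 0.0, "protocol_y": 0.5}

def measure(true_values, protocol):
    """Return measured values: true value plus the protocol's assumed offset."""
    return [v + PROTOCOL_OFFSET[protocol] for v in true_values]

# Each region was surveyed with a single protocol, so protocol and region
# are perfectly confounded. Records are (region, protocol, measured_value).
records = (
    [("region_a", "protocol_x", m) for m in measure(TRUE_CARBON, "protocol_x")]
    + [("region_b", "protocol_y", m) for m in measure(TRUE_CARBON, "protocol_y")]
)

def group_mean(rows, index, key):
    """Mean measured value over rows whose field at `index` equals `key`."""
    return mean(r[2] for r in rows if r[index] == key)

# A naive comparison reports a difference between regions...
regional_gap = group_mean(records, 0, "region_b") - group_mean(records, 0, "region_a")
# ...which is exactly the difference between protocols.
protocol_gap = group_mean(records, 1, "protocol_y") - group_mean(records, 1, "protocol_x")

# Number of distinct protocols used within each region. A count of 1 everywhere
# means the protocol effect cannot be estimated separately from the region effect.
protocols_per_region = {
    region: len({r[1] for r in records if r[0] == region})
    for region in sorted({r[0] for r in records})
}

print(f"apparent regional gap: {regional_gap:.2f}")
print(f"protocol gap:          {protocol_gap:.2f}")
print(f"protocols per region:  {protocols_per_region}")
```

In this constructed case the apparent regional difference is entirely a measurement artifact, and the data alone cannot reveal that, because no region contains samples measured with both protocols.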

Beyond data harmonization, accessibility remains a persistent challenge in soil science. The effectiveness of AI research agents depends on centralized datasets, many of which are held by public institutions where data sharing often relies on trust and a clear understanding of how data will be used. The growing use of complex and potentially opaque AI systems may introduce new concerns for data owners and, in some cases, reduce rather than enhance their willingness to share data.

Importantly, AI research agents do not rely solely on structured datasets; they also draw on the broader scientific literature, which contains a wealth of expert knowledge accumulated over decades of soil research. However, the literature is shaped by many of the same practical constraints that affect data collection. Research tends to focus on variables that are comparatively easy to measure and on environments where systematic data collection is feasible. Consequently, both the datasets and the literature from which AI systems learn are biased toward well-observable soil properties and processes. This observability bias is not new in soil science, but it may become more pronounced in data-driven AI research workflows. Moreover, as AI agents provide synthesized outputs, researchers may become further removed from raw data and original studies. This distancing, together with the tendency of AI research agents to produce outputs that appear coherent but may contain errors (), could substantially increase the risk of biased conclusions.

Examples of this bias can already be seen in digital soil mapping, where terrain is often identified as a dominant soil-forming factor because it is consistently available in spatial datasets (). By contrast, parent material composition or local weathering processes—which may be equally or even more important in many settings—tend to rank much lower because detailed data are often unavailable. Similarly, soil texture is commonly used to explain hydraulic properties because it is widely measured, whereas soil structure and pore connectivity, both of which strongly influence water flow, are much less frequently analyzed because they are harder to sample.

Human domain expertise should continue to guide research

The limitations outlined above do not diminish the potential value of AI research agents in soil science. Rather, they underscore the continuing importance of human expertise in guiding their use. Soil science has a long tradition of conceptual frameworks—for example, those describing pedogenesis, soil–landscape relationships, or biogeochemical cycles—that represent accumulated scientific understanding and can provide essential guidance for AI-driven exploration. In this sense, the most promising applications of AI agents in soil science may arise not from fully autonomous systems but from approaches that combine data-driven inference with human expertise.

One possible pathway is the development of expert-in-the-loop systems, in which AI agents generate hypotheses or analytical strategies that are subsequently evaluated and refined by human experts. In such a workflow, AI agents can rapidly explore large bodies of data and literature, while human expertise provides the interpretive context needed to distinguish plausible mechanisms from spurious correlations.
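The structure of such a workflow can be sketched in a few lines of Python. This is only a schematic under stated assumptions: `stub_agent` is a hypothetical stand-in that returns hard-coded proposals rather than calling any real AI system, the two hypotheses are invented for illustration, and `demo_review` replaces what would in practice be a human judgment with a fixed rule so that the example can run.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Hypothesis:
    statement: str
    evidence: str                 # what the agent bases the claim on
    status: str = "proposed"      # proposed | accepted | rejected
    expert_note: str = ""

def stub_agent() -> List[Hypothesis]:
    """Hypothetical stand-in for an AI agent; proposals are hard-coded."""
    return [
        Hypothesis("Clay content increases downslope",
                   "correlation with terrain covariates"),
        Hypothesis("Organic carbon differs between regions",
                   "difference in regional means"),
    ]

# A review maps a hypothesis to (decision, note). In a real workflow this
# would be a human expert's decision, not a function.
Review = Callable[[Hypothesis], Tuple[str, str]]

def run_round(proposals: List[Hypothesis], review: Review) -> List[Hypothesis]:
    """Record the expert's decision on every proposal; return the accepted ones."""
    accepted = []
    for hypothesis in proposals:
        hypothesis.status, hypothesis.expert_note = review(hypothesis)
        if hypothesis.status == "accepted":
            accepted.append(hypothesis)
    return accepted

def demo_review(hypothesis: Hypothesis) -> Tuple[str, str]:
    """Illustrative rule standing in for expert judgment."""
    if "regional means" in hypothesis.evidence:
        return "rejected", "regions were measured with different protocols"
    return "accepted", "plausible soil-landscape relationship; verify in the field"

proposals = stub_agent()
accepted = run_round(proposals, demo_review)
for h in proposals:
    print(f"{h.status:>8}: {h.statement} ({h.expert_note})")
```

The design choice worth noting is that every proposal keeps the expert's note alongside its status, so rejected ideas and the reasons for rejecting them remain available to guide the next round of exploration.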

However, integrating soil expertise into computational systems presents its own challenges. Much of the knowledge used by experienced pedologists is tacit and qualitative, developed through field observation and comparative reasoning rather than through explicitly quantified relationships. Concepts such as soil structure development, horizon differentiation, or soil formation along landscape gradients often involve nuanced judgment that is difficult to translate directly into numerical constraints for ML models. As a result, efforts to formalize such knowledge for computational use may prove difficult and may even meet resistance from domain experts.

These challenges suggest that the role of soil scientists in AI-assisted research may increasingly shift toward critically assessing AI-generated outputs and translating domain knowledge in ways that can meaningfully guide AI systems. In this scenario, scientists move away from manually designing every step of a project or analysis and toward supervising AI-generated research pathways and evaluating their plausibility. Rather than reducing the need for scientific judgment, AI research agents may increase the importance of human expertise, process understanding, and interdisciplinary dialogue to ensure that computational exploration remains grounded in pedological reasoning.

Both in soil science and across the scientific community generally, AI research agents may be most valuable when used as partners in scientific reasoning rather than as autonomous discovery systems.

Statements

Author contributions

MN: Conceptualization, Writing – original draft, Writing – review & editing.

Funding

The author declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author declared that this work was conducted in the absence of financial relationships that could be construed as a potential conflict of interest.

The author declared shared consortia, AI4SoilHealth and Intergenerational Open Geospatial Carbon Registry (OGCR), with the lead article author PS to the handling editor.

Generative AI statement

The author declared that generative AI was used in the creation of this manuscript. ChatGPT 5.3 and DeepL were employed to assist with the article’s structuring and improving the clarity, grammar, and overall quality of the English language. The author verified and takes full responsibility for the use of generative AI in the preparation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1. Minasny B, McBratney A, Demattê JAM, Román Dobarco M, Smith P. Enhancing soil science research with multi-agent artificial intelligence systems. Front Sci (2026) 4:1721295. doi: 10.3389/fsci.2026.1721295

  • 2. Lamichhane S, Kumar L, Wilson B. Digital soil mapping algorithms and covariates for soil organic carbon mapping and their implications: a review. Geoderma (2019) 352:395–413. doi: 10.1016/j.geoderma.2019.05.031

  • 3. Viscarra Rossel RA, Behrens T, Ben-Dor E, Brown DJ, Demattê JAM, Shepherd KD, et al. A global spectral library to characterize the world’s soil. Earth-Sci Rev (2016) 155:198–230. doi: 10.1016/j.earscirev.2016.01.012

  • 4. Weber TKD, Weihermüller L, Nemes A, Bechtold M, Degré A, Diamantopoulos E, et al. Hydro-pedotransfer functions: a roadmap for future development. Hydrol Earth Syst Sci (2024) 28(14):3391–433. doi: 10.5194/hess-28-3391-2024

  • 5. Krenn M, Pollice R, Guo SY, Aldeghi M, Cervera-Lierta A, Friederich P, et al. On scientific understanding with artificial intelligence. Nat Rev Phys (2022) 4(12):761–9. doi: 10.1038/s42254-022-00518-3

  • 6. Jenny H. Factors of soil formation: a system of quantitative pedology. New York, NY: McGraw-Hill (1941).

  • 7. Chen S, Arrouays D, Leatitia Mulder V, Poggio L, Minasny B, Roudier P, et al. Digital mapping of GlobalSoilMap soil properties at a broad scale: a review. Geoderma (2022) 409:115567. doi: 10.1016/j.geoderma.2021.115567

  • 8. Batjes NH, Calisto L, de Sousa LM. Providing quality-assessed and standardised soil data to support global mapping and modelling (WoSIS snapshot 2023). Earth Syst Sci Data (2024) 16:4735–65. doi: 10.5194/essd-16-4735-2024

  • 9. Messeri L, Crockett MJ. Artificial intelligence and illusions of understanding in scientific research. Nature (2024) 627:49–58. doi: 10.1038/s41586-024-07146-0

Keywords

AI research agents, expert-in-the-loop, knowledge discovery, soil health, soil science

Citation

Nussbaum M (2026) Artificial intelligence research agents in soil science: the continuing importance of domain expertise. Front Sci 4:1860463. doi: 10.3389/fsci.2026.1860463

Received

20 April 2026

Accepted

06 May 2026

Published

21 May 2026

Edited and reviewed by

Luca Brocca, National Research Council (CNR), Italy

*Correspondence: Madlene Nussbaum,
