The use of spatial information and artificial intelligence to analyze a regional food system using French press articles about farmers' markets

Updated at: 19/02/2026

This collection provides an original corpus of texts (in French) that can be used to train and/or evaluate AI models dedicated to recognizing named entities and tracking the dynamics of territorial food systems.

The GeoTextAI4SAT collection comprises three complementary datasets:

  • The first dataset (https://doi.org/10.57745/ISUT2Q) contains 11,538 annotated news articles published between 1994 and 2024.
  • The second dataset (https://doi.org/10.57745/WX6PEJ) contains both automatic and manual annotations on a subset of 6,508 articles. The automatic annotations were produced with two generic, domain-agnostic models: GLiNER (Zaratiana et al., 2024) and NuNER (Bogdanov et al., 2024).
  • The third dataset (https://doi.org/10.57745/B3THLZ) contains manual annotations made by a group of students with domain expertise on a subset of 92 articles published between 1997 and 2011.
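
Because the collection pairs automatic annotations with manual ones, a typical evaluation compares the two at the entity level. The following is a minimal sketch of such a comparison, not the project's own evaluation procedure. It assumes each annotation is a (start offset, end offset, label) tuple; this span format, the label names, the example sentence, and the function name `entity_prf` are all hypothetical, since the file layout of the datasets is not described here.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span format: (start_offset, end_offset, label).
# The actual file layout of the datasets may differ.
Span = Tuple[int, int, str]


def entity_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return exact-match, entity-level precision, recall, and F1."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)
    # A prediction counts only if offsets and label both match a manual span.
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example for the invented sentence
# "Le marché de producteurs de Montpellier ouvre samedi."
manual = [(3, 24, "MARKET"), (28, 39, "LOCATION")]      # reference annotations
automatic = [(28, 39, "LOCATION"), (46, 52, "DATE")]    # model output

precision, recall, f1 = entity_prf(manual, automatic)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice: a span that overlaps a manual annotation but differs by one character counts as an error. A partial-overlap criterion may be more appropriate when annotators disagree on entity boundaries.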

This collection was produced as part of the GeoTextAI4SAT project, which combined language models and knowledge graphs. The aim was to analyze the dynamics of actors in territorial food systems (SAT), with a particular focus on farmers' markets and their environment. The work was conducted in collaboration with the joint research units (UMR) Innovation, TETIS, and IRIT, and was supported by the O3T Défi Clef project, the Plat4terfood project (ANR-23-PESA-0005), and the European AI4AGRI project.

These datasets contain digital reproductions of copyrighted works. They were collected and analyzed under the text and data mining exception for scientific research purposes (article L122-5-3 of the French Intellectual Property Code).

THIAM, Pape Ibrahima; AKERMANN, Grégori; CHASSERAY, Yohann; MOTHE, Josiane; PRADERE, Manon; ROCHE, Mathieu; TEISSEIRE, Maguelonne, 2026, "Données annotées automatiquement et consolidées des acteurs des circuits courts des systèmes alimentaires territoriaux", https://doi.org/10.57745/WX6PEJ, Recherche Data Gouv, V2