FoodNexus: Massive Food Knowledge for Recommender Systems

Ludovico Boratto; Gianni Fenu; Mirko Marras; Giacomo Medda; Giovanni Zedda

48th European Conference on Information Retrieval (ECIR) 2026

Abstract

Personalized food recommendation can promote healthier, sustainable eating, but current systems often rely on sparse and unstructured data, limiting semantic expressiveness and diverse personalization. In this paper, we propose FoodNexus, a large-scale knowledge graph with nearly one billion triples designed to enrich food recommendation with structured, nutrition-aware, and user-contextual information. We built it via a multi-stage pipeline that combines and augments the largest public dataset of user–recipe interactions, HUMMUS, with extensive metadata from Open Food Facts by linking recipes to concrete food products, extracting user traits from their biographies and reviews, and mapping both data sources onto the same ontology. Experiments show that FoodNexus enables richer, nutrition-sensitive evaluation of recommendations.

Motivation

Personalized food recommendation systems can play a crucial role in promoting healthier and more sustainable eating habits. However, current systems often rely on sparse and unstructured data, which limits their semantic expressiveness and ability to support diverse personalization needs.

Existing datasets for food recommendation face two key limitations:

Lack of user-level constructs: Most datasets do not capture user preferences, dietary restrictions, or consumption traits
Missing product granularity: Recipes cannot be aligned with real food products for nutritional and sustainability analysis

The Problem: No existing resource comprehensively supports nutrition, sustainability, and user constraints in food recommendation. Current systems cannot evaluate whether recommendations align with users' health goals or environmental concerns.

FoodNexus Knowledge Graph

FoodNexus is a large-scale knowledge graph designed to enrich food recommendation with structured, nutrition-aware, and user-contextual information. It combines and augments the largest public dataset of user-recipe interactions (HUMMUS) with extensive metadata from Open Food Facts.

Figure 1: Visual overview of the FoodNexus ontology showing entities and their relationships.

Scale and Coverage

~979.5 million triples capturing food knowledge
~51 million entities across 11 entity types
~130.2 million attributes across 15 attribute types
11 relation types connecting entities
73% of entities and 70% of relations aligned with schema.org vocabulary for interoperability

Table 1: Comparison of feature-rich data sources for food recommendation.

Construction Pipeline

FoodNexus is built via a 5-stage pipeline that combines multiple data sources and enriches them with semantic knowledge:

Stage 1: Source Dataset Selection

We combine HUMMUS, the largest public dataset of user-recipe interactions, with Open Food Facts, a comprehensive database of food products with nutritional and sustainability information.

Stage 2: Combined Ontology Specification

We design a unified ontology that captures recipes, ingredients, products, users, and their relationships, aligned with schema.org vocabulary for interoperability.

Stage 3: Recipe-Product Entity Linking

Using the BAAI/bge-en-icl encoder with a 0.85 similarity threshold, we link recipe ingredients to concrete food products from Open Food Facts.
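
Threshold-based linking of this kind can be illustrated with a minimal sketch. It assumes that embeddings are already computed and that cosine similarity is the similarity measure; the toy 3-dimensional vectors below stand in for real BAAI/bge-en-icl output, and the helper `link_ingredients` and the product names are hypothetical. Only the 0.85 threshold comes from the text, and keeping just the single best match per ingredient is a simplification.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def link_ingredients(ingredient_vecs, product_vecs, threshold=0.85):
    """Map each ingredient to its most similar product if similarity >= threshold."""
    links = {}
    for ingredient, ivec in ingredient_vecs.items():
        best_name, best_sim = None, -1.0
        for product, pvec in product_vecs.items():
            sim = cosine(ivec, pvec)
            if sim > best_sim:
                best_name, best_sim = product, sim
        # Ingredients whose best match falls below the threshold stay unlinked.
        if best_name is not None and best_sim >= threshold:
            links[ingredient] = (best_name, best_sim)
    return links

# Toy vectors standing in for encoder output (illustrative values only).
ingredients = {"whole milk": [0.9, 0.1, 0.0], "saffron": [0.0, 0.2, 0.9]}
products = {
    "Organic Whole Milk 1L": [0.88, 0.15, 0.02],
    "Rolled Oats": [0.1, 0.95, 0.05],
}

links = link_ingredients(ingredients, products)
for ingredient, (product, sim) in links.items():
    print(f"{ingredient} -> {product} (similarity {sim:.3f})")
```

In this toy data only "whole milk" finds a product above the threshold; "saffron" has no sufficiently close product and is left without a link. At the scale of Open Food Facts, an approximate nearest-neighbour index would likely replace the nested loop.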

Stage 4: User Consumption Traits Extraction

We employ the Qwen2.5 LLM to extract user traits (dietary preferences, restrictions, and health goals) from users' biographies and reviews.
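
One way to picture this stage is as prompt construction followed by defensive parsing of the model's reply. The sketch below makes no model call; the prompt wording, the JSON keys in `TRAIT_KEYS`, and the functions `build_prompt` and `parse_traits` are all hypothetical and are not taken from the FoodNexus code.

```python
import json

# Hypothetical output keys mirroring the trait kinds named in the text.
TRAIT_KEYS = ("dietary_preferences", "restrictions", "health_goals")

def build_prompt(biography, reviews):
    """Compose an extraction prompt; the wording is illustrative only."""
    review_block = "\n".join(f"- {review}" for review in reviews)
    return (
        "Extract the user's food consumption traits as JSON with the keys "
        f"{', '.join(TRAIT_KEYS)} (each a list of short strings).\n"
        f"Biography: {biography}\n"
        f"Reviews:\n{review_block}\n"
        "JSON:"
    )

def parse_traits(raw_response):
    """Parse a model reply, keeping only expected keys and non-empty strings."""
    empty = {key: [] for key in TRAIT_KEYS}
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return empty  # malformed replies yield no traits rather than an error
    if not isinstance(data, dict):
        return empty
    traits = {}
    for key in TRAIT_KEYS:
        values = data.get(key, [])
        if not isinstance(values, list):
            values = []
        traits[key] = [v.strip().lower() for v in values
                       if isinstance(v, str) and v.strip()]
    return traits

prompt = build_prompt("Vegetarian runner who loves spicy food.",
                      ["Great without the bacon.", "Too salty for me."])

# A canned reply standing in for real model output.
canned_reply = ('{"dietary_preferences": ["Vegetarian"], "restrictions": [], '
                '"health_goals": ["low sodium"], "mood": "happy"}')
traits = parse_traits(canned_reply)
print(traits)
```

Filtering the reply to a fixed set of keys keeps unexpected fields (such as `mood` above) out of the graph, which matters when extracted traits later become `hasConstraint` edges.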

Stage 5: Knowledge Graph Assembly

All extracted entities, relations, and attributes are assembled into a unified knowledge graph following the specified ontology.

Table 2: Relations with element names in FoodNexus.

Source	Relation	Target	Description
UserGroup	publishesRecipe	Recipe	A user has published a recipe
UserGroup	publishesReview	UserReview	A user has published a review
UserGroup	hasConstraint	Tag	A user has a dietary constraint
UserReview	itemReviewed	Recipe	A review is associated with a recipe
Recipe	hasIndicator	Indicator	A recipe has a certain indicator
Product	hasIndicator	Indicator	A product has a certain indicator
Recipe	hasPart	Ingredient	A recipe includes a specific ingredient
Ingredient	isRelatedTo	Recipe	An ingredient is part of a recipe
Tag	suitableForDiet	Recipe	A constraint is compatible with a recipe
Tag	suitableForDiet	Product	A constraint is compatible with a product
Product	sameAs	Recipe	A product is similar/identical to a recipe
FoodProducer	produces	Product	A producer produces a product
Recipe	sameAs	Recipe	A recipe is an alternative or similar recipe
Store	offers	Product	A store sells a certain product
Store	isPlaceIn	City	A store is located in a certain city
Product	countryOfAssembly	Country	A product is assembled in a country
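
The relation signatures in Table 2 can be treated as a schema against which candidate triples are checked during assembly. The following is a minimal sketch under stated assumptions: the `Type:local_id` identifier convention, the `Triple` tuple, and the `is_valid` helper are hypothetical and only illustrate the idea; the signatures themselves are copied from Table 2.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["source", "relation", "target"])

# (source type, relation, target type) signatures copied from Table 2.
SCHEMA = {
    ("UserGroup", "publishesRecipe", "Recipe"),
    ("UserGroup", "publishesReview", "UserReview"),
    ("UserGroup", "hasConstraint", "Tag"),
    ("UserReview", "itemReviewed", "Recipe"),
    ("Recipe", "hasIndicator", "Indicator"),
    ("Product", "hasIndicator", "Indicator"),
    ("Recipe", "hasPart", "Ingredient"),
    ("Ingredient", "isRelatedTo", "Recipe"),
    ("Tag", "suitableForDiet", "Recipe"),
    ("Tag", "suitableForDiet", "Product"),
    ("Product", "sameAs", "Recipe"),
    ("FoodProducer", "produces", "Product"),
    ("Recipe", "sameAs", "Recipe"),
    ("Store", "offers", "Product"),
    ("Store", "isPlaceIn", "City"),
    ("Product", "countryOfAssembly", "Country"),
}

def entity_type(entity_id):
    """Read the type from a 'Type:local_id' identifier (an assumed convention)."""
    return entity_id.split(":", 1)[0]

def is_valid(triple):
    """Check a triple's type signature against the Table 2 schema."""
    signature = (entity_type(triple.source), triple.relation,
                 entity_type(triple.target))
    return signature in SCHEMA

triples = [
    Triple("UserGroup:u1", "publishesReview", "UserReview:r9"),
    Triple("UserReview:r9", "itemReviewed", "Recipe:42"),
    Triple("Recipe:42", "hasPart", "Ingredient:flour"),
    Triple("Store:s3", "offers", "Recipe:42"),  # rejected: stores offer products
]

valid = [t for t in triples if is_valid(t)]
print(f"{len(valid)} of {len(triples)} triples match the schema")
```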

Statistics and Comparison

Table 3: Comparative statistics between original data sources and FoodNexus.

Data Source	# Triples	# Entities	# Attributes	# E. Types	# R. Types	# A. Types
HUMMUS	~53.9M	~12.3M	~22.8M	6	6	9
HUMMUS (inferred)	~57.9M	~12.3M	~22.8M	6	7	14
OFF	~267.9M	~38.5M	~107.4M	7	6	7
FoodNexus (Ours)	~979.5M	~51.0M	~130.2M	11	11	15

E. Types: Entity Types; R. Types: Relation Types; A. Types: Attribute Types

Experimental Results

We evaluate FoodNexus through two research questions:

RQ1: Recommendation Utility

We compare several recommender systems on FoodNexus to assess recommendation quality:

Table 4: Performance comparison across recommendation models with Hit, Recall, and NDCG on top-10/20/50 lists.

Model	Hit@10	Hit@20	Hit@50	Recall@10	Recall@20	Recall@50	NDCG@10	NDCG@20	NDCG@50
Pop	0.0035	0.0050	0.0077	0.0008	0.0010	0.0013	0.0008	0.0009	0.0009
BPR	0.0474	0.0777	0.1361	0.0136	0.0244	0.0467	0.0102	0.0133	0.0189
NeuMF	0.0562	0.0886	0.1491	0.0175	0.0276	0.0515	0.0131	0.0160	0.0220
LightGCN	0.0604	0.0930	0.1558	0.0184	0.0307	0.0557	0.0140	0.0173	0.0235
MKR	0.0413	0.0656	0.1126	0.0166	0.0272	0.0477	0.0111	0.0140	0.0188
KTUP	0.0446	0.0737	0.1260	0.0189	0.0315	0.0552	0.0118	0.0154	0.0210
KGAT	0.0580	0.0903	0.1453	0.0231	0.0372	0.0628	0.0152	0.0191	0.0252
UserKGAT	0.0187	0.0368	0.0802	0.0084	0.0164	0.0370	0.0046	0.0069	0.0116
UserKTUP	0.0388	0.0690	0.1260	0.0154	0.0295	0.0570	0.0093	0.0133	0.0198
UserMKR	0.0461	0.0744	0.1264	0.0187	0.0319	0.0556	0.0120	0.0157	0.0212

LightGCN achieves the highest Hit scores
KGAT achieves the best Recall and NDCG
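UserMKR, the user-aware variant of MKR, improves on MKR across all metrics, whereas UserKGAT falls below KGAT on every metric and UserKTUP trails KTUP on most, suggesting that user-side knowledge helps only when the model is designed to exploit it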
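
For readers unfamiliar with the metrics in Table 4, the sketch below gives their common per-user definitions with binary relevance. It is an assumption that the paper's evaluation follows exactly these formulas; the function names and the toy ranking are illustrative. The values reported in the table would correspond to averages of such per-user scores.

```python
import math

def hit_at_k(ranked, relevant, k):
    """1.0 if at least one relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0

def recall_at_k(ranked, relevant, k):
    """Fraction of the user's relevant items retrieved in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg else 0.0

# Toy example: one user with two held-out recipes, one of them ranked third.
ranked = ["r3", "r7", "r1", "r9"]
relevant = {"r1", "r5"}

print(hit_at_k(ranked, relevant, 3))
print(recall_at_k(ranked, relevant, 3))
print(round(ndcg_at_k(ranked, relevant, 3), 4))
```

Hit only asks whether any relevant item was found, while Recall and NDCG also reward finding more items and ranking them higher. This is one reason a model can lead on Hit while another leads on Recall and NDCG.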

RQ2: Nutritional and Contextual Analysis

FoodNexus enables nutrition-sensitive evaluation, revealing systematic biases in current recommender systems:

Figure 2: Average per-user difference Δ between category proportions in models' top-k lists and global category prevalence for Allergens, Packaging, and Green Score.

Figure 3: Average per-user difference Δ for numerical nutritional attributes including Added Sugars, Saturated Fat, Potassium, Fiber, and Vitamins.

Gluten bias: Models tend to over-represent gluten-containing items
Packaging bias: Interaction-based systems over-represent plastic-packaged products
Sustainability skew: Models show a systematic bias toward less sustainable items
Nutritional signals: Models under-represent vitamins and potassium while showing mixed results for sugars and fats
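
The Δ statistic behind these observations can be sketched for the categorical case as follows. This is a minimal illustration, assuming that global prevalence is measured over the item catalog and that each item carries exactly one category label; the function names and the toy data are hypothetical. A positive Δ means the category is over-represented in recommendations.

```python
from collections import Counter

def proportions(items, category_of):
    """Share of each category among the given items."""
    counts = Counter(category_of[item] for item in items)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def average_delta(top_k_lists, category_of):
    """Average over users of (top-k category share - global category share)."""
    global_share = proportions(list(category_of), category_of)
    totals = {category: 0.0 for category in global_share}
    for items in top_k_lists.values():
        user_share = proportions(items, category_of)
        for category in global_share:
            totals[category] += user_share.get(category, 0.0) - global_share[category]
    n_users = len(top_k_lists)
    if n_users == 0:
        return {category: 0.0 for category in totals}
    return {category: total / n_users for category, total in totals.items()}

# Toy catalog: half of the items contain gluten.
category_of = {"a": "gluten", "b": "gluten", "c": "gluten-free", "d": "gluten-free"}
# Toy top-2 lists for two users.
top_k_lists = {"u1": ["a", "b"], "u2": ["a", "c"]}

delta = average_delta(top_k_lists, category_of)
print(delta)
```

Here user u1 receives only gluten-containing items (+0.5 against the catalog share) and u2 receives a balanced list (0.0), so the average Δ for gluten is +0.25, the kind of over-representation the gluten bias above refers to.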

Key Finding: FoodNexus enables richer, nutrition-sensitive evaluation of recommendations, revealing biases that would be invisible with traditional datasets.

Key Contributions

Large-Scale Knowledge Graph: Nearly one billion triples linking recipes, products, users, and nutritional information
User Trait Extraction: Novel pipeline using LLMs to extract dietary preferences and health goals from user content
Recipe-Product Linking: Semantic matching to connect recipes with real food products for nutritional analysis
Nutrition-Aware Evaluation: Framework for assessing recommendation fairness regarding health and sustainability
Schema.org Alignment: High interoperability with existing semantic web standards

Resources

The FoodNexus dataset and code are publicly available:

GitHub Repository: Code, documentation, and usage examples
Zenodo Dataset: Full dataset download (DOI: 10.5281/zenodo.15710771)

BibTeX

@inproceedings{boratto2026foodnexus,
  author = {Boratto, Ludovico and Fenu, Gianni and Marras, Mirko and Medda, Giacomo and Zedda, Giovanni},
  title = {FoodNexus: Massive Food Knowledge for Recommender Systems},
  booktitle = {Proceedings of the 48th European Conference on Information Retrieval},
  series = {ECIR '26},
  year = {2026},
  publisher = {Springer},
  doi = {10.1007/978-3-032-21321-1_52},
  url = {https://doi.org/10.1007/978-3-032-21321-1_52}
}