Abstract
Personalized food recommendation can promote healthier, sustainable eating, but current systems often rely on sparse and unstructured data, limiting semantic expressiveness and diverse personalization. In this paper, we propose FoodNexus, a large-scale knowledge graph with nearly one billion triples designed to enrich food recommendation with structured, nutrition-aware, and user-contextual information. We built it via a multi-stage pipeline that combines and augments the largest public dataset of user–recipe interactions, HUMMUS, with extensive metadata from Open Food Facts by linking recipes to concrete food products, extracting user traits from their biographies and reviews, and mapping both data sources onto the same ontology. Experiments show that FoodNexus enables richer, nutrition-sensitive evaluation of recommendations.
Motivation
Personalized food recommendation systems can play a crucial role in promoting healthier and more sustainable eating habits. However, current systems often rely on sparse and unstructured data, which limits their semantic expressiveness and ability to support diverse personalization needs.
Existing datasets for food recommendation face two key limitations:
- Lack of user-level constructs: Most datasets do not capture user preferences, dietary restrictions, or consumption traits
- Missing product granularity: Recipes cannot be aligned with real food products for nutritional and sustainability analysis
The Problem: No existing resource comprehensively supports nutrition, sustainability, and user constraints in food recommendation. Current systems cannot evaluate whether recommendations align with users' health goals or environmental concerns.
FoodNexus Knowledge Graph
FoodNexus is a large-scale knowledge graph designed to enrich food recommendation with structured, nutrition-aware, and user-contextual information. It combines and augments the largest public dataset of user-recipe interactions (HUMMUS) with extensive metadata from Open Food Facts.

Scale and Coverage
- ~979.5 million triples capturing food knowledge
- ~51 million entities across 11 entity types
- ~130.2 million attributes with 15 attribute types
- 11 relation types connecting entities
- 73% of entities and 70% of relations aligned with schema.org vocabulary for interoperability

Construction Pipeline
FoodNexus is built via a 5-stage pipeline that combines multiple data sources and enriches them with semantic knowledge:
Stage 1: Source Dataset Selection
We combine HUMMUS, the largest public dataset of user-recipe interactions, with Open Food Facts, a comprehensive database of food products with nutritional and sustainability information.
Stage 2: Combined Ontology Specification
We design a unified ontology that captures recipes, ingredients, products, users, and their relationships, aligned with schema.org vocabulary for interoperability.
Stage 3: Recipe-Product Entity Linking
Using the BAAI/bge-en-icl encoder with a 0.85 similarity threshold, we link recipe ingredients to concrete food products from Open Food Facts.
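The thresholded matching in this stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes ingredient and product names have already been embedded (the toy vectors below stand in for BAAI/bge-en-icl outputs), that similarity is cosine similarity, and that each ingredient is linked only to its single best-matching product. The function name `link_ingredients` and the example names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def link_ingredients(ingredient_vecs, product_vecs, threshold=0.85):
    """Map each ingredient to its most similar product, if similarity >= threshold."""
    links = {}
    for name, vec in ingredient_vecs.items():
        # Pick the product with the highest cosine similarity to this ingredient.
        best_product, best_sim = max(
            ((p, cosine(vec, pv)) for p, pv in product_vecs.items()),
            key=lambda pair: pair[1],
        )
        # Assumption: ingredients without a sufficiently close product stay unlinked.
        if best_sim >= threshold:
            links[name] = best_product
    return links

# Toy embeddings (hypothetical); real vectors would come from the encoder.
ingredients = {"tomato": [1.0, 0.0, 0.0], "basil": [0.0, 1.0, 1.0]}
products = {"canned tomatoes": [0.9, 0.1, 0.0], "olive oil": [1.0, 1.0, 0.0]}
links = link_ingredients(ingredients, products)
```

In this toy example only "tomato" clears the 0.85 threshold; "basil" has no product close enough and stays unlinked. At the scale of Open Food Facts, an exhaustive pairwise comparison like this would likely be replaced by an approximate nearest-neighbour index.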
Stage 4: User Consumption Traits Extraction
We employ the Qwen2.5 LLM to extract user traits (dietary preferences, restrictions, health goals) from their biographies and reviews.
Stage 5: Knowledge Graph Assembly
All extracted entities, relations, and attributes are assembled into a unified knowledge graph following the specified ontology.
Table 2: Relations with element names in FoodNexus.
| Source | Relation | Target | Description |
|---|---|---|---|
| UserGroup | publishesRecipe | Recipe | A user has published a recipe |
| UserGroup | publishesReview | UserReview | A user has published a review |
| UserGroup | hasConstraint | Tag | A user has a dietary constraint |
| UserReview | itemReviewed | Recipe | A review is associated with a recipe |
| Recipe | hasIndicator | Indicator | A recipe has a certain indicator |
| Product | hasIndicator | Indicator | A product has a certain indicator |
| Recipe | hasPart | Ingredient | A recipe includes a specific ingredient |
| Ingredient | isRelatedTo | Recipe | An ingredient is part of a recipe |
| Tag | suitableForDiet | Recipe | A constraint is compatible with a recipe |
| Tag | suitableForDiet | Product | A constraint is compatible with a product |
| Product | sameAs | Recipe | A product is similar/identical to a recipe |
| FoodProducer | produces | Product | A producer produces a product |
| Recipe | sameAs | Recipe | A recipe is an alternative to, or similar to, another recipe |
| Store | offers | Product | A store sells a certain product |
| Store | isPlaceIn | City | A store is located in a certain city |
| Product | countryOfAssembly | Country | A product is assembled in a country |
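To make the relation schema concrete, the sketch below encodes a few rows of Table 2 as (source type, relation, target type) signatures and filters candidate triples against them, as one might do during graph assembly. The `Type:id` identifier convention, the example identifiers, and the helper names are hypothetical; the actual FoodNexus serialization may differ.

```python
# A subset of the (source type, relation, target type) signatures from Table 2.
SCHEMA = {
    ("UserGroup", "publishesRecipe", "Recipe"),
    ("UserGroup", "hasConstraint", "Tag"),
    ("Recipe", "hasPart", "Ingredient"),
    ("Tag", "suitableForDiet", "Recipe"),
    ("Product", "sameAs", "Recipe"),
    ("Store", "offers", "Product"),
}

def entity_type(entity_id):
    """Hypothetical convention: identifiers look like 'Recipe:42'."""
    return entity_id.split(":", 1)[0]

def is_valid_triple(head, relation, tail):
    """Check a triple's type signature against the schema subset above."""
    return (entity_type(head), relation, entity_type(tail)) in SCHEMA

candidates = [
    ("UserGroup:7", "publishesRecipe", "Recipe:42"),
    ("Recipe:42", "hasPart", "Ingredient:basil"),
    ("Store:3", "offers", "Recipe:42"),  # rejected: stores offer products, not recipes
]

# Keep only triples whose signature appears in the schema.
graph = [triple for triple in candidates if is_valid_triple(*triple)]
```

Checking signatures in this way keeps ill-typed triples out of the assembled graph, which matters when entities come from two independently curated sources.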
Statistics and Comparison
Table 3: Comparative statistics between original data sources and FoodNexus.
| Data Source | # Triples | # Entities | # Attributes | # E. Types | # R. Types | # A. Types |
|---|---|---|---|---|---|---|
| HUMMUS | ~53.9M | ~12.3M | ~22.8M | 6 | 6 | 9 |
| HUMMUS (inferred) | ~57.9M | ~12.3M | ~22.8M | 6 | 7 | 14 |
| OFF | ~267.9M | ~38.5M | ~107.4M | 7 | 6 | 7 |
| FoodNexus (Ours) | ~979.5M | ~51.0M | ~130.2M | 11 | 11 | 15 |
E. Types: Entity Types; R. Types: Relation Types; A. Types: Attribute Types
Experimental Results
We evaluate FoodNexus through two research questions:
RQ1: Recommendation Utility
We compare several recommender systems on FoodNexus to assess recommendation quality:
Table 4: Performance comparison across recommendation models with Hit, Recall, and NDCG on top-10/20/50 lists.
| Model | Hit@10 | Hit@20 | Hit@50 | Recall@10 | Recall@20 | Recall@50 | NDCG@10 | NDCG@20 | NDCG@50 |
|---|---|---|---|---|---|---|---|---|---|
| Pop | 0.0035 | 0.0050 | 0.0077 | 0.0008 | 0.0010 | 0.0013 | 0.0008 | 0.0009 | 0.0009 |
| BPR | 0.0474 | 0.0777 | 0.1361 | 0.0136 | 0.0244 | 0.0467 | 0.0102 | 0.0133 | 0.0189 |
| NeuMF | 0.0562 | 0.0886 | 0.1491 | 0.0175 | 0.0276 | 0.0515 | 0.0131 | 0.0160 | 0.0220 |
| LightGCN | 0.0604 | 0.0930 | 0.1558 | 0.0184 | 0.0307 | 0.0557 | 0.0140 | 0.0173 | 0.0235 |
| MKR | 0.0413 | 0.0656 | 0.1126 | 0.0166 | 0.0272 | 0.0477 | 0.0111 | 0.0140 | 0.0188 |
| KTUP | 0.0446 | 0.0737 | 0.1260 | 0.0189 | 0.0315 | 0.0552 | 0.0118 | 0.0154 | 0.0210 |
| KGAT | 0.0580 | 0.0903 | 0.1453 | 0.0231 | 0.0372 | 0.0628 | 0.0152 | 0.0191 | 0.0252 |
| UserKGAT | 0.0187 | 0.0368 | 0.0802 | 0.0084 | 0.0164 | 0.0370 | 0.0046 | 0.0069 | 0.0116 |
| UserKTUP | 0.0388 | 0.0690 | 0.1260 | 0.0154 | 0.0295 | 0.0570 | 0.0093 | 0.0133 | 0.0198 |
| UserMKR | 0.0461 | 0.0744 | 0.1264 | 0.0187 | 0.0319 | 0.0556 | 0.0120 | 0.0157 | 0.0212 |
- LightGCN achieves the highest Hit scores
- KGAT achieves the best Recall and NDCG
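- UserMKR, the user-aware variant of MKR, outperforms MKR on all nine metrics, whereas UserKGAT and UserKTUP do not consistently improve on their base models; user-aware variants can capture useful KG patterns when properly designed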
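For reference, the three metrics in Table 4 can be computed per user as sketched below. The sketch assumes the common binary-relevance definitions: Hit@K indicates whether at least one relevant item appears in the top K, Recall@K is the fraction of relevant items retrieved, and NDCG@K uses a log2 position discount. The paper's evaluation framework may differ in details, such as how the ideal DCG is truncated, and the reported values would be averages over users.

```python
import math

def hit_at_k(ranked, relevant, k):
    """1.0 if any relevant item appears in the top-k list, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0

def recall_at_k(ranked, relevant, k):
    """Fraction of the user's relevant items that appear in the top-k list."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the list divided by the best achievable DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # pos is 0-based, so rank 1 gets discount log2(2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example (hypothetical recipe ids): one of two relevant recipes is at rank 3.
ranked = ["r3", "r7", "r1", "r9"]
relevant = {"r1", "r5"}
hit = hit_at_k(ranked, relevant, 3)
recall = recall_at_k(ranked, relevant, 3)
ndcg = ndcg_at_k(ranked, relevant, 3)
```

Because Hit@K only asks whether any relevant item is retrieved while Recall@K and NDCG@K reward retrieving more of them and ranking them higher, it is plausible for one model to lead on Hit and another on Recall and NDCG, as LightGCN and KGAT do here.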
RQ2: Nutritional and Contextual Analysis
FoodNexus enables nutrition-sensitive evaluation, revealing systematic biases in current recommender systems:


- Gluten bias: Models tend to over-represent gluten-containing items
- Packaging bias: Interaction-based systems over-represent plastic-packaged products
- Sustainability skew: Recommendations are systematically biased toward less sustainable items
- Nutritional signals: Models under-represent vitamins and potassium while showing mixed results for sugars and fats
Key Finding: FoodNexus enables richer, nutrition-sensitive evaluation of recommendations, revealing biases that would be invisible with traditional datasets.
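One simple way to quantify this kind of over- or under-representation is to compare how often an attribute appears among recommended items with its prevalence in the catalog. The sketch below is an assumption for illustration only: the ratio-based measure, the function name `representation_ratio`, and the toy items are hypothetical and not necessarily the measure used in the paper.

```python
def representation_ratio(recommended, catalog, has_attribute):
    """Share of recommended items with the attribute, divided by the catalog share.

    Values above 1 suggest over-representation; values below 1 suggest
    under-representation.
    """
    rec_share = sum(has_attribute(item) for item in recommended) / len(recommended)
    cat_share = sum(has_attribute(item) for item in catalog) / len(catalog)
    if cat_share == 0:
        raise ValueError("attribute never occurs in the catalog")
    return rec_share / cat_share

# Toy catalog with a gluten flag per item (hypothetical data).
catalog = {
    "pasta": {"gluten": True},
    "bread": {"gluten": True},
    "rice": {"gluten": False},
    "salad": {"gluten": False},
}

# Items pooled across users' top-K lists; repeats are kept on purpose so that
# frequently recommended items weigh more.
top_k = ["pasta", "bread", "pasta", "rice"]

ratio = representation_ratio(top_k, list(catalog), lambda item: catalog[item]["gluten"])
```

Here gluten-containing items make up half of the catalog but three quarters of the recommendations, giving a ratio of 1.5. A check of this kind needs the attribute to be attached to every item, which is what the recipe-product links and indicators in the graph provide.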
Key Contributions
- Large-Scale Knowledge Graph: Nearly one billion triples linking recipes, products, users, and nutritional information
- User Trait Extraction: Novel pipeline using LLMs to extract dietary preferences and health goals from user content
- Recipe-Product Linking: Semantic matching to connect recipes with real food products for nutritional analysis
- Nutrition-Aware Evaluation: Framework for assessing recommendation fairness regarding health and sustainability
- Schema.org Alignment: High interoperability with existing semantic web standards
Resources
The FoodNexus dataset and code are publicly available:
- GitHub Repository — Code, documentation, and usage examples
- Zenodo Dataset — Full dataset download (DOI: 10.5281/zenodo.15710771)
BibTeX
@inproceedings{boratto2026foodnexus,
author = {Boratto, Ludovico and Fenu, Gianni and Marras, Mirko and Medda, Giacomo and Zedda, Giovanni},
title = {FoodNexus: Massive Food Knowledge for Recommender Systems},
booktitle = {Proceedings of the 48th European Conference on Information Retrieval},
series = {ECIR '26},
year = {2026},
publisher = {Springer},
doi = {10.1007/978-3-032-21321-1_52},
url = {https://doi.org/10.1007/978-3-032-21321-1_52}
}