TL;DR
Portugal’s AMÁLIA, a €5.5 million European Portuguese language model, is now operational. However, experts question its openness, native data sufficiency, and optimization goals, highlighting broader issues in Europe’s sovereign LLM projects.
Portugal’s €5.5 million AMÁLIA language model is now operational, making it a significant milestone in Europe’s sovereign AI efforts. While the model outperforms previous open models on Portuguese benchmarks, experts are raising three critical questions about its openness, native-language data sufficiency, and strategic goals, which could influence future national AI projects across Europe.
AMÁLIA is a consortium project launched by the Portuguese government in December 2024, involving approximately 60 researchers from Portugal’s top academic institutions. The base version, which continues training from the EuroLLM multilingual foundation, was completed in September 2025 and is currently accessible through the FCT’s IAedu platform to 450,000 academic users. It handles text in Portuguese up to the end of 2023, and the final version is expected by June 2026.
Technically, AMÁLIA was not trained from scratch: it continues the pre-training of an existing multilingual model, EuroLLM, with a focus on Portuguese data. It outperforms previous open models on Portuguese benchmarks and surpasses Qwen 3-8B on most tests, though it still trails Qwen on ALBA, its primary Portuguese benchmark. Extended pre-training used 107 billion tokens, of which 5.8 billion (about 5.5% of the total) came from Arquivo.pt, Portugal’s web archive. In supervised fine-tuning, 17-18% of the data was Portuguese.
AMÁLIA: The three hard questions.
Portugal spent €5.5M to build a European Portuguese LLM. The base version is operational, and it beats Qwen 3-8B on most pt-PT benchmarks. So why are the most important questions still unanswered?
Last month, Duarte O.Carmo published the sharpest public analysis of AMÁLIA — Portugal’s state-funded European Portuguese large language model. He prefaces his critique with the necessary diplomatic apparatus before doing what almost nobody else in the European-sovereign-LLM discourse has been willing to do publicly: asking hard questions about whether the work, as released, actually does what it set out to do. This piece is a structural extension of his analysis. The AMÁLIA case study exposes three hard questions every national LLM effort needs to answer publicly — and the broader European sovereign-LLM movement has been operating without explicit answers to any of them.
Three questions every national LLM effort needs to answer publicly.
Duarte O.Carmo’s framing maps cleanly onto the structural argument: each question applies concretely to AMÁLIA and, by extension, to the wider European sovereign-LLM movement.
The three questions are structurally linked. Q3 (optimization target) determines Q2 (data volume needed), which in turn conditions Q1 (whether openness is sufficient for community contribution). The movement as a whole benefits when these questions become part of standard methodology disclosure rather than exceptional critique.

107 billion tokens. 5.8 billion clearly pt-PT.
This is the most tractable of the three questions, and its answer is surprising. For a model whose entire stated purpose is to prioritize European Portuguese, the clearly pt-PT share of extended pre-training is about 5.5%. The implications cascade into every other question.
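The share can be checked with a few lines of arithmetic. The sketch below uses only the two rounded figures reported in this article (107 billion tokens in total, 5.8 billion from Arquivo.pt); the function and variable names are illustrative. With these rounded inputs the result is about 5.4%, in line with the roughly 5.5% cited. The small gap is presumably a rounding effect in the published figures, which is an assumption here rather than something the source states.

```python
# Figures as reported in the article, rounded, in billions of tokens.
TOTAL_TOKENS_B = 107.0
ARQUIVO_PT_TOKENS_B = 5.8


def share_percent(part: float, total: float) -> float:
    """Return `part` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Share of extended pre-training tokens that came from Arquivo.pt.
native_share = share_percent(ARQUIVO_PT_TOKENS_B, TOTAL_TOKENS_B)

# Everything that is not from Arquivo.pt, in billions of tokens.
other_tokens_b = TOTAL_TOKENS_B - ARQUIVO_PT_TOKENS_B

print(f"Arquivo.pt share: {native_share:.1f}%")
print(f"Remaining tokens: {other_tokens_b:.1f}B")
```

Put differently, roughly 101 billion of the 107 billion tokens came from sources other than Arquivo.pt, which is the figure the data-sufficiency question turns on.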

The Olmo standard. AMÁLIA’s current state.
The Allen Institute for AI’s Olmo project defines what “fully open” requires in operational terms. Olmo doesn’t lead frontier benchmarks, and that’s not the point. The point is to be the structural reference for openness. AMÁLIA’s “fully open source” claim should be measured against that operational standard.

Four strategic positions. AMÁLIA between two and three.
More than €100M in publicly disclosed funding has gone to European sovereign-LLM work across the major initiatives. The structural question every project faces is what competitive position it is actually staking out. There are four options, none mutually exclusive, but each requires different commitments.

Three standards. For AMÁLIA and the movement.
The structural critique generalizes beyond AMÁLIA. Italy, France, Germany, Switzerland, the OpenEuroLLM consortium, and every subsequent national project benefit from public discourse holding national LLM efforts to operational standards on openness, data accounting, and strategic positioning.
The European sovereign-AI agenda is a serious strategic project that deserves serious public discourse. O.Carmo’s analysis is what serious public discourse looks like. Appropriately diplomatic. Structurally rigorous. Willing to ask the hard questions in public when the public investment justifies it. More of this is needed — across every European sovereign-LLM project, not just AMÁLIA.
Critical Questions for Portugal’s AI Strategy
The development of AMÁLIA highlights broader issues facing Europe’s sovereign AI initiatives, including transparency about openness, the adequacy of native-language data, and strategic objectives. These questions are vital for shaping national policies, ensuring accountability, and setting realistic expectations for AI capabilities in smaller language communities.
As Europe invests heavily in indigenous models, how these questions are answered will influence future funding, research directions, and the continent’s ability to develop truly autonomous AI systems that serve local languages and contexts. The way Portugal addresses these issues could serve as a blueprint or cautionary tale for other nations pursuing similar projects.
Europe’s Sovereign LLM Efforts and Portugal’s Role
Portugal’s AMÁLIA is part of a broader European movement to develop sovereign language models, with initiatives in Italy, Germany, France, Switzerland, Norway, and Sweden. These projects aim to reduce reliance on US or Chinese models, foster local AI ecosystems, and promote linguistic and cultural sovereignty.
Most of these efforts face similar structural questions about openness, native data sufficiency, and strategic priorities, but public discourse often focuses on individual model launches rather than the systemic challenges they reveal. Portugal’s investment and public deployment of AMÁLIA bring these issues into sharper focus, especially given the model’s public accessibility and government backing.
“AMÁLIA is an impressive piece of work, but it raises fundamental questions about openness and native data sufficiency that need honest answers.”
— Duarte O.Carmo
Unanswered Questions About AMÁLIA’s Openness and Goals
It remains unclear how open AMÁLIA truly is in practice, especially regarding access to model weights and training data transparency. Additionally, the adequacy of Portuguese native data and the strategic priorities guiding the model’s development are still under discussion. The final version, due in June 2026, may address some of these gaps, but current information is limited.
Next Steps for Portugal’s AI Model Development
The final version of AMÁLIA is scheduled for release in June 2026, which is expected to clarify some of the current uncertainties. Researchers and policymakers will closely monitor its capabilities, openness, and strategic alignment. Portugal’s government and research institutions are also likely to engage in broader discussions about transparency and data sufficiency, influencing future European projects.
Key Questions
What is the current status of AMÁLIA?
The base version is operational and accessible to academic users through the FCT’s IAedu platform, and it has demonstrated strong performance on Portuguese benchmarks. The final version is expected in June 2026.
What are the main concerns about AMÁLIA?
Experts are questioning how open the model truly is, whether the native Portuguese data used is sufficient, and what the strategic priorities are for its development and deployment.
Why does this matter for Europe’s AI landscape?
AMÁLIA exemplifies the broader systemic challenges facing European sovereign AI projects, which could influence policy, funding, and the future of indigenous language models across the continent.
Will the final version address current doubts?
It is possible that the June 2026 release will clarify issues around openness and data strategy, but this remains to be seen as the project progresses.
How does Portugal’s approach compare to other European efforts?
Portugal’s approach involves building on an existing multilingual foundation rather than training from scratch, differing from countries like Italy that focus on native-language training from the ground up. This strategic choice has implications for openness and data requirements.
Source: ThorstenMeyerAI.com