
Why Microsoft Is Teaching AI to Understand Local Accents and Dialects
In a major move to support Europe’s digital future, Microsoft has unveiled two ambitious projects in Paris aimed at preserving the continent’s rich linguistic diversity and cultural legacy — while ensuring both are better represented in the next generation of artificial intelligence.
Announced as an expansion of its European Digital Commitments, these initiatives reflect Microsoft’s growing focus on building AI that doesn’t just speak English, but understands the complex tapestry of European identities, dialects, and histories.
As large language models (LLMs) increasingly shape how we search, create, and communicate, there’s a growing risk: much of today’s AI is trained on data dominated by American English, leaving smaller languages and regional cultures underrepresented — or worse, erased.
Brad Smith, Microsoft’s Vice Chair and President, put it clearly:
“AI that doesn’t understand Europe’s languages, histories, and values can’t fully serve its people, its businesses, or its future.”
These new efforts aim to change that — starting now.
🔹 The Language Gap in AI: Why It Matters
Although Europe is home to over 200 living languages, its voice is often faint in global AI systems. Research shows significant performance drops when leading LLMs process content in non-English European languages.
For example, the open-source model Llama 3.1 scores more than 15 points lower in Greek than in English, and 25 points lower in Latvian. It excels at English tasks but struggles with accuracy, context, and nuance in other European languages, which places it near the bottom of benchmarks for those languages.
This imbalance isn’t just technical — it affects real-world applications:
- Poor translation quality
- Misunderstood customer queries
- Limited accessibility for non-English speakers
- Reduced innovation in local markets
Without intervention, this gap could widen, putting smaller economies and communities at a disadvantage in the AI-driven economy.
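To make the size of that gap concrete, here is a minimal sketch of how a per-language shortfall against English could be computed from benchmark scores. The function name `gaps_vs_reference` and the three scores are illustrative assumptions: the numbers are placeholders chosen only to be consistent with the gaps quoted above ("more than 15 points" for Greek, 25 points for Latvian), not real benchmark results.

```python
def gaps_vs_reference(scores, reference="English"):
    """Return each language's score minus the reference language's score.

    A negative value means the model scores lower than it does in the
    reference language.
    """
    ref_score = scores[reference]
    return {
        lang: round(score - ref_score, 1)
        for lang, score in scores.items()
        if lang != reference
    }


# Placeholder scores for illustration only; these are NOT published results.
scores = {"English": 80.0, "Greek": 64.0, "Latvian": 55.0}

gaps = gaps_vs_reference(scores)

# List the languages with the largest shortfall first.
for lang, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{lang}: {gap:+.1f} points vs English")
```

Tracking the gap as a signed difference, rather than a raw score, makes it easy to see whether a new dataset or model release narrows the distance to English for a given language.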
🔹 Bridging the Divide: A New Push for Multilingual AI
To tackle this challenge, Microsoft is launching a dedicated effort through its Open Innovation Centre (MOIC) and AI for Good Lab, based in Strasbourg, France.
The goal? Build high-quality, ethically sourced multilingual datasets on Microsoft Azure to train AI models in ten under-represented European languages, including:
- Estonian
- Slovak
- Greek
- Maltese
- Alsatian
These teams will collaborate with universities, cultural institutions, linguists, and tech startups across the continent to gather authentic texts, transcripts, and spoken language samples — all essential for training accurate, culturally aware AI.
Crucially, Microsoft is not doing this alone.
Starting September 1, 2025, the company will open a call for proposals inviting organizations to contribute digital materials suitable for AI development. Selected partners will receive:
- Azure cloud credits
- Technical guidance from Microsoft engineers
- Access to AI tools and frameworks
All contributions will help expand the pool of open, reusable data — empowering researchers, developers, and public institutions to build inclusive technology.
🔹 Preserving History with AI: Notre-Dame Joins Culture AI Project
Alongside the language initiative, Microsoft is expanding its Culture AI program this autumn with a stunning new project: creating a high-fidelity digital twin of Notre-Dame Cathedral in Paris.
In partnership with the French Ministry of Culture and heritage digitization expert Iconem, the team will use advanced photogrammetry, drone imaging, and AI-powered reconstruction to capture every stone, arch, and stained-glass window of the 862-year-old Gothic masterpiece in extraordinary detail.
This effort goes beyond documentation: it’s about ensuring resilience. Just as digital records helped guide Notre-Dame’s physical restoration after the 2019 fire, this AI-enhanced replica could support future preservation, education, and virtual access for generations to come.
It’s also part of a broader mission. Previous Culture AI projects have already preserved:
- Ancient Olympia (Greece)
- Mont-Saint-Michel (France)
- St. Peter’s Basilica (Vatican)
- Normandy D-Day landing sites
Each site becomes a living archive — accessible online, explorable in mixed reality, and usable for research, tourism, and cultural education.
🔹 Rooted in Decades of Localization Experience
These initiatives don’t come out of nowhere. Microsoft has been investing in European localization for over 40 years.
Today:
- Windows supports more than 90 languages, including all official EU languages and regional variants like Basque, Catalan, Galician, Luxembourgish, and Valencian
- Microsoft 365 offers full Office interfaces in over 30 European languages
- Cloud services are available across multiple EU regions with strict compliance standards
Now, the company is going further — integrating local culture and language directly into the foundation of AI and cloud platforms.
But here’s what sets this apart: Microsoft emphasizes that these efforts are supportive, not proprietary. The data, tools, and expertise shared through these programs are intended to be open and collaborative, helping public institutions, educators, and innovators build on them freely.
🔹 Why This Matters for Europe’s Future
These aren’t just symbolic gestures. They’re strategic investments in Europe’s digital sovereignty, cultural continuity, and technological independence.
By making sure AI reflects Europe’s true diversity, Microsoft aims to:
- Empower small-language communities
- Boost competitiveness in AI innovation
- Strengthen trust in digital systems
- Safeguard irreplaceable heritage
In a world where technology shapes identity, communication, and power, representation matters — down to the last accent mark.
And now, thanks to these initiatives, Europe’s voice may finally be heard — loud and clear — in the age of artificial intelligence.