Mistral and Qwen are the two strongest "second-tier" open-weights model families in 2026 — below Llama on raw ecosystem reach but each leading on a specific dimension that matters in production. Mistral AI (France) leads on function-calling reliability, European multilingual performance, and Apache-2.0-licensed releases. Qwen (Alibaba) leads on code-specific work via Qwen Coder and on Asian multilingual performance, including Chinese, Japanese, and Korean. Together they cover use cases where Llama and Gemma are the wrong shape.
Mistral AI is the French AI lab founded in 2023 by ex-DeepMind and ex-Meta researchers. The company shipped Mistral 7B in late 2023, Mixtral 8x7B (the first popular open-weights mixture-of-experts model) in early 2024, Mistral Large later in 2024, and continues to ship at a rapid cadence. The pattern: most Mistral models are released under Apache 2.0 — genuinely open-source, commercially unrestricted — while a few flagship variants (Mistral Large 2, the latest commercial-tier models) ship under a commercial-research licence. The mix gives you Apache-licensed open weights for production and the flagship tier via paid hosted access.
Qwen is Alibaba's open-weights family. The Qwen 2.5 (late 2024) and Qwen 3 (2025) generations cover a wide size range — from Qwen 2.5 0.5B for edge devices up to Qwen 2.5 72B and Qwen 3 235B for frontier-adjacent work. Qwen 2.5 Coder is the specialised code variant and is widely considered the strongest open-weights coding model in its size class — meaningfully better than Llama or Gemma at code generation, refactoring, and software-engineering tasks. The general Qwen models also lead on Asian-language work given Alibaba's training-data emphasis.
Both families are widely hosted — Ollama, Together AI, Fireworks, Hugging Face, vLLM-via-self-host, and (for Qwen) the official Alibaba Cloud Model Studio. Both run anywhere Llama runs, with the same OpenAI-compatible API surface.
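Because every host above exposes the same OpenAI-compatible chat-completions shape, switching between families is usually just a model-identifier (and base-URL) swap. A minimal sketch of the request body — the model identifiers below are illustrative and should be checked against your host's catalogue:

```python
import json

def chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible /v1/chat/completions request body.

    The same payload shape works for Mistral and Qwen on Together,
    Fireworks, self-hosted vLLM, or Ollama's OpenAI-compatible
    endpoint; only the model identifier (and the base URL you POST
    to) changes.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

# Model identifiers are illustrative -- check your host's catalogue.
mistral_body = chat_request("mistral-large-2", "Summarise this diff.")
qwen_body = chat_request("qwen2.5-coder:32b", "Summarise this diff.")
print(json.dumps(mistral_body, indent=2))
```

The payload portability is the practical point: an eval harness written once can score Llama, Mistral, and Qwen candidates by iterating over model names.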
This page treats Mistral and Qwen as a complementary pair because that's how they're typically chosen in production: you don't pick "Mistral or Qwen"; you pick "Mistral for X, Qwen for Y". They cover different shapes of work that Llama and Gemma don't fit cleanly. Combining them keeps the comparison logic together and reflects the way teams actually evaluate the open-weights field. If either becomes load-bearing enough to warrant a deeper dedicated leaf in future, the split is straightforward.
**Mistral Large 2** — Top-tier, frontier-adjacent quality. ~123B parameters. Strongest function-calling reliability of any open-weights model. Multilingual including French, German, Italian, and Spanish at frontier quality. Commercial-research licence; hosted access via la Plateforme.
**Mistral Small** — Apache 2.0 licensed. The everyday open-weights pick when function calling matters. Mistral Nemo (12B) is a particularly strong "small" variant. Runs comfortably on Ollama for dev work.
**Mixtral** — Mixture-of-experts variants. Mixtral 8x22B has 141B total parameters, ~39B active per token — punches above its weight on quality vs latency. Apache 2.0. Production-popular when you want strong quality at lower active inference cost.
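The MoE economics can be made concrete with the numbers above: 141B total parameters but only ~39B active per token means per-token compute closer to a ~39B dense model. A back-of-envelope sketch — the parameter counts come from the text; the ~2 × active-params FLOPs rule of thumb is the standard dense-transformer approximation:

```python
# Mixtral 8x22B parameter counts as quoted above.
total_params = 141e9   # all experts, must fit in memory
active_params = 39e9   # experts actually routed to per token

# Rule of thumb: forward-pass FLOPs per token ~= 2 * active params.
flops_per_token_moe = 2 * active_params
flops_per_token_dense = 2 * total_params  # hypothetical 141B dense model

compute_ratio = flops_per_token_moe / flops_per_token_dense
print(f"Active fraction: {active_params / total_params:.0%}")
print(f"Per-token compute vs 141B dense: {compute_ratio:.0%}")
# Caveat: you still need VRAM for all 141B weights -- the saving is
# per-token compute and latency, not memory footprint.
```

That ~28% active fraction is why Mixtral "punches above its weight" on quality-per-latency, and also why the memory bill stays that of a large model.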
**Qwen 3** — Frontier-adjacent quality. Strong reasoning and multilingual including Chinese / Japanese / Korean. Apache 2.0 licensed. The right pick when Asian-language work or Chinese-training-data alignment is structurally needed.
**Qwen 2.5** — Mid-tier general purpose. Apache 2.0. Strong on Asian languages and code-adjacent tasks. Runs on Ollama, vLLM, Together — broad hosting support.
**Qwen 2.5 Coder** — The best open-weights coding model in its size class. Often outperforms Llama and Gemma on code benchmarks at equivalent parameter counts. Apache 2.0. The structural pick when you're building coding agents on open weights.
Mistral Large 2 consistently leads open-weights function-calling benchmarks — the model was specifically tuned for tool use. Qwen 2.5 Coder 32B is competitive with Claude Sonnet and GPT-4 on code-specific benchmarks — the strongest open-weights coding model below frontier-tier closed models. Mistral Nemo and small Qwen variants outperform similarly-sized Llama models on multilingual work for their respective language families. Llama and Gemma still win on general English benchmarks at most size points, but the second-tier families fill the niches Llama and Gemma don't.
The honest framing: both families are second-tier ecosystems vs Llama. You pick them when their specific strengths matter more than ecosystem reach. Choose Llama by default; choose Mistral or Qwen when their structural advantage matches your use case.
| Use case | Best pick | Why |
|---|---|---|
| General-purpose open-weights workhorse | Llama 3.x / 4 | Largest community ecosystem; broadest hosting |
| Frontier-lab safety tuning + multimodal | Gemma 3 | Google DeepMind training pipeline |
| Function-calling-heavy production agents | Mistral Large 2 / Mistral Nemo | Tuned for tool use; most reliable open-weights function-calling |
| European multilingual (FR / DE / IT / ES) | Mistral Large 2 | Frontier quality on European languages |
| Apache 2.0 strict licence requirement | Mistral Small / Nemo / Mixtral | Genuinely open-source vs Llama Community Licence |
| Code-generating agents on open-weights | Qwen 2.5 Coder | Strongest open-weights coder; close to closed-frontier on code benchmarks |
| Asian-language work (CN / JA / KO) | Qwen 2.5 / 3 | Alibaba training-data emphasis |
| Mixture-of-experts inference economics | Mixtral 8x22B | Lower active inference cost vs equivalent dense models |
| Frontier reasoning quality | Claude Opus / GPT-5 / Gemini Ultra | Closed frontier still leads on hardest reasoning |
For SA studios building production agents on open-weights with function-calling load-bearing — tool dispatch, structured outputs, multi-step workflows — Mistral Large 2 is meaningfully more reliable than Llama or Gemma at the tool-use surface. Hosted via Together / Fireworks; self-host via vLLM. The Apache-licensed variants (Nemo, Small, Mixtral) avoid the Llama 700M MAU edge cases entirely and the "Built with Llama" attribution requirement.
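When function calling is load-bearing, the agent side is essentially a dispatch loop over the model's `tool_calls`. A minimal sketch of that dispatch step using the OpenAI-compatible tool-call shape these hosts return — the tool name, its return value, and the fake response below are all hypothetical:

```python
import json

# Hypothetical local tool the agent exposes to the model.
def get_order_status(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"get_order_status": get_order_status}

def dispatch(tool_calls: list) -> list:
    """Execute each tool call and build the tool-role messages to
    feed back to the model on the next turn (OpenAI-compatible shape)."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON string, not a dict.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results

# Illustrative assistant tool_calls entry, as returned by the model.
fake_calls = [{
    "id": "call_1",
    "function": {"name": "get_order_status",
                 "arguments": '{"order_id": "A42"}'},
}]
tool_messages = dispatch(fake_calls)
print(tool_messages)
```

"Function-calling reliability" in practice means how often the model emits well-formed entries like `fake_calls` above — valid JSON in `arguments`, a real tool name — which is exactly the surface where Mistral Large 2 leads the open-weights field.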
For SA studios building internal coding agents (PR review, refactor assistants, codebase Q&A) on open-weights, Qwen 2.5 Coder is the right pick. Runs on Ollama for dev work; production via Together AI or self-hosted vLLM. The structural cost advantage over Claude Code or GPT-driven coding agents matters at high-volume internal usage — especially when many developers run many agent calls per day.
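The high-volume cost argument is easy to sanity-check with your own numbers. A back-of-envelope sketch — every usage figure and per-token price below is a placeholder to substitute your own values into, not a quote from any provider:

```python
# All figures are illustrative placeholders -- substitute your own.
devs = 40                    # developers using the coding agent
calls_per_dev_per_day = 50   # agent calls per developer per day
tokens_per_call = 4_000      # prompt + completion, rough average

daily_tokens = devs * calls_per_dev_per_day * tokens_per_call

def monthly_cost(price_per_million_tokens: float, workdays: int = 22) -> float:
    """Blended monthly spend at a given per-million-token price."""
    return daily_tokens / 1e6 * price_per_million_tokens * workdays

# Hypothetical blended prices per million tokens (check current pricing).
closed_frontier = monthly_cost(10.0)  # placeholder closed-model rate
hosted_open = monthly_cost(1.0)       # placeholder hosted-Qwen rate

print(f"Daily tokens: {daily_tokens:,}")
print(f"Hypothetical monthly spend: closed ${closed_frontier:,.0f} "
      f"vs open ${hosted_open:,.0f}")
```

The shape of the result, not the placeholder prices, is the point: at tens of millions of tokens per day, a roughly order-of-magnitude per-token gap compounds into a budget-line difference.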
Mistral and Qwen variants are less consistently available in africa-south1 (Vertex) or af-south-1 (Bedrock) than Llama. Most Mistral variants are reachable via Together AI / Fireworks / Mistral's own la Plateforme; most Qwen variants via Together AI / Fireworks / Alibaba Cloud directly. For strict SA-residency requirements, check current cloud provider model availability before committing — the situation evolves quickly. For development, Ollama on a maxed MacBook handles all three families locally regardless of the hosting question.
`ollama run mistral` or `ollama run qwen2.5-coder` — one-command access to the canonical variants. LangChain integrations are available via `langchain-mistralai` and `langchain-community`.