Alibaba is running the same play Red Hat perfected with Linux: give the software away, charge for everything around it.
This week Qwen, Alibaba's AI research arm, released Qwen3.6-35B-A3B under the Apache 2.0 open-source license — a coding model that runs efficiently on consumer hardware by activating only 3 billion of its 35 billion total parameters per inference. The same week, it released Qwen3.6-Plus as a proprietary, API-only offering accessible exclusively through Alibaba Cloud, according to Wikipedia's Qwen entry. Taken together, the dual release is a two-track strategy that reveals how Alibaba is positioning itself in the agentic coding race against Google, OpenAI, and Anthropic.
The open-weights model is the funnel. Developers who download, experiment with, and deploy Qwen3.6-35B-A3B on their own infrastructure are being steered toward a proprietary endpoint they will eventually need: the Plus tier, which carries usage-based API pricing on Alibaba's cloud. It is a classic land-grab play dressed in the language of open-source community stewardship.
The benchmarks support the strategy. On SWE-bench Verified — a gold-standard test of how well AI models handle real software engineering tasks drawn from open-source repositories — Qwen3.6-35B-A3B scored 73.4, compared to Google Gemma4-31B's 52.0. On Terminal-Bench 2.0, which measures multi-step agentic behavior at the terminal, it scored 51.5 against Gemma4-31B's 42.9. These are not marginal improvements. A model with 3 billion active parameters is beating Google's 31-billion-parameter dense model by margins that suggest the architecture has crossed a practical threshold.
The architecture in question is a sparse mixture-of-experts design, and it matters for the economics. A model that activates only 3 billion parameters per inference runs on hardware that costs a fraction of what a dense 31B model requires. An engineer with a consumer GPU can now run a coding agent that matches or exceeds what Google offers through its paid API.
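The compute gap can be sketched with back-of-the-envelope arithmetic. The snippet below uses the common rule-of-thumb approximation that forward-pass FLOPs per generated token scale as roughly twice the number of active parameters; the parameter counts are the figures cited above, and the exact ratio will vary with implementation details.

```python
# Rough comparison of per-token inference compute for a sparse
# mixture-of-experts model vs. a dense model, using the standard
# approximation: FLOPs per generated token ~= 2 * active parameters.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs for one generated token."""
    return 2 * active_params

sparse_active = 3e9   # Qwen3.6-35B-A3B: ~3B parameters active per inference
dense_active = 31e9   # Gemma4-31B: dense, so all 31B parameters are active

ratio = flops_per_token(dense_active) / flops_per_token(sparse_active)
print(f"Dense model needs ~{ratio:.1f}x the compute per generated token")
```

One caveat the arithmetic makes visible: memory does not shrink the same way. All 35 billion parameters of the sparse model must still be resident (or paged), so the savings show up in compute and latency rather than in VRAM footprint — which is why quantized checkpoints matter for the consumer-GPU story.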
Google has no immediate answer. Its Gemma4-31B is a dense model — every inference activates the full parameter count — and Google's release cadence has not produced a comparable open-weights sparse coding model. The gap Alibaba has opened is not just one of performance; it is architectural, and it translates directly into cost advantages at inference time.
The question is whether the funnel holds. If Qwen3.6-Plus appears on third-party model marketplaces or cloud providers outside Alibaba's control, the acquisition theory collapses — developers would have a managed option that does not route through Alibaba Cloud. Alibaba's incentive is to keep the Plus tier exclusive. How strictly that exclusivity holds will determine whether this strategy delivers the cloud adoption it is designed to produce.
What is already clear is that the dual-release pattern is not accidental. Qwen has released multiple model checkpoints as open weights while keeping larger or more capable variants proprietary — the same way Red Hat distributed RHEL as open-source while building an enterprise services business on top. The open-source label is real; so is the monetization layer underneath it.