Qwen 3.6-27B Released — Dense 27B Leads Agentic Coding, 40 tok/s on RTX 3090 [April 2026]
Qwen 3.6-27B Dense from Alibaba's Qwen Team, released April 22, 2026: 77.2 on SWE-bench Verified, 59.3 on Terminal-Bench 2.0 (matching Claude 4.5 Opus), 262K-to-1M context, Apache 2.0 license, and 40 tok/s on an RTX 3090 with Q4_K_M — summarized from official sources.
Qwen 3.6-27B at a glance
Alibaba's Qwen Team released Qwen 3.6-27B on April 22, 2026 — the first Dense (non-MoE) open-weight model in the Qwen 3.6 family. It's published on Hugging Face and ModelScope under Apache 2.0, allowing commercial use and fine-tuning, with substantial improvements aimed at agentic coding.
Reported benchmarks
Headline numbers from the official announcement:
| Benchmark | Qwen 3.6-27B |
|---|---|
| SWE-bench Verified | 77.2 |
| Terminal-Bench 2.0 | 59.3 (matches Claude 4.5 Opus) |
| QwenWebBench | 1487 |
The model reportedly beats both its predecessor Qwen 3.5-27B and the much larger Qwen 3.5-397B-A17B MoE on multiple tasks. Matching Claude 4.5 Opus on Terminal-Bench 2.0 with a 27B open-weight model is the headline result.
Architecture highlights
Although it is a 27B dense model rather than an MoE, Qwen 3.6-27B introduces a "Thinking Preservation" mechanism (per the official blog) and combines Gated DeltaNet linear attention with standard self-attention in a hybrid architecture. Native context is 262,144 tokens, extensible to roughly 1,010,000, and the model accepts text, image, and video inputs.
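The announcement does not publish kernel details, but the appeal of the Gated DeltaNet half of the hybrid is that its per-token state stays constant-size, so long contexts do not blow up memory the way a softmax-attention KV cache does. Below is a toy NumPy sketch of a gated delta-rule recurrence; the function names, scalar gates, and single-head simplification are ours, not Qwen's code:

```python
import numpy as np

def gated_delta_step(S, k, v, alpha, beta):
    """One step of a simplified gated delta rule.
    S: (d_v, d_k) running key-value state; k: (d_k,) unit-norm key;
    v: (d_v,) value; alpha: decay gate in (0,1]; beta: write strength."""
    # Decay the old state, then overwrite the slot addressed by k with v.
    return alpha * S + beta * np.outer(v - alpha * (S @ k), k)

def linear_attn_read(S, q):
    """Read from the state with query q -- O(d_v * d_k) per token,
    independent of sequence length (the linear-attention payoff)."""
    return S @ q

# Toy run: feed 100 "tokens"; the state never grows with context length.
rng = np.random.default_rng(0)
d_k, d_v = 8, 8
S = np.zeros((d_v, d_k))
for _ in range(100):
    k = rng.normal(size=d_k)
    k /= np.linalg.norm(k)
    v = rng.normal(size=d_v)
    S = gated_delta_step(S, k, v, alpha=0.99, beta=0.5)
print(S.shape)  # (8, 8)
```

With alpha=1 and beta=1 the update exactly stores v under a unit key k (so `S @ k` recovers v), which is the associative-memory intuition; interleaved standard self-attention layers then provide the precise token-level recall that pure linear attention lacks.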
Hardware — 40 tok/s on RTX 3090
Reported runs show ~40 tokens/sec on an RTX 3090 (24 GB) with Q4_K_M quantization, passing 10 of 10 functional tests. That level of throughput on a workstation-class consumer GPU makes a 27B dense open model genuinely deployable for SMBs. Apple Silicon (M3 Max 64GB / M4 Max-class) is also viable for quantized inference.
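A back-of-envelope estimate shows why 24 GB suffices. The ~4.8 bits/weight figure below is our approximation for llama.cpp's Q4_K_M mixed-precision blocks, not an official number:

```python
def quantized_weight_gib(params_b: float, bits_per_weight: float) -> float:
    """Approximate VRAM footprint of quantized weights in GiB."""
    return params_b * 1e9 * bits_per_weight / 8 / 2**30

# Q4_K_M averages roughly 4.8 bits/weight (mixed 4- and 6-bit blocks).
print(round(quantized_weight_gib(27, 4.8), 1))  # 15.1
```

At roughly 15 GiB of weights, about 9 GB of a 3090 remains for KV cache, activations, and context; the same formula puts a 9B model near 5 GiB, which is the gap behind the 9B → 27B VRAM caveat.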
How Oflight uses it
Qwen 3.6-27B is now a candidate local backend for OpenClaw (see OpenClaw 2026.4.23 release notes). It pairs well with internal-DX projects where confidential data shouldn't leave a private network but agentic coding is still required, and Apache 2.0 fine-tuning is a practical bonus. For deployment guidance, see AI Consulting.
FAQ
**Q1: Is it worth upgrading from Qwen 3.5-9B?** For coding-heavy workloads, likely yes. Note that 9B → 27B raises VRAM requirements significantly.

**Q2: Does it run on a Mac mini?** An M4 Pro with 32GB+ can run quantized Qwen 3.6-27B usefully. For M2 8GB-class machines, prefer Qwen 3.5-9B or Gemma 4 E4B.

**Q3: Is commercial use allowed?** Yes: Apache 2.0 permits commercial use, modification, and redistribution.