株式会社オブライト

Articles tagged "Apache 2.0"

6 articles

AI · 2026-06-11
DiffusionGemma Deep Dive — Google DeepMind's June 10, 2026 Open-Weight Text-Diffusion LLM, Same Backbone as Gemma 4 26B (A4B MoE), Up to 4× Faster Than AR Counterparts, Apache 2.0, With an Honest "Quality Trails AR" Disclosure
A primary-source deep dive on **DiffusionGemma** (`google/diffusiongemma-26B-A4B-it`, 25.2B total / 3.8B active MoE), released June 10, 2026 by Google DeepMind in coordination with NVIDIA. Grounded in the [official Google blog](https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/), [ai.google.dev model card](https://ai.google.dev/gemma/docs/diffusiongemma/model_card), [Hugging Face card](https://huggingface.co/google/diffusiongemma-26B-A4B-it), and [NVIDIA's blog](https://blogs.nvidia.com/blog/rtx-ai-garage-local-gemma-diffusion/). Where autoregressive (AR) models generate one token at a time, left to right, diffusion language models (DLMs) **denoise a 256-token canvas in parallel into final text**. Each forward pass commits 15–20 tokens, with up to 48 denoising steps; reported throughput is 1,000+ tok/sec on H100 and 700+ tok/sec on RTX 5090, roughly 3.5–4× that of the AR Gemma 4 counterpart. Crucially, Google **openly states that quality lags AR**: MMLU Pro 77.6 vs 82.6, GPQA 73.2 vs 82.3, MMMU Pro 54.3 vs 73.8 (DiffusionGemma vs the AR model). Released under Apache 2.0 and distributed via Hugging Face / Vertex AI / NVIDIA NIM, it is presented as the first large-scale open-weight diffusion LLM in the industry. The column covers practical implications for Japanese enterprises (on-prem internal agents, code editing, low-latency workflows) and positioning against Mercury (Inception Labs), LLaDA, and Gemini Diffusion.
Google DeepMind · Gemma 4 · DiffusionGemma
AI · 2026-06-04
Gemma 4 12B Deep Dive — The Encoder-Free Multimodal LLM That Runs on a 16GB Laptop Under Apache 2.0 (June 3, 2026)
A deep dive into Gemma 4 12B, released by Google DeepMind on June 3, 2026, grounded in the [official announcement](https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/) and [Developer Guide](https://developers.googleblog.com/gemma-4-12b-the-developer-guide/). The standout property is its **encoder-free multimodal architecture**: the prior vision encoder (~550M parameters) is replaced with a 35M-parameter lightweight embedder plus a single matrix multiplication, and the 12-layer Conformer audio encoder is removed entirely, with raw audio projected straight into the LLM's embedding space. The model runs on a 16GB VRAM laptop (Copilot+ PC or Apple Silicon Mac), ships under Apache 2.0, and was available through Hugging Face / Ollama / LM Studio / MLX / Vertex AI on day one. The article covers the architectural rationale, the "approaches 26B MoE at less than half the memory" benchmark claim, positioning within the Gemma 4 family (E2B / E4B / 26B / 31B), competitive comparison against Llama 4 / Qwen 3.5 / Phi-5, and the fit with Japanese enterprise on-prem AI, voice workflows, and data-sovereignty requirements.
Gemma 4 · Gemma 4 12B · Google DeepMind
AI · 2026-05-21
Gemma 4 and the Google AI Studio Overhaul — What Google I/O 2026 Means for Open-Weight LLMs and Enterprise Adoption in Japan
Google I/O 2026 put a fresh spotlight on Gemma 4 (2B–31B, 256K context, 140 languages, Apache 2.0) and a major Google AI Studio overhaul featuring Kotlin vibe coding, one-click Cloud Run deployment, and the Managed Agents API. This column covers the full picture: hardware requirements, competitive positioning against Llama 4 and Qwen, and practical adoption guidance for Japanese enterprises.
Google · Gemma 4 · Google AI Studio
AI · 2026-05-01
Qwen 3.6-27B Released — Dense 27B Leads Agentic Coding, 40 tok/s on RTX 3090 [April 2026]
Qwen 3.6-27B Dense from Alibaba's Qwen Team, released April 22, 2026, summarized from official sources: 77.2 on SWE-bench Verified, 59.3 on Terminal-Bench 2.0 (matching Claude 4.5 Opus), 262K-to-1M context, Apache 2.0 license, and 40 tok/s on an RTX 3090 with Q4_K_M quantization.
Qwen 3.6 · Alibaba · Open-Source LLM
AI · 2026-04-10
Mistral Small 4 Complete Guide — Unified Reasoning, Multimodal & Code in 119B MoE [2026]
Mistral Small 4, released March 2026, unifies reasoning, multimodal vision, and agentic coding in a 119B MoE model under Apache 2.0. It supports 11 languages, including Japanese. The guide covers full specs, setup, and model comparisons.
Mistral Small 4 · MoE · Multimodal
AI · 2026-03-17
Complete Guide to Rakuten AI 3.0 Architecture: Next-Gen Japanese LLM with MoE
A comprehensive analysis of Rakuten AI 3.0's 700B-parameter Mixture of Experts architecture. It examines the 8-expert configuration, the efficiency of 40B active parameters, and the technical background behind the model's 8.88 score on Japanese MT-Bench.
Rakuten AI 3.0 · MoE · Mixture of Experts