株式会社オブライト

Articles tagged "MoE"

7 articles

DeepSeek V4 Preview Released — 1.6T MoE / 1M-Token Context Open-Weight Model [April 2026]

Overview of DeepSeek V4 Preview, released on April 24, 2026: two open-weight Mixture-of-Experts variants (V4-Pro at 1.6T total / 49B active and V4-Flash at 284B / 13B), 1-million-token context, weights on Hugging Face, and rollout via API and chat — based on official information.

DeepSeek V4, Open-Source LLM, MoE

GLM-5.1 Complete Guide — #1 SWE-bench Pro Open-Source LLM [April 2026]

GLM-5.1 by Z.ai (released April 7, 2026) is the first open-source LLM to top SWE-bench Pro at 58.4%, surpassing GPT-5.4 (57.7%) and Claude Opus 4.6 (57.3%). This guide covers its 744B/40B-active MoE architecture, MIT license, 8-hour autonomous task capability, and setup via Ollama.

GLM-5.1, Z.ai, SWE-bench

Kimi K2.5 Complete Guide — 1 Trillion Parameter MIT-Licensed Open-Source LLM [2026]

Kimi K2.5, released by Moonshot AI on January 27, 2026, is a 1 trillion parameter (32B active) MoE model under the MIT License. It scores 76.8% on SWE-bench, 99.0% on HumanEval, and 87.6% on GPQA Diamond. This guide covers its architecture, hardware requirements, Ollama setup, and practical use cases.

Kimi K2.5, Moonshot AI, 1 Trillion Parameters

Mistral Small 4 Complete Guide — Unified Reasoning, Multimodal & Code in 119B MoE [2026]

Mistral Small 4, released March 2026, unifies reasoning, multimodal vision, and agentic coding in a 119B MoE model under Apache 2.0. It supports 11 languages, including Japanese. Full specs, setup guide, and model comparisons.

Mistral Small 4, MoE, Multimodal

MiniMax M2.5 Complete Guide — Lightning Attention Achieves 80.2% SWE-bench [2026]

MiniMax M2.5 achieves 80.2% on SWE-bench Verified using proprietary Lightning Attention in a 230B MoE model. Full breakdown of architecture, benchmarks, license terms, and setup instructions.

MiniMax M2.5, SWE-bench, Lightning Attention

Complete Guide to Rakuten AI 3.0 Architecture: Next-Gen Japanese LLM with MoE

A comprehensive analysis of Rakuten AI 3.0's Mixture of Experts architecture with 700B parameters. Explore the 8-expert configuration, 40B active parameter efficiency, and technical background behind achieving 8.88 on Japanese MT-Bench.

Rakuten AI 3.0, MoE, Mixture of Experts

NemoClaw's NIM Inference Microservices and Nemotron Models — Deployment Strategies from Edge to Cloud

A technical deep dive into NemoClaw's NIM inference microservices and Nemotron model family. We examine containerized API endpoints, elastic scaling, Nemotron 3 Super performance (120B parameters, MoE with 12B active), deployment comparisons across AWS, Azure, GCP, and on-premises, lightweight edge device operations, and partner integration use cases with Salesforce, CrowdStrike, and more.

NemoClaw, NIM, Nemotron