株式会社オブライト

Glossary: AI

45 terms

AI3
Alignment

Alignment / AIアライメント / AI整合

The research field and engineering practice of ensuring AI systems act in accordance with human intentions, values, and ethics. RLHF, DPO, and Constitutional AI are its primary technical implementations.
AI4
Benchmark

Benchmark / AI評価指標 / ベンチマーク評価

A standardized set of tasks or datasets used to measure and compare AI model capabilities. Prominent examples include MMLU (broad knowledge and reasoning), HumanEval (code generation), SWE-bench (real-world software engineering), and HLE (Humanity's Last Exam, frontier-difficulty questions).
AI2
BPE (Byte Pair Encoding)

BPE / Byte Pair Encoding / バイトペアエンコーディング

A subword tokenization algorithm that iteratively merges the most frequent byte/character pairs to build a vocabulary. Used in most major LLMs including the GPT family as the foundation of their tokenizers.
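
The merge loop described above can be sketched as follows. This is a toy character-level version that assumes whitespace-free words as input; the function name `learn_bpe_merges` is hypothetical, and production tokenizers add byte-level handling, pre-tokenization, and special tokens that are not shown here.

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Represent each distinct word as a tuple of symbols with its frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)
```

After two merges on this tiny corpus, the frequent prefix "low" has become a single vocabulary symbol.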
AI3
Chain-of-Thought (CoT)

Chain-of-Thought / CoT / 思考の連鎖

A prompting technique that elicits step-by-step reasoning from an LLM, often substantially improving accuracy on multi-step math, logic, and planning tasks. Even a short cue such as 'Let's think step by step' can trigger it (zero-shot CoT).
AI3
Constitutional AI

Constitutional AI / CAI / 憲法的AI

Anthropic's alignment approach in which an AI critiques and revises its own outputs according to a set of written principles ('constitution'), reducing harmful responses with less reliance on human labeling.
AI3
Context Window

Context Window / コンテキストウィンドウ / Context Length

The maximum number of tokens an LLM can process in a single inference call. Larger windows support longer documents and conversation histories but increase compute and memory costs.
AI3
Dense Model

Dense Model / Dense Transformer / 高密度モデル

A standard Transformer model where all parameters participate in processing every token, as opposed to MoE's sparse expert selection. Per-token compute scales roughly in proportion to parameter count.
AI4
Distillation (Knowledge Distillation)

Distillation / Knowledge Distillation / 知識蒸留

Training a smaller 'student' model to mimic the output distribution of a larger 'teacher' model, compressing capabilities into a lighter-weight model suited for edge deployment or cost reduction.
AI3
DPO (Direct Preference Optimization)

DPO / Direct Preference Optimization / 直接選好最適化

An alignment method that optimizes an LLM directly on human preference pairs (a chosen and a rejected response) without training a separate reward model, which typically makes it simpler to implement and more stable to train than RLHF.
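
As a rough illustration, the per-pair DPO objective can be written in a few lines. The sketch assumes the summed log-probabilities of each response under the trained policy and under a frozen reference model are already available; the function name `dpo_loss` and the default `beta=0.1` are illustrative choices, not values from this glossary.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Loss for one preference pair, given sequence log-probabilities."""
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written as log(1 + exp(-logits)).
    return math.log1p(math.exp(-logits))

# The loss drops when the policy shifts probability toward the chosen response.
print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
print(dpo_loss(-6.0, -4.0, -5.0, -5.0))
```

No reward model appears anywhere: the preference signal enters only through the difference of the two log-probability margins.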
AI4
Embedding

Embedding / 埋め込み表現 / ベクトル埋め込み

The transformation of text or other data into high-dimensional vectors where semantic proximity is preserved — the core representation technique underlying RAG, semantic search, and recommendation systems.
AI3
Eval Harness

Eval Harness / Evaluation Harness / 評価ハーネス

A framework for evaluating LLM performance across multiple benchmarks in a unified pipeline. EleutherAI's LM Evaluation Harness is one of the most widely used, supporting hundreds of tasks and custom evaluations.
AI2
Few-shot

Few-shot Learning / Few-shot Prompting / 少数ショット学習

A prompting technique that includes a small number of input-output examples (typically 2-10) in the prompt so the LLM infers the desired task format and replicates the pattern.
AI3
Fine-tuning

Fine-tuning / ファインチューニング / 微調整

Additional training of a pre-trained foundation model on task- or domain-specific data to specialize its behavior or style. LoRA and QLoRA have made fine-tuning accessible without full parameter updates.
AI4
Foundation Model

Foundation Model / 基盤モデル

A large model pre-trained on broad data that can be adapted to many downstream tasks via fine-tuning or prompting. The category includes LLMs, vision models, audio models, and multimodal systems.
AI3
GraphRAG

GraphRAG / Graph Retrieval-Augmented Generation / グラフRAG

An extension of RAG that combines knowledge graphs with vector search to capture entity relationships, enabling more contextually rich answers than pure vector similarity can provide.
AI3
Grounding

Grounding / グラウンディング / 事実根拠付け

The practice of anchoring LLM outputs to verifiable external sources (documents, databases, search results) to reduce hallucination. RAG is the dominant technical approach to grounding.
AI4
Hallucination

Hallucination / 幻覚 / AI幻覚

The phenomenon where an LLM confidently generates factually incorrect content — fabricated citations, wrong figures, nonexistent APIs — one of the most significant risks in production LLM deployments.
AI4
Inference

Inference / 推論 / モデル推論

The process of running a trained AI model on new inputs to produce predictions or generated outputs. In LLMs, this is the text-generation step — distinct from the training process.
AI3
KV Cache (Key-Value Cache)

KV Cache / Key-Value Cache / キーバリューキャッシュ

A memory-level optimization that caches the Key and Value vectors computed during Transformer attention, avoiding recomputation of earlier tokens and speeding up autoregressive inference.
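
A minimal single-head sketch of the idea follows. To keep it short it assumes the token vectors are used directly as queries, keys, and values (real models apply learned projections), and the names `attend` and `KVCache` are hypothetical.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention of one query over all keys and values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]

class KVCache:
    """Stores keys and values of earlier tokens so they are computed once."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, key, value, query):
        # Only the new token's key and value are added at each decode step.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cache = KVCache()
for t in tokens:
    out = cache.step(key=t, value=t, query=t)

# Same result as attending over the full sequence from scratch.
full = attend(tokens[-1], tokens, tokens)
print(out, full)
```

The trade-off is memory: the cache grows by one key and one value per layer and head for every generated token.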
AI5
LLM (Large Language Model)

Large Language Model / 大規模言語モデル

A neural-network language model with billions to trillions of parameters, capable of text generation, translation, summarization, and code synthesis — the foundation of modern generative AI.
AI3
LoRA (Low-Rank Adaptation)

LoRA / Low-Rank Adaptation / 低ランク適応

A parameter-efficient fine-tuning method that freezes the original model weights and learns only small low-rank adapter matrices, drastically cutting memory and compute requirements.
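
The savings are easy to see with a little arithmetic. The sketch below assumes a single weight matrix of shape d_out x d_in adapted with two factors, B (d_out x rank) and A (rank x d_in); the sizes 4096 and rank 8 are illustrative, not taken from any particular model.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in                 # updating W directly
    lora = rank * d_in + d_out * rank   # A is rank x d_in, B is d_out x rank
    return full, lora

full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora, full // lora)
```

Under these assumptions the adapter trains 256 times fewer parameters than a full update of the same matrix.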
AI4
MoE (Mixture of Experts)

MoE / Mixture of Experts / 混合エキスパート

A model architecture with multiple specialized sub-networks (experts) where only a sparse subset is activated per token, allowing parameter counts to scale without proportionally increasing compute.
AI4
Multimodal

Multimodal AI / マルチモーダルAI / Multimodal Model

An AI model or system that handles multiple modalities — text, images, audio, and video — within a single architecture. GPT-4o and Gemini are representative examples.
AI3
Post-training

Post-training / ポストトレーニング / 事後学習

The collective term for all training phases after pre-training — SFT, RLHF, DPO, and other alignment methods — that transform a raw language model into a helpful, safe assistant.
AI3
Pre-training

Pre-training / 事前学習 / プレトレーニング

The initial large-scale training phase in which an LLM learns from vast text corpora via next-token prediction, establishing the general language and world knowledge that downstream fine-tuning and alignment build upon.
AI4
Prompt Engineering

Prompt Engineering / プロンプトエンジニアリング / プロンプト設計

The practice of designing and refining LLM input text to reliably elicit desired outputs — covering instruction structuring, few-shot examples, role assignment, and output-format specification.
AI3
QLoRA (Quantized LoRA)

QLoRA / Quantized Low-Rank Adaptation / 量子化LoRA

A fine-tuning method that combines LoRA with 4-bit quantization (NF4) of the frozen base weights, enabling fine-tuning of a 65B-parameter model on a single 48 GB GPU and of smaller models on consumer GPUs.
AI4
Quantization

Quantization / 量子化 / モデル量子化

Converting model weights from 32-bit or 16-bit floats to lower-precision formats (8-bit, 4-bit, etc.) to reduce model size and memory footprint, enabling faster inference and local execution.
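
One simple scheme, shown here as a sketch, is symmetric per-tensor int8 quantization with a single scale factor. The function names are hypothetical, and practical methods usually work per channel or per block and often use calibration data.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized, restored)
```

Each weight now needs 8 bits instead of 16 or 32, at the cost of a rounding error of at most half a scale step.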
AI5
RAG (Retrieval-Augmented Generation)

Retrieval-Augmented Generation / 検索拡張生成

A technique that retrieves relevant information from an external knowledge base and grounds the LLM's response in it. It is the mainstream approach for connecting LLMs to up-to-date or proprietary data.
AI4
ReAct (Reasoning and Acting)

ReAct / Reasoning and Acting / 推論と行動

An agent framework that interleaves LLM reasoning traces with external actions (tool calls, searches, code execution), enabling the model to gather information and verify intermediate steps before concluding.
AI4
Red Teaming

Red Teaming / AI Red Teaming / レッドチーミング

A safety evaluation practice in which testers deliberately probe an AI model with adversarial, harmful, or manipulative prompts to surface vulnerabilities before deployment.
AI3
RLHF (Reinforcement Learning from Human Feedback)

RLHF / Reinforcement Learning from Human Feedback / 人間のフィードバックからの強化学習

A training paradigm in which human raters compare model outputs, a reward model is trained on those preferences, and the LLM is then optimized via RL to match human intent — the technique that made ChatGPT conversationally useful.
AI2
Self-Consistency

Self-Consistency / 自己整合性

A decoding strategy that samples multiple Chain-of-Thought reasoning paths for the same prompt and selects the most frequent final answer by majority vote, improving reliability over greedy decoding.
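
The voting step can be sketched as below, assuming the final answers have already been extracted from each sampled reasoning path. The function name and the sample answers are made up for illustration.

```python
from collections import Counter

def self_consistent_answer(final_answers):
    """Return the most frequent final answer and its share of the votes."""
    counts = Counter(final_answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(final_answers)

# Final answers extracted from five hypothetical sampled reasoning paths.
sampled = ["18", "18", "20", "18", "17"]
answer, agreement = self_consistent_answer(sampled)
print(answer, agreement)
```

The agreement ratio can double as a rough confidence signal: a low ratio suggests the reasoning paths disagree.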
AI3
Semantic Search

Semantic Search / 意味検索 / セマンティック検索

Search that ranks results by semantic similarity rather than keyword overlap, implemented via embedding vectors and approximate nearest-neighbor retrieval.
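
A brute-force sketch of the ranking step is shown below. The three-dimensional vectors are hand-written stand-ins for real embeddings, the function names are hypothetical, and exact search is used where a production system would use an approximate nearest-neighbor index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_n=2):
    """Return the ids of the top_n documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_n]]

# Toy "embeddings"; a real system would obtain these from an embedding model.
index = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "return an item": [0.8, 0.2, 0.1],
}
results = search([1.0, 0.1, 0.0], index)
print(results)
```

Note that "return an item" ranks highly despite sharing no keywords with a refund-related query; the match comes from vector proximity alone.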
AI5
SLM (Small Language Model)

Small Language Model / 小規模言語モデル

A language model in the hundreds-of-millions to low-billions parameter range. Key advantages are local execution, lower cost, and low latency — ideal for edge devices and domain-specific tasks.
AI3
Speculative Decoding

Speculative Decoding / 投機的デコーディング / スペキュレイティブデコーディング

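An inference acceleration technique where a small draft model proposes several candidate tokens, which a large target model then verifies in a single forward pass. Accepted tokens are kept and the first rejected one is resampled, so the output distribution matches the target model's; published results typically report speedups of around 2-3x.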
AI3
System Prompt

System Prompt / システムプロンプト / システム指示

An initial instruction sent to an LLM before user conversation begins, defining the model's role, tone, constraints, and output format — the primary control surface for application-level behavior.
AI2
Temperature

Temperature / 温度パラメータ / サンプリング温度

A sampling hyperparameter that scales the logits before softmax to control randomness in LLM text generation. Values near 0 produce nearly deterministic, consistent outputs; higher values yield more diverse and creative responses.
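
The effect is easiest to see on a small set of logits. This sketch divides the logits by the temperature before applying softmax; the function name and the example logits are illustrative.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use argmax for greedy decoding")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print(sharp, flat)
```

A low temperature concentrates probability on the top token, while a high temperature spreads it more evenly across the candidates.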
AI2
Tokenization

Tokenization / トークン化 / トークナイゼーション

The pre-processing step that splits text into tokens (sub-word units, characters, or symbols) that the LLM operates on. Tokenization design affects model performance, cost, and multilingual capability.
AI2
Top-k Sampling

Top-k / Top-k Sampling / 上位Kサンプリング

A decoding method that restricts next-token sampling to the k highest-probability tokens. Simple to implement, but because k is fixed regardless of the shape of the probability distribution, it can admit unlikely tokens when the distribution is peaked and exclude plausible ones when it is flat.
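
A minimal sketch of the filtering and sampling step, assuming the next-token probabilities are given as a plain list indexed by token id. The function name and the example distribution are hypothetical.

```python
import random

def top_k_sample(probs, k, rng=random):
    """Sample a token id from the k most probable entries of `probs`."""
    # Indices of the k highest-probability tokens.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = [probs[i] for i in top]
    # random.choices renormalizes the weights, so they need not sum to 1.
    return rng.choices(top, weights=weights, k=1)[0]

probs = [0.5, 0.3, 0.15, 0.05]
rng = random.Random(0)
samples = {top_k_sample(probs, k=2, rng=rng) for _ in range(200)}
print(samples)
```

With k=2 only token ids 0 and 1 can ever be drawn, however the remaining probability mass is distributed.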
AI1
Top-p (Nucleus Sampling)

Top-p / Nucleus Sampling / 核サンプリング

A decoding method that samples from the smallest set of tokens whose cumulative probability reaches or exceeds p (the 'nucleus'), so the number of candidate tokens adapts to the distribution, balancing quality with diversity.
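
The nucleus selection can be sketched as follows, assuming probabilities are given as a list indexed by token id and that the set is closed as soon as the running total reaches p. The function name and the two example distributions are illustrative.

```python
def nucleus(probs, p):
    """Return token ids in the smallest top-probability set with total mass >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept

peaked = nucleus([0.9, 0.05, 0.03, 0.02], p=0.8)
flat = nucleus([0.25, 0.25, 0.25, 0.25], p=0.8)
print(peaked, flat)
```

The same p keeps one candidate for the peaked distribution and all four for the flat one; sampling then proceeds over the kept ids with renormalized probabilities.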
AI3
Training

Training / 学習 / モデル学習

The process of optimizing model parameters by learning patterns from data. In the LLM context it encompasses pre-training, fine-tuning, and alignment training phases.
AI2
Tree-of-Thought (ToT)

Tree-of-Thought / ToT / 思考の木

A reasoning framework that explores multiple thought branches in a tree structure, evaluating and pruning nodes to find optimal solutions — an extension of Chain-of-Thought for more complex problem solving.
AI4
Vector Database

Vector Database / ベクトルデータベース / Vector DB

A database purpose-built to store high-dimensional embedding vectors and return nearest-neighbor results via approximate search (ANN) — the core storage layer in most RAG architectures.
AI2
Zero-shot

Zero-shot Learning / Zero-shot Prompting / ゼロショット学習

A prompting approach that provides only a task instruction with no examples, relying entirely on the model's pre-trained knowledge. Modern frontier LLMs handle many tasks well zero-shot.