mistralai/Ministral-8B-Instruct-2410

📊 Model Parameters

Total Parameters 8,019,808,256
Context Length 32,768
Hidden Size 4096
Layers 36
Attention Heads 32
KV Heads 8
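
The parameter total above can be reproduced from the architecture values listed under Model Configuration below. A minimal sketch of that arithmetic (plain Python, assuming the standard Mistral-style decoder layout: untied input/output embeddings, grouped-query attention projections, a gated SwiGLU FFN, and two RMSNorm weights per layer plus a final one):

```python
# Parameter-count sketch for Ministral-8B from its published config values.
# Assumes the standard Mistral-style decoder layout (untied embeddings,
# GQA attention projections, gated SwiGLU FFN, per-layer RMSNorm weights).

vocab_size   = 131_072
hidden       = 4_096
intermediate = 12_288
layers       = 36
heads        = 32
kv_heads     = 8
head_dim     = 128

embed   = vocab_size * hidden            # input embedding table
lm_head = vocab_size * hidden            # output projection (embeddings not tied)

attn = (hidden * heads * head_dim        # q_proj
        + hidden * kv_heads * head_dim   # k_proj
        + hidden * kv_heads * head_dim   # v_proj
        + heads * head_dim * hidden)     # o_proj

mlp = 3 * hidden * intermediate          # gate_proj, up_proj, down_proj

norms = 2 * hidden                       # pre-attention and pre-MLP RMSNorm weights

per_layer = attn + mlp + norms
total = embed + layers * per_layer + hidden + lm_head  # final RMSNorm adds `hidden`

print(f"{total:,}")  # 8,019,808,256
```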

💾 Memory Requirements

FP32 (Full) 29.88 GiB
FP16 (Half) 14.94 GiB
INT8 (Quantized) 7.47 GiB
INT4 (Quantized) 3.73 GiB
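
These figures are raw weight storage only (parameter count × bytes per parameter, reported in binary gibibytes); activations, the KV cache, and framework overhead come on top. A quick check of the arithmetic:

```python
# Weight-memory sketch: bytes per parameter at each precision, converted to GiB.
# Weights only; activations, KV cache, and runtime overhead are not included.

params = 8_019_808_256
bytes_per_param = {"FP32": 4, "FP16": 2, "INT8": 1, "INT4": 0.5}

for dtype, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3
    print(f"{dtype}: {gib:.2f} GiB")
# FP32: 29.88 GiB, FP16: 14.94 GiB, INT8: 7.47 GiB, INT4: 3.73 GiB
```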

🔑 KV Cache (Inference)

Per Token (FP16) 147.46 KB
Max Context FP32 9.00 GiB
Max Context FP16 4.50 GiB
Max Context INT8 2.25 GiB
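
The KV cache scales with context length rather than with weight size: each token stores one key and one value vector per layer, and with 8 KV heads (grouped-query attention) instead of 32, the cache is a quarter of what full multi-head attention would require. A sketch of the arithmetic behind the figures above:

```python
# KV-cache sketch: per-token and full-context cache size for Ministral-8B.
# 2 tensors (K and V) x layers x kv_heads x head_dim elements per token.

layers, kv_heads, head_dim = 36, 8, 128
context = 32_768

per_token_elems = 2 * layers * kv_heads * head_dim  # 73,728 elements

for dtype, nbytes in {"FP32": 4, "FP16": 2, "INT8": 1}.items():
    per_token = per_token_elems * nbytes
    full_ctx  = per_token * context
    print(f"{dtype}: {per_token / 1e3:.2f} KB/token, "
          f"{full_ctx / 1024**3:.2f} GiB at {context:,} tokens")
# FP16: 147.46 KB/token; full context: 9.00 GiB (FP32), 4.50 GiB (FP16), 2.25 GiB (INT8)
```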

⚙️ Model Configuration

Core Architecture

Vocabulary Size 131,072
Hidden Size 4,096
FFN Intermediate Size 12,288
Number of Layers 36
Attention Heads 32
Head Dimension 128
KV Heads 8

Context & Position

Max Context Length 32,768
Sliding Window Size 32,768
RoPE Base Frequency 100,000,000
Layer Attention Types [36 entries, one per layer]
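
The RoPE base here is unusually large (1e8, versus the 10,000 of the original rotary-embedding formulation), which slows how quickly the rotary frequencies decay across the 128-dimensional head and keeps positions distinguishable over the long context. A minimal sketch of how the base enters the standard RoPE inverse frequencies (generic rotary-embedding math, not code from any particular library):

```python
# Standard RoPE inverse frequencies for one attention head:
#   inv_freq[i] = base ** (-2i / head_dim), for i = 0 .. head_dim/2 - 1
# Position p rotates each (even, odd) dimension pair by angle p * inv_freq[i].

base, head_dim = 100_000_000.0, 128

inv_freq = [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

print(inv_freq[0])   # 1.0        -> fastest-rotating dimension pair
print(inv_freq[-1])  # ~1.33e-08  -> slowest-rotating pair (wavelength on the order of `base`)
```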

Attention Configuration

Tied Embeddings No
Attention Dropout 0%

Activation & Normalization

Activation Function silu
RMSNorm Epsilon 1e-05
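
The SiLU activation is used inside the gated (SwiGLU-style) feed-forward block common to the Mistral family, and RMSNorm replaces LayerNorm with a scale-only normalization. A short PyTorch sketch of both, using the dimensions and epsilon listed above (an illustrative re-implementation, not the model's own code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Scale-only normalization: x / rms(x) * weight, with eps = 1e-05."""
    def __init__(self, dim: int = 4096, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return x * rms * self.weight

class SwiGLUFFN(nn.Module):
    """Gated feed-forward: down(silu(gate(x)) * up(x)), 4096 -> 12288 -> 4096."""
    def __init__(self, dim: int = 4096, hidden: int = 12288):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up   = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))

x = torch.randn(1, 8, 4096)
print(SwiGLUFFN()(RMSNorm()(x)).shape)  # torch.Size([1, 8, 4096])
```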

Special Tokens

BOS Token ID 1
Pad Token ID Not set
EOS Token ID 2

Data Type

Model Dtype bfloat16
Layer Types: Attention, MLP/FFN, Normalization, Embedding
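
All of the configuration values above can be read programmatically from the model's config.json. A quick sketch using Hugging Face transformers; the field names are the standard Mistral config keys, and the repository may require accepting Mistral's license on the Hub before the file can be downloaded:

```python
from transformers import AutoConfig

# Reads config.json from the Hub (access to the repo may need to be granted first).
cfg = AutoConfig.from_pretrained("mistralai/Ministral-8B-Instruct-2410")

print(cfg.vocab_size)           # 131072
print(cfg.hidden_size)          # 4096
print(cfg.intermediate_size)    # 12288
print(cfg.num_hidden_layers)    # 36
print(cfg.num_attention_heads)  # 32
print(cfg.num_key_value_heads)  # 8
print(cfg.head_dim)             # 128
print(cfg.rope_theta)           # 100000000.0
print(cfg.rms_norm_eps)         # 1e-05
print(cfg.torch_dtype)          # torch.bfloat16
```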