microsoft/Phi-3-mini-4k-instruct

📊 Model Parameters

Total Parameters 3,722,578,944
Context Length 4,096
Hidden Size 3,072
Layers 32
Attention Heads 32
KV Heads 32

💾 Memory Requirements

FP32 (Full) 13.87 GB
FP16 (Half) 6.93 GB
INT8 (Quantized) 3.47 GB
INT4 (Quantized) 1.73 GB
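
The figures above appear to follow from multiplying the parameter count by the storage width of each format. The sketch below reproduces them under two assumptions that the page does not state: "GB" means 2^30 bytes, and the quantized formats carry no overhead for scales or zero points. The helper name `weight_memory_gib` is hypothetical.

```python
GIB = 2**30  # assumption: the page's "GB" is a binary gigabyte (GiB)

TOTAL_PARAMS = 3_722_578_944  # "Total Parameters" as listed above


def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Memory for the weights alone, ignoring activations and the KV cache."""
    return num_params * bits_per_param / 8 / GIB


for label, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: {weight_memory_gib(TOTAL_PARAMS, bits):.2f} GB")
```

Real quantized checkpoints usually come out somewhat larger than the INT8 and INT4 rows, because per-group scales are stored alongside the weights.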

🔑 KV Cache (Inference)

Per Token (FP16) 393.22 KB
Max Context FP32 3.00 GB
Max Context FP16 1.50 GB
Max Context INT8 768.0 MB
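
The KV cache numbers can be checked the same way: each token stores one key and one value vector per layer and per KV head. This sketch assumes the head dimension is Hidden Size divided by Attention Heads (96 here), which the page does not list. The function name is hypothetical. The per-token figure matches only in decimal kilobytes, while the max-context figures match only in binary units, so the page seems to mix the two.

```python
LAYERS = 32
HIDDEN_SIZE = 3072
ATTENTION_HEADS = 32
KV_HEADS = 32
CONTEXT_LENGTH = 4096

# Assumption: head_dim = hidden_size / attention_heads (not listed on the page).
HEAD_DIM = HIDDEN_SIZE // ATTENTION_HEADS


def kv_cache_bytes_per_token(bytes_per_value: int) -> int:
    """Bytes for one token: a key and a value vector in every layer."""
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_value


per_token_fp16 = kv_cache_bytes_per_token(2)
print(f"Per token (FP16): {per_token_fp16 / 1000:.2f} KB")  # decimal KB

for label, width in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    total = kv_cache_bytes_per_token(width) * CONTEXT_LENGTH
    print(f"Max context {label}: {total / 2**30:.2f} GiB")  # binary units
```

Because KV Heads equals Attention Heads, grouped-query attention gives no savings here; a model with fewer KV heads would shrink every line proportionally.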

⚙️ Model Configuration

Core Architecture

Vocabulary Size 32,064
Hidden Size 3,072
FFN Intermediate Size 8,192
Number of Layers 32
Attention Heads 32
KV Heads 32
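
As a rough cross-check, the core architecture values can be turned into a parameter count. The sketch assumes a Llama-style decoder layout that the page does not spell out: bias-free query, key, value, and output projections, a gated MLP with three weight matrices, two RMSNorm weight vectors per layer, and one final norm. Under those assumptions, counting the embedding matrix once reproduces the listed total. A separate, untied output projection would add another vocabulary-by-hidden matrix on top.

```python
VOCAB_SIZE = 32_064
HIDDEN_SIZE = 3_072
FFN_SIZE = 8_192
LAYERS = 32

# Assumed layout (not stated on the page): Llama-style decoder, no biases.
attention = 4 * HIDDEN_SIZE * HIDDEN_SIZE   # q, k, v, o projections
mlp = 3 * HIDDEN_SIZE * FFN_SIZE            # gate, up, down projections
norms = 2 * HIDDEN_SIZE                     # two RMSNorm weights per layer
per_layer = attention + mlp + norms

embedding = VOCAB_SIZE * HIDDEN_SIZE
final_norm = HIDDEN_SIZE

# Embedding matrix counted once.
total_single_embedding = LAYERS * per_layer + embedding + final_norm
# With a separate output projection of the same shape as the embedding.
total_with_output_head = total_single_embedding + embedding

print(f"{total_single_embedding:,}")
print(f"{total_with_output_head:,}")
```

The first figure equals the Total Parameters value shown at the top of the page, which suggests that the tool counts the vocabulary matrix a single time.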

Context & Position

Max Context Length 4,096
RoPE Base Frequency 10000.0
Sliding Window Size 2,047

Attention Configuration

Attention Dropout 0%
Tied Embeddings No
Attention Bias No

Activation & Normalization

Activation Function silu
RMSNorm Epsilon 1e-05

Dropout (Training)

Residual Dropout 0%
Embedding Dropout 0%

Special Tokens

BOS Token ID 1
EOS Token ID 32000
Pad Token ID 32000

Data Type

Model Dtype bfloat16
Layer Types: Attention, MLP/FFN, Normalization, Embedding