microsoft/phi-2

📊 Model Parameters

Total Parameters 3,617,751,040
Context Length 2,048
Hidden Size 2,560
Layers 32
Attention Heads 32
KV Heads 32

💾 Memory Requirements

FP32 (Full) 13.48 GB
FP16 (Half) 6.74 GB
INT8 (Quantized) 3.37 GB
INT4 (Quantized) 1.68 GB
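
The figures above appear to be the parameter count multiplied by the bytes per weight, reported in binary gigabytes (1 GB = 1024^3 bytes). A minimal sketch under that assumption, counting weights only (no activations, KV cache, or quantization overhead such as scales and zero points); the function and constant names are illustrative and not taken from any library:

```python
# Bytes per parameter for each precision (INT4 packs two weights into one byte).
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

GIB = 1024 ** 3  # assumption: the page reports binary gigabytes

TOTAL_PARAMS = 3_617_751_040  # "Total Parameters" as listed above


def weight_memory_gib(num_params: int, precision: str) -> float:
    """Memory needed to hold the weights alone, in binary gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / GIB


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        size = weight_memory_gib(TOTAL_PARAMS, precision)
        print(f"{precision}: {size:.2f} GB")
```

Real deployments need headroom beyond these numbers, so they are best read as lower bounds.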

🔑 KV Cache (Inference)

Per Token (FP16) 327.68 KB (327,680 bytes)
Max Context FP32 1.25 GB
Max Context FP16 640.0 MB
Max Context INT8 320.0 MB
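
The KV cache figures are consistent with storing one key and one value vector per layer for every token. A rough sketch of that arithmetic, assuming head dimension = hidden size / attention heads and no padding or paging overhead; the function name is hypothetical:

```python
LAYERS = 32
HIDDEN_SIZE = 2560
ATTENTION_HEADS = 32
KV_HEADS = 32
MAX_CONTEXT = 2048

# Assumption: every head has the same width, hidden size / attention heads.
HEAD_DIM = HIDDEN_SIZE // ATTENTION_HEADS


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int) -> int:
    """Bytes cached per token: one key and one value vector in every layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


if __name__ == "__main__":
    per_token_fp16 = kv_cache_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, 2)
    print(f"Per token (FP16): {per_token_fp16:,} bytes")
    for name, width in (("FP32", 4), ("FP16", 2), ("INT8", 1)):
        total = MAX_CONTEXT * kv_cache_bytes_per_token(
            LAYERS, KV_HEADS, HEAD_DIM, width)
        print(f"Max context {name}: {total / 1024 ** 2:.1f} MiB")
```

Note that the per-token figure matches the page only in decimal kilobytes (327,680 bytes), while the max-context figures match in binary units (640 x 1024^2 bytes for FP16).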

⚙️ Model Configuration

Core Architecture

Vocabulary Size 51,200
Hidden Size 2,560
FFN Intermediate Size 10,240
Number of Layers 32
Attention Heads 32
KV Heads 32
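
The listed total of 3,617,751,040 parameters can be reproduced exactly from the values above with a bias-free estimate that assumes untied embeddings, four attention projections, three FFN matrices (a gated feed-forward block), and two norm weight vectors per layer. That match suggests, but does not confirm, that the total is an estimate of this kind. With two FFN matrices the same formula gives roughly 2.78 billion, which is closer to the 2.7 billion parameters Microsoft quotes for phi-2. The sketch below is illustrative only; `estimate_params` is a hypothetical helper, not part of any library:

```python
def estimate_params(vocab: int, hidden: int, ffn: int, layers: int,
                    ffn_matrices: int, tied_embeddings: bool = False) -> int:
    """Bias-free parameter estimate for a decoder-only transformer."""
    embedding = vocab * hidden
    # Untied models carry a separate output projection of the same shape.
    embeddings_total = embedding if tied_embeddings else 2 * embedding
    attention = 4 * hidden * hidden             # Q, K, V and output projections
    feed_forward = ffn_matrices * hidden * ffn  # 3 if gated, 2 otherwise
    norms = 2 * hidden                          # assumed: two weight vectors per layer
    return embeddings_total + layers * (attention + feed_forward + norms)


if __name__ == "__main__":
    gated = estimate_params(51_200, 2_560, 10_240, 32, ffn_matrices=3)
    plain = estimate_params(51_200, 2_560, 10_240, 32, ffn_matrices=2)
    print(f"Three FFN matrices: {gated:,}")
    print(f"Two FFN matrices:   {plain:,}")
```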

Context & Position

Max Context Length 2,048

Attention Configuration

Attention Dropout 0%
Tied Embeddings No

Activation & Normalization

Activation Function gelu_new
RMSNorm Epsilon 1e-05
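
For reference, `gelu_new` is the name Hugging Face Transformers uses for the tanh approximation of GELU. A scalar sketch in plain Python, meant only to show the formula; real implementations operate on tensors:

```python
import math


def gelu_new(x: float) -> float:
    """Tanh approximation of GELU:
    0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3)))
    """
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


if __name__ == "__main__":
    for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"gelu_new({value:+.1f}) = {gelu_new(value):.4f}")
```

The function behaves like the identity for large positive inputs and approaches zero for large negative inputs.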

Dropout (Training)

Residual Dropout 10%
Embedding Dropout 0%

Special Tokens

BOS Token ID 50256
EOS Token ID 50256

Data Type

Model Dtype float16
Layer Types:
Attention
MLP/FFN
Normalization
Embedding