openai-community/gpt2-xl

📊 Model Parameters

Total Parameters 1,638,022,400
Context Length 1,024
Hidden Size 1600
Layers 48
Attention Heads 25
KV Heads 25
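
The 25 heads divide the 1,600-wide hidden state into 64 dimensions each, and the total above follows from the config values. A minimal Python sketch, assuming standard GPT-2 layer shapes (LayerNorm weight + bias, fused QKV projection, 4× MLP); note the total counts the LM head separately from the tied input embedding, which is how it exceeds the ~1.56B unique weights:

```python
# Parameter count for GPT-2 XL from the configuration values above.
vocab, ctx, hidden, layers = 50_257, 1_024, 1_600, 48
ffn = 4 * hidden                             # GPT-2 default intermediate size

embed = vocab * hidden + ctx * hidden        # token + position embeddings
per_layer = (
    2 * 2 * hidden                           # two LayerNorms (weight + bias)
    + hidden * 3 * hidden + 3 * hidden       # fused QKV projection + bias
    + hidden * hidden + hidden               # attention output projection
    + hidden * ffn + ffn                     # MLP up-projection
    + ffn * hidden + hidden                  # MLP down-projection
)
final_ln = 2 * hidden
lm_head = vocab * hidden                     # counted separately despite tying

total = embed + layers * per_layer + final_ln + lm_head
print(f"{total:,}")                          # 1,638,022,400
```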

💾 Memory Requirements

FP32 (Full) 6.10 GB
FP16 (Half) 3.05 GB
INT8 (Quantized) 1.53 GB
INT4 (Quantized) 781.1 MB
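
Each figure is the parameter count times bytes per weight, reported in binary units. A quick sketch:

```python
# Weight memory at each precision: parameters × bytes per weight.
params = 1_638_022_400
for name, bytes_per in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    size = params * bytes_per                # total bytes at this precision
    unit, div = ("GB", 2**30) if size >= 2**30 else ("MB", 2**20)
    print(f"{name}: {size / div:.2f} {unit}")
# FP32: 6.10 GB, FP16: 3.05 GB, INT8: 1.53 GB, INT4: 781.07 MB
```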

🔑 KV Cache (Inference)

Per Token (FP16) 307.20 KB
Max Context FP32 600.0 MB
Max Context FP16 300.0 MB
Max Context INT8 150.0 MB
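
The cache holds one key and one value vector per layer per token; with full multi-head attention (KV heads = attention heads), each vector spans the full 1,600-wide hidden state. A sketch of the arithmetic:

```python
# KV cache: 2 tensors (K and V) per layer, each hidden_size wide per token.
layers, hidden, ctx = 48, 1600, 1024

per_token_fp16 = 2 * layers * hidden * 2        # 2 bytes per FP16 value
print(f"{per_token_fp16 / 1e3:.2f} KB")         # 307.20 KB per token (decimal)

full_ctx_fp32 = 2 * layers * hidden * 4 * ctx   # 4 bytes per FP32 value
print(f"{full_ctx_fp32 / 2**20:.1f} MB")        # 600.0 MB at 1,024 tokens (binary)
```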

⚙️ Model Configuration

Core Architecture

Vocabulary Size 50,257
Number of Layers 48
Attention Heads 25
FFN Intermediate Size Not set (GPT-2 defaults to 4 × hidden = 6,400)

Context & Position

Max Context Length 1,024

Attention Configuration

Attention Dropout 10.0%
Tied Embeddings Yes

Activation & Normalization

Activation Function gelu_new
LayerNorm Epsilon 1e-05

Dropout (Training)

Residual Dropout 10.0%
Embedding Dropout 10.0%

Special Tokens

BOS Token ID 50,256
EOS Token ID 50,256
Pad Token ID Not set

Data Type

Model Dtype Not set

Layer Types

Attention
MLP/FFN
Normalization
Embedding
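
The configuration values above can be double-checked straight from the Hub without downloading any weights, e.g. with the transformers library:

```python
# Fetch only the config JSON for openai-community/gpt2-xl and print key fields.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("openai-community/gpt2-xl")
print(cfg.n_embd, cfg.n_layer, cfg.n_head)   # 1600 48 25
print(cfg.n_positions, cfg.vocab_size)       # 1024 50257
print(cfg.activation_function)               # gelu_new
print(cfg.layer_norm_epsilon)                # 1e-05
print(cfg.bos_token_id, cfg.eos_token_id)    # 50256 50256
print(cfg.attn_pdrop, cfg.resid_pdrop, cfg.embd_pdrop)  # 0.1 0.1 0.1
```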