openai-community/gpt2-xl

📊 Model Parameters

Total Parameters 1,638,022,400
Context Length 1,024
Hidden Size 1600
Layers 48
Attention Heads 25
KV Heads 25
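
The 25 heads divide the 1,600-wide hidden state into 64 dimensions each, and the total above follows from the config values. A minimal Python sketch, assuming standard GPT-2 layer shapes (LayerNorm weight + bias, fused QKV projection, 4× MLP); note the total counts the LM head separately from the tied input embedding, which is how it exceeds the ~1.56B unique weights:

```python
# Parameter count for GPT-2 XL from the configuration values above.
vocab, ctx, hidden, layers = 50_257, 1_024, 1_600, 48
ffn = 4 * hidden                             # GPT-2 default intermediate size

embed = vocab * hidden + ctx * hidden        # token + position embeddings
per_layer = (
    2 * 2 * hidden                           # two LayerNorms (weight + bias)
    + hidden * 3 * hidden + 3 * hidden       # fused QKV projection + bias
    + hidden * hidden + hidden               # attention output projection
    + hidden * ffn + ffn                     # MLP up-projection
    + ffn * hidden + hidden                  # MLP down-projection
)
final_ln = 2 * hidden
lm_head = vocab * hidden                     # counted separately despite tying

total = embed + layers * per_layer + final_ln + lm_head
print(f"{total:,}")                          # 1,638,022,400
```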

💾 Memory Requirements

FP32 (Full) 6.10 GB
FP16 (Half) 3.05 GB
INT8 (Quantized) 1.53 GB
INT4 (Quantized) 781.1 MB
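
Each figure is the parameter count times bytes per weight, reported in binary units. A quick sketch:

```python
# Weight memory at each precision: parameters × bytes per weight.
params = 1_638_022_400
for name, bytes_per in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    size = params * bytes_per                # total bytes at this precision
    unit, div = ("GB", 2**30) if size >= 2**30 else ("MB", 2**20)
    print(f"{name}: {size / div:.2f} {unit}")
# FP32: 6.10 GB, FP16: 3.05 GB, INT8: 1.53 GB, INT4: 781.07 MB
```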

🔑 KV Cache (Inference)

Per Token (FP16) 307.20 KB
Max Context FP32 600.0 MB
Max Context FP16 300.0 MB
Max Context INT8 150.0 MB
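
The cache holds one key and one value vector per layer per token; with full multi-head attention (KV heads = attention heads), each vector spans the full 1,600-wide hidden state. A sketch of the arithmetic:

```python
# KV cache: 2 tensors (K and V) per layer, each hidden_size wide per token.
layers, hidden, ctx = 48, 1600, 1024

per_token_fp16 = 2 * layers * hidden * 2        # 2 bytes per FP16 value
print(f"{per_token_fp16 / 1e3:.2f} KB")         # 307.20 KB per token (decimal)

full_ctx_fp32 = 2 * layers * hidden * 4 * ctx   # 4 bytes per FP32 value
print(f"{full_ctx_fp32 / 2**20:.1f} MB")        # 600.0 MB at 1,024 tokens (binary)
```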

⚙️ Model Configuration

Core Architecture

Vocabulary Size 50,257
Number of Layers 48
Attention Heads 25
FFN Intermediate Size Not set (GPT-2 defaults to 4 × hidden = 6,400)

Context & Position

Max Context Length 1,024

Attention Configuration

Attention Dropout 10.0%
Tied Embeddings Yes

Activation & Normalization

Activation Function gelu_new
LayerNorm Epsilon 1e-05

Dropout (Training)

Residual Dropout 10.0%
Embedding Dropout 10.0%

Special Tokens

BOS Token ID 50,256
EOS Token ID 50,256
Pad Token ID Not set

Data Type

Model Dtype Not set

Layer Types

Attention
MLP/FFN
Normalization
Embedding
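
The configuration values above can be double-checked straight from the Hub without downloading any weights, e.g. with the transformers library:

```python
# Fetch only the config JSON for openai-community/gpt2-xl and print key fields.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("openai-community/gpt2-xl")
print(cfg.n_embd, cfg.n_layer, cfg.n_head)   # 1600 48 25
print(cfg.n_positions, cfg.vocab_size)       # 1024 50257
print(cfg.activation_function)               # gelu_new
print(cfg.layer_norm_epsilon)                # 1e-05
print(cfg.bos_token_id, cfg.eos_token_id)    # 50256 50256
print(cfg.attn_pdrop, cfg.resid_pdrop, cfg.embd_pdrop)  # 0.1 0.1 0.1
```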