Whisper - OpenAI's speech recognition models:
whisper-base
whisper-tiny
whisper-small
whisper-medium
whisper-large-v2
whisper-large-v3
whisper-large-v3-turbo

GPT - OpenAI's Generative Pre-trained Transformers:
openai-gpt
gpt2
gpt2-medium
gpt2-large
gpt2-xl
gpt-oss-20b
gpt-oss-120b
openai-community/gpt2-large
📊 Model Parameters
Total Parameters: 838,359,040
Context Length: 1,024
Hidden Size: 1,280
Layers: 36
Attention Heads: 20
KV Heads: 20
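As a cross-check, these shapes reproduce the listed total with the standard GPT-2 large dimensions, provided the output (LM head) projection is counted as its own matrix alongside the tied input embedding; a minimal sketch in plain Python:

```python
# Rough parameter-count cross-check for GPT-2 large shapes
# (vocab 50,257, context 1,024, hidden 1,280, 36 layers).
# Assumption: the displayed total counts the LM head projection
# separately, even though the weights are tied to the input embedding.

vocab, n_ctx, d, n_layer = 50_257, 1_024, 1_280, 36

wte = vocab * d                      # token embedding
wpe = n_ctx * d                      # position embedding
per_block = (
    2 * d                            # ln_1 (weight + bias)
    + (d * 3 * d + 3 * d)            # c_attn: fused Q/K/V projection
    + (d * d + d)                    # attention output projection
    + 2 * d                          # ln_2
    + (d * 4 * d + 4 * d)            # MLP up-projection (4x hidden)
    + (4 * d * d + d)                # MLP down-projection
)
ln_f = 2 * d                         # final LayerNorm
lm_head = vocab * d                  # output projection (tied to wte)

total = wte + wpe + n_layer * per_block + ln_f + lm_head
print(f"{total:,}")                  # 838,359,040
```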
💾 Memory Requirements
FP32 (Full): 3.12 GB
FP16 (Half): 1.56 GB
INT8 (Quantized): 799.5 MB
INT4 (Quantized): 399.8 MB
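These figures are simply the parameter count multiplied by the bytes per weight (weights only, no activations or runtime overhead); a quick sanity check using the 1024-based GB/MB units shown above:

```python
# Weight-memory estimate: parameter count x bytes per parameter.
# Reproduces the figures above when 1024-based GB/MB are used.

params = 838_359_040
GIB = 1024 ** 3
MIB = 1024 ** 2

for name, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    size = params * bytes_per_param
    if size >= GIB:
        print(f"{name}: {size / GIB:.2f} GB")
    else:
        print(f"{name}: {size / MIB:.1f} MB")
# FP32: 3.12 GB, FP16: 1.56 GB, INT8: 799.5 MB, INT4: 399.8 MB
```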
🔑 KV Cache (Inference)
Per Token (FP16): 184.32 KB
Max Context (FP32): 360.0 MB
Max Context (FP16): 180.0 MB
Max Context (INT8): 90.0 MB
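The cache sizes follow from the usual per-token formula, 2 (keys and values) × layers × hidden size × bytes per element, scaled by the context length for the full-context figures; a small sketch that reproduces the numbers above:

```python
# KV-cache size: 2 (K and V) x layers x hidden size x bytes per element
# per token, then x context length for the full-context figures.
# Note: the per-token figure above is 1000-based KB, while the
# max-context figures are 1024-based MB.

n_layer, hidden, n_ctx = 36, 1_280, 1_024

def kv_bytes_per_token(bytes_per_elem):
    return 2 * n_layer * hidden * bytes_per_elem

print(f"Per token (FP16): {kv_bytes_per_token(2) / 1000:.2f} KB")   # 184.32 KB
for name, b in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    total = kv_bytes_per_token(b) * n_ctx
    print(f"Max context {name}: {total / 1024**2:.1f} MB")          # 360.0 / 180.0 / 90.0
```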
⚙️ Model Configuration

Core Architecture
Vocabulary Size: 50,257
Number of Layers: 36
Attention Heads: 20
FFN Intermediate Size: Not set (defaults to 4 × hidden size = 5,120)

Context & Position
Max Context Length: 1,024

Attention Configuration
Attention Dropout: 10.0%
Tied Embeddings: Yes

Activation & Normalization
Activation Function: gelu_new
LayerNorm Epsilon: 1e-05

Dropout (Training)
Residual Dropout: 10.0%
Embedding Dropout: 10.0%

Special Tokens
BOS Token ID: 50,256
EOS Token ID: 50,256
Pad Token ID: Not set

Data Type
Model Dtype: Not set
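A sketch of reading these fields programmatically, assuming the Hugging Face transformers package and the openai-community/gpt2-large repository (attribute names follow GPT2Config):

```python
# Pull the published config and print the fields listed above.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("openai-community/gpt2-large")

print(cfg.vocab_size)           # 50257
print(cfg.n_layer)              # 36
print(cfg.n_head)               # 20
print(cfg.n_embd)               # 1280
print(cfg.n_positions)          # 1024  (max context length)
print(cfg.n_inner)              # None -> 4 * n_embd is used for the FFN
print(cfg.activation_function)  # gelu_new
print(cfg.layer_norm_epsilon)   # 1e-05
print(cfg.attn_pdrop)           # 0.1
print(cfg.resid_pdrop)          # 0.1
print(cfg.embd_pdrop)           # 0.1
print(cfg.bos_token_id)         # 50256
print(cfg.eos_token_id)         # 50256
print(cfg.tie_word_embeddings)  # True
```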
Layer Types: Attention, MLP/FFN, Normalization, Embedding