openai-community/gpt2

📊 Model Parameters

Total Parameters 163,037,184
Context Length 1,024
Hidden Size 768
Layers 12
Attention Heads 12
KV Heads 12
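
Most of these figures can be read straight off the checkpoint's configuration; below is a minimal sketch using the transformers library. Note that the 163,037,184 total counts the output projection separately from the input embedding (50,257 × 768 ≈ 38.6M) even though the two are tied, which is why it exceeds the commonly cited ~124M unique weights for GPT-2 small.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("openai-community/gpt2")

print(config.vocab_size)   # 50257
print(config.n_positions)  # 1024 (context length)
print(config.n_embd)       # 768  (hidden size)
print(config.n_layer)      # 12
print(config.n_head)       # 12   (no grouped-query attention, so KV heads == attention heads)
```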

💾 Memory Requirements

FP32 (Full) 621.9 MB
FP16 (Half) 311.0 MB
INT8 (Quantized) 155.5 MB
INT4 (Quantized) 77.7 MB
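
The memory figures follow directly from the parameter count: bytes per parameter times total parameters, reported in binary megabytes (MiB) to match the table. A sketch of the arithmetic (pure Python, no dependencies):

```python
TOTAL_PARAMS = 163_037_184

# Bytes per parameter at each precision (INT4 packs two weights per byte).
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

for dtype, nbytes in BYTES_PER_PARAM.items():
    mib = TOTAL_PARAMS * nbytes / (1024 ** 2)
    print(f"{dtype}: {mib:.1f} MB")  # 621.9 / 311.0 / 155.5 / 77.7
```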

🔑 KV Cache (Inference)

Per Token (FP16) 36.86 KB
Max Context FP32 72.0 MB
Max Context FP16 36.0 MB
Max Context INT8 18.0 MB
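
The KV cache stores one key and one value vector per layer per token, so the per-token size is 2 × layers × hidden size × bytes per element. A sketch of the arithmetic behind the figures above (note the table reports the per-token size in decimal KB but the full-context sizes in binary MB):

```python
N_LAYERS, HIDDEN, CONTEXT = 12, 768, 1024

def kv_cache_bytes(n_tokens: int, bytes_per_elem: int) -> int:
    # One K and one V vector (each of width HIDDEN) per layer, per token.
    return 2 * N_LAYERS * HIDDEN * bytes_per_elem * n_tokens

print(kv_cache_bytes(1, 2) / 1000)              # 36.86 KB per token at FP16
print(kv_cache_bytes(CONTEXT, 4) / 1024 ** 2)   # 72.0 MB at FP32, full context
print(kv_cache_bytes(CONTEXT, 2) / 1024 ** 2)   # 36.0 MB at FP16
print(kv_cache_bytes(CONTEXT, 1) / 1024 ** 2)   # 18.0 MB at INT8
```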

⚙️ Model Configuration

Core Architecture

Vocabulary Size 50,257
Number of Layers 12
Attention Heads 12
FFN Intermediate Size Not set
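
"Not set" for the FFN intermediate size means the GPT-2 config leaves `n_inner` at None; in the transformers implementation the MLP then defaults to 4 × hidden size, i.e. 3,072 here. A quick check:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("openai-community/gpt2")
print(config.n_inner)                       # None -> reported as "Not set"
print(config.n_inner or 4 * config.n_embd)  # 3072, the effective FFN width
```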

Context & Position

Max Context Length 1,024

Attention Configuration

Attention Dropout 10.0%
Tied Embeddings Yes
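
"Tied embeddings" means the output projection (`lm_head`) shares its weight matrix with the input token embedding (`wte`), which is also why the unique-weight count is lower than the raw total above. A minimal sketch verifying the tie (downloads the full model):

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")

# With tied embeddings, the LM head and token embedding share one tensor.
print(model.lm_head.weight is model.transformer.wte.weight)  # True
print(model.config.tie_word_embeddings)                      # True
```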

Activation & Normalization

Activation Function gelu_new
LayerNorm Epsilon 1e-05
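
(GPT-2 uses standard LayerNorm, not RMSNorm; the epsilon above is the config's `layer_norm_epsilon`.) `gelu_new` is the tanh approximation of GELU from the original GPT-2 codebase, implemented in transformers as `NewGELUActivation`. For reference, the formula as a self-contained sketch:

```python
import math

def gelu_new(x: float) -> float:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))

print(gelu_new(1.0))  # ≈ 0.8412
```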

Dropout (Training)

Residual Dropout 10.0%
Embedding Dropout 10.0%
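
All three dropout rates apply only during training; switching the model to evaluation mode disables them, which is the standard setup for inference:

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")

model.train()  # dropout active: 10% on attention, residual, and embedding paths
model.eval()   # dropout disabled; use this mode for inference
```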

Special Tokens

BOS Token ID 50,256
EOS Token ID 50,256
Pad Token ID Not set
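
GPT-2 ships without a pad token (BOS and EOS share ID 50,256). For batched inference a common workaround is to reuse the EOS token as padding; a minimal sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

print(tokenizer.bos_token_id, tokenizer.eos_token_id)  # 50256 50256
print(tokenizer.pad_token_id)                          # None -> "Not set"

# Common workaround for batched inference: reuse EOS as the pad token.
tokenizer.pad_token = tokenizer.eos_token
```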

Data Type

Model Dtype Not set
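
An unset dtype means the checkpoint's config stores no `torch_dtype`, so transformers loads the weights in FP32 by default; pass `torch_dtype` explicitly to load in half precision instead. A sketch:

```python
import torch
from transformers import GPT2LMHeadModel

# No dtype is stored in the config, so the default load is FP32;
# requesting FP16 halves the footprint (see the memory table above).
model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2", torch_dtype=torch.float16)
print(model.dtype)  # torch.float16
```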

Layer Types: Attention, MLP/FFN, Normalization, Embedding