Model Architecture: EleutherAI/gpt-neo-2.7B

📊 Model Parameters

Total Parameters 2,779,965,440

Context Length 2,048

Hidden Size 2560

Layers 32

Attention Heads 20

KV Heads 20

FP32 (Full) 10.36 GB

FP16 (Half) 5.18 GB

INT8 (Quantized) 2.59 GB

INT4 (Quantized) 1.29 GB

Per Token (FP16) 327.68 KB

Max Context FP32 1.25 GB

Max Context FP16 640.0 MB

Max Context INT8 320.0 MB

Vocabulary Size50,257

Hidden Size2,560

Number of Layers32

Attention Heads20

FFN Intermediate SizeNot set

Max Context Length2,048

Sliding Window Size256

Attention Dropout0%

Tied EmbeddingsYes

Activation Functiongelu_new

RMSNorm Epsilon1e-05

Residual Dropout0%

Embedding Dropout0%

BOS Token ID50,256

EOS Token ID50256

Pad Token IDNot set

Model DtypeNot set

Layer Types:

Attention

MLP/FFN

Normalization

Embedding