EleutherAI/gpt-j-6b

📊 Model Parameters

Total Parameters 6,050,882,784
Context Length 2,048
Hidden Size 4,096
Layers 28
Attention Heads 16
KV Heads 16
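
For reference, the headline count can be rebuilt from these dimensions. The sketch below assumes the standard GPT-J layout (untied input embedding and LM head, bias-free attention projections, biased MLP layers, one LayerNorm per block plus a final one); those layout details come from the public GPT-J architecture, not from the table itself.

```python
# Reproducing the headline parameter count from the dimensions above.
vocab_size = 50_400
hidden = 4_096
layers = 28
ffn = 4 * hidden  # n_inner is unset and falls back to 4 x hidden = 16,384

embedding = vocab_size * hidden                 # input token embedding
lm_head = vocab_size * hidden + vocab_size      # output projection (+ bias)

attn = 4 * hidden * hidden                      # Q, K, V, out (no biases)
mlp = 2 * hidden * ffn + ffn + hidden           # fc_in/fc_out (+ biases)
norm = 2 * hidden                               # LayerNorm weight + bias

total = embedding + lm_head + layers * (attn + mlp + norm) + norm
print(f"{total:,}")  # 6,050,882,784
```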

💾 Memory Requirements

FP32 (Full) 22.54 GiB
FP16 (Half) 11.27 GiB
INT8 (Quantized) 5.64 GiB
INT4 (Quantized) 2.82 GiB
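
These figures are just the parameter count multiplied by bytes per parameter, reported in binary units (1 GiB = 2^30 bytes). A quick sanity check; note that real deployments need extra headroom for activations, the KV cache, and framework overhead:

```python
# Weight memory = parameter count x bytes per parameter.
params = 6_050_882_784
bytes_per_param = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

for dtype, nbytes in bytes_per_param.items():
    print(f"{dtype}: {params * nbytes / 2**30:.2f} GiB")
# FP32: 22.54 GiB   FP16: 11.27 GiB   INT8: 5.64 GiB   INT4: 2.82 GiB
```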

🔑 KV Cache (Inference)

Per Token (FP16) 448 KiB
Max Context FP32 1.75 GiB
Max Context FP16 896.0 MiB
Max Context INT8 448.0 MiB
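
The per-token figure follows from the architecture: each of the 28 layers caches one key and one value vector of hidden-size width per token (full width, since KV heads equal attention heads here). A small sketch reproducing the table:

```python
# KV cache sizing for GPT-J-6B.
layers, hidden, context = 28, 4_096, 2_048

def kv_cache_bytes(tokens: int, bytes_per_elem: int) -> int:
    # 2 tensors (K and V) x layers x hidden width x element size x tokens
    return 2 * layers * hidden * bytes_per_elem * tokens

print(kv_cache_bytes(1, 2) / 2**10, "KiB/token @ FP16")   # 448.0
print(kv_cache_bytes(context, 4) / 2**30, "GiB @ FP32")   # 1.75
print(kv_cache_bytes(context, 2) / 2**20, "MiB @ FP16")   # 896.0
print(kv_cache_bytes(context, 1) / 2**20, "MiB @ INT8")   # 448.0
```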

⚙️ Model Configuration

Core Architecture

Vocabulary Size 50,400
Number of Layers 28
Attention Heads 16
FFN Intermediate Size Not set (transformers falls back to 4 × hidden size = 16,384)

Context & Position

Max Context Length 2,048

Attention Configuration

Attention Dropout 0%
Tied Embeddings No

Activation & Normalization

Activation Function gelu_new
LayerNorm Epsilon 1e-05
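
For the curious, gelu_new is the tanh approximation of GELU (the function transformers ships as NewGELUActivation); the standalone version below is illustrative only:

```python
import math

def gelu_new(x: float) -> float:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

print(gelu_new(1.0))  # ~0.8412, close to exact GELU(1) ~ 0.8413
```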

Dropout (Training)

Residual Dropout 0%
Embedding Dropout 0%

Special Tokens

BOS Token ID 50,256
EOS Token ID 50,256
Pad Token ID Not set

Data Type

Model Dtype Not set

Layer Types

Attention
MLP/FFN
Normalization
Embedding
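
Most of the values in this section can be read straight from the model's published config; the snippet below is a minimal sketch using the transformers AutoConfig API (field names follow GPTJConfig, and the first call downloads the config from the Hub):

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("EleutherAI/gpt-j-6b")
print(cfg.vocab_size)            # 50400
print(cfg.n_layer, cfg.n_head)   # 28 16
print(cfg.n_positions)           # 2048
print(cfg.n_inner)               # None -> falls back to 4 * n_embd
print(cfg.activation_function)   # gelu_new
print(cfg.layer_norm_epsilon)    # 1e-05
print(cfg.bos_token_id, cfg.eos_token_id)  # 50256 50256
print(cfg.tie_word_embeddings)   # False
```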