google-t5/t5-large

📊 Model Parameters

Total Parameters 803,466,240
Context Length 2,048 (assumed for KV-cache sizing; T5 uses relative position embeddings and was trained with 512-token inputs)
Hidden Size 1024
Layers 24
Attention Heads 16
KV Heads 16
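The total above can be reconstructed from the architecture figures on this card. A minimal sketch, assuming T5's standard structure (weight-only RMSNorm, bias-free q/k/v/o projections, and a relative-position bias table in the first encoder and decoder layer; the 32-bucket size is an assumed T5 default, not stated on this card). The 803,466,240 figure matches a count in which the shared embedding matrix is tallied once each for the encoder input, decoder input, and LM head:

```python
# Parameter-count sketch for T5-large from the config values on this card.
d_model, d_ff, n_layers, n_heads, vocab = 1024, 4096, 24, 16, 32_128
rel_buckets = 32  # assumption: T5's default relative_attention_num_buckets

embed = vocab * d_model                    # shared embedding matrix
attn  = 4 * d_model * d_model              # q, k, v, o projections
ffn   = 2 * d_model * d_ff                 # wi + wo
enc_layer = attn + ffn + 2 * d_model       # two RMSNorm weights per layer
dec_layer = 2 * attn + ffn + 3 * d_model   # self-attn + cross-attn, three norms
encoder = n_layers * enc_layer + rel_buckets * n_heads + d_model  # + final norm
decoder = n_layers * dec_layer + rel_buckets * n_heads + d_model

tied   = embed + encoder + decoder  # embedding weights stored once (tied)
untied = tied + 2 * embed           # embedding counted per use (enc/dec/LM head)
print(f"tied: {tied:,}  untied: {untied:,}")
```

Under these assumptions the untied count reproduces the 803,466,240 total exactly, while the number of unique weights with tied embeddings comes out to 737,668,096 (~737.7M).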

💾 Memory Requirements

FP32 (Full) 2.99 GB
FP16 (Half) 1.50 GB
INT8 (Quantized) 766.2 MB
INT4 (Quantized) 383.1 MB
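The weight-memory figures follow directly from the parameter count: bytes per parameter times total parameters, reported in binary units (GB = 2^30 bytes, MB = 2^20 bytes). A quick check:

```python
# Reproduce the weight-memory table from the parameter count above.
params = 803_466_240
for name, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    total_bytes = params * bytes_per_param
    if total_bytes >= 2**30:
        print(f"{name}: {total_bytes / 2**30:.2f} GB")   # binary GB (GiB)
    else:
        print(f"{name}: {total_bytes / 2**20:.1f} MB")   # binary MB (MiB)
```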

🔑 KV Cache (Inference)

Per Token (FP16) 98.30 KB
Max Context FP32 384.0 MB
Max Context FP16 192.0 MB
Max Context INT8 96.0 MB
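The per-token figure is the size of one key vector and one value vector per layer: 2 × layers × hidden_size × 2 bytes in FP16. Note the mixed units: the per-token value is in decimal KB (1,000 bytes), while the max-context totals use binary MB (2^20 bytes) at the 2,048-token length from the top of this card. A sketch:

```python
# Reproduce the KV-cache figures from the layer count and hidden size above.
n_layers, d_model, context = 24, 1024, 2048

per_token_fp16 = 2 * n_layers * d_model * 2          # K + V, 2 bytes each
print(f"per token: {per_token_fp16 / 1000:.2f} KB")  # decimal KB

for name, bytes_per_val in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    total = 2 * n_layers * d_model * bytes_per_val * context
    print(f"max context {name}: {total / 2**20:.1f} MB")  # binary MB
```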

⚙️ Model Configuration

Core Architecture

Vocabulary Size 32,128
FFN Intermediate Size 4,096
Number of Layers 24
Attention Heads 16

Context & Position

Max Context Length 512

Attention Configuration

Tied Embeddings Yes

Activation & Normalization

RMSNorm Epsilon 1e-06
Activation Function relu

Dropout (Training)

Hidden Dropout 10.0%

Special Tokens

BOS Token ID Not set
Pad Token ID 0
EOS Token ID 1

Data Type

Model Dtype Not set