google-t5/t5-small

📊 Model Parameters

Total Parameters 93,405,696
Context Length 2,048
Hidden Size 512
Layers 6
Attention Heads 8
KV Heads 8

💾 Memory Requirements

FP32 (Full) 356.3 MB
FP16 (Half) 178.2 MB
INT8 (Quantized) 89.1 MB
INT4 (Quantized) 44.5 MB
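The memory figures above follow directly from the parameter count: parameters times bytes per parameter, reported in MiB (1 MiB = 1024² bytes). A minimal sketch reproducing the table:

```python
# Rough weight-memory footprint: parameter count x bytes per parameter,
# reported in MiB (1 MiB = 1024**2 bytes), matching the table above.
PARAMS = 93_405_696  # total parameters from this card

def footprint_mib(params: int, bytes_per_param: float) -> float:
    """Weight memory in MiB for a given per-parameter width."""
    return params * bytes_per_param / 1024**2

fp32 = footprint_mib(PARAMS, 4)    # ~356.3 MiB
fp16 = footprint_mib(PARAMS, 2)    # ~178.2 MiB
int8 = footprint_mib(PARAMS, 1)    # ~89.1 MiB
int4 = footprint_mib(PARAMS, 0.5)  # ~44.5 MiB
```

Note these are weights only; activations, optimizer state, and the KV cache (next section) come on top.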

🔑 KV Cache (Inference)

Per Token (FP16) 12.29 KB
Max Context FP32 48.0 MB
Max Context FP16 24.0 MB
Max Context INT8 12.0 MB
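The cache figures above are consistent with a simple model: one key and one value vector of size hidden_size cached per layer per token, over 6 layers and a 2,048-token context. This is an assumption inferred from the numbers (the per-token figure appears to use decimal kB, the totals binary MiB), not a statement about the tool's exact method:

```python
# Hedged sketch of the KV-cache figures above. Assumes one K and one V
# vector of size hidden_size cached per layer per token (6 layers,
# hidden size 512, 2,048-token context). Per-token figure in decimal kB,
# totals in binary MiB, matching the card's apparent unit choices.
LAYERS, HIDDEN, CONTEXT = 6, 512, 2048

def kv_bytes_per_token(bytes_per_value: int) -> int:
    """Cache bytes per token: K and V, each HIDDEN wide, per layer."""
    return 2 * LAYERS * HIDDEN * bytes_per_value

per_token_fp16_kb = kv_bytes_per_token(2) / 1000              # ~12.29 kB
max_ctx_fp32_mib = CONTEXT * kv_bytes_per_token(4) / 1024**2  # 48.0 MiB
max_ctx_fp16_mib = CONTEXT * kv_bytes_per_token(2) / 1024**2  # 24.0 MiB
max_ctx_int8_mib = CONTEXT * kv_bytes_per_token(1) / 1024**2  # 12.0 MiB
```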

⚙️ Model Configuration

Core Architecture

Vocabulary Size 32,128
FFN Intermediate Size 2,048
Number of Layers 6
Attention Heads 8

Context & Position

Max Context Length 512

Attention Configuration

Tied Embeddings Yes

Activation & Normalization

RMSNorm Epsilon 1e-06
Activation Function relu

Dropout (Training)

Hidden Dropout 10.0%

Special Tokens

BOS Token ID Not set
Pad Token ID 0
EOS Token ID 1

Data Type

Model Dtype Not set
Layer Types: Attention, MLP/FFN, Normalization, Embedding