arcee-ai/Trinity-Large-TrueBase

📊 Model Parameters

Total Parameters 398,635,286,016
Context Length 8,192
Hidden Size 3072
Layers 60
Attention Heads 48
KV Heads 8

💾 Memory Requirements

FP32 (Full) 1485.03 GiB
FP16 (Half) 742.52 GiB
INT8 (Quantized) 371.26 GiB
INT4 (Quantized) 185.63 GiB
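
The figures above follow directly from the parameter count: weight memory is simply parameters × bytes per parameter. A quick sketch of the arithmetic (using binary gibibytes, 2^30 bytes):

```python
# Weight-memory estimate: total parameters x bytes per parameter.
TOTAL_PARAMS = 398_635_286_016
GIB = 1024**3  # binary gibibyte

for name, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    print(f"{name}: {TOTAL_PARAMS * bytes_per_param / GIB:.2f} GiB")
# FP32: 1485.03, FP16: 742.52, INT8: 371.26, INT4: 185.63
```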

🔑 KV Cache (Inference)

Per Token (FP16) 245.76 KB
Max Context FP32 3.75 GiB
Max Context FP16 1.88 GiB
Max Context INT8 960.0 MiB
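
These cache sizes can be reproduced from the attention configuration (60 layers, 8 KV heads, head dimension 128): each token stores one key vector and one value vector per layer and KV head.

```python
# KV-cache size: keys + values for every layer, KV head, and head dimension.
LAYERS, KV_HEADS, HEAD_DIM, MAX_CTX = 60, 8, 128, 8192

def kv_cache_bytes(tokens: int, bytes_per_elem: int) -> int:
    # Factor 2 = one key vector + one value vector.
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_elem * tokens

print(kv_cache_bytes(1, 2) / 1000)           # per token, FP16: 245.76 KB
print(kv_cache_bytes(MAX_CTX, 4) / 1024**3)  # full context, FP32: 3.75 GiB
print(kv_cache_bytes(MAX_CTX, 1) / 1024**2)  # full context, INT8: 960.0 MiB
```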

⚙️ Model Configuration

Core Architecture

Vocabulary Size 200,192
Hidden Size 3,072
FFN Intermediate Size 12,288
Number of Layers 60
Attention Heads 48
Head Dimension 128
KV Heads 8
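
The head counts imply grouped-query attention (GQA): 48 query heads share 8 KV heads, so each KV head serves a group of 6 query heads, shrinking the KV cache 6× relative to full multi-head attention.

```python
# GQA grouping implied by the head counts above.
ATTN_HEADS, KV_HEADS = 48, 8

assert ATTN_HEADS % KV_HEADS == 0  # query heads divide evenly into KV groups
group_size = ATTN_HEADS // KV_HEADS
print(group_size)  # 6 query heads per KV head; KV cache is 1/6 of full MHA
```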

Context & Position

Max Context Length 8,192
RoPE Base Frequency 10,000
Sliding Window Size 4,096
Layer Attention Types [60 items]
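
With a 4,096-token sliding window (applied per layer according to the [60 items] attention-type list above), a query can only attend to keys that are both causal and within the window. A minimal sketch of that visibility rule, under the common "causal + fixed window" definition:

```python
WINDOW = 4096

def can_attend(query_pos: int, key_pos: int, window: int = WINDOW) -> bool:
    """Causal sliding-window mask: the key must not lie in the future,
    and must fall within `window` positions of the query."""
    return 0 <= query_pos - key_pos < window

print(can_attend(5000, 4999))  # True: recent past token
print(can_attend(5000, 500))   # False: outside the 4096-token window
print(can_attend(500, 501))    # False: future token
```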

Attention Configuration

Attention Bias No
Attention Dropout 0%
Tied Embeddings No

Mixture of Experts

Expert FFN Size 3,072
Experts per Token 4
Number of Experts 256
Expert Groups 1
Groups per Token 1
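
With 256 experts but only 4 routed per token, just 1/64 of the expert FFN weights are active for any given token. As a hedged sketch of the per-expert size, assuming a SwiGLU-style expert with gate/up/down projections (an assumption; the expert layout is not confirmed by this dump):

```python
HIDDEN, EXPERT_FFN = 3072, 3072
N_EXPERTS, TOP_K = 256, 4

# Assumption: each expert is a SwiGLU FFN (gate, up, down projections).
params_per_expert = 3 * HIDDEN * EXPERT_FFN
print(params_per_expert)   # 28,311,552 parameters per expert
print(TOP_K / N_EXPERTS)   # 0.015625 -> 1/64 of expert weights active per token
```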

Activation & Normalization

Activation Function silu
RMSNorm Epsilon 1e-05
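
Both functions named here are standard; a minimal pure-Python sketch of SiLU and RMSNorm using this config's epsilon:

```python
import math

def silu(x: float) -> float:
    # SiLU ("swish"): x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def rms_norm(vec, weight, eps: float = 1e-05):
    # Scale by the reciprocal root-mean-square, then by a learned weight.
    rms = math.sqrt(sum(v * v for v in vec) / len(vec) + eps)
    return [w * v / rms for w, v in zip(weight, vec)]

print(silu(0.0))                            # 0.0
print(rms_norm([2.0] * 4, [1.0] * 4)[0])    # ~1.0 (unit RMS after normalization)
```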

Special Tokens

EOS Token ID Not set
Pad Token ID Not set
BOS Token ID Not set

Data Type

Model Dtype bfloat16