Allen AI OLMo - fully open language models released with their training data and code:
OLMo-2-1124-7B
OLMo-2-1124-13B
Olmo-3-1025-7B
Olmo-3-1125-32B
allenai/OLMo-2-1124-13B
📊 Model Parameters
Total Parameters: 13,716,198,400
Context Length: 4,096
Hidden Size: 5,120
Layers: 40
Attention Heads: 40
KV Heads: 40
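The total above can be reproduced from the architecture numbers. The Python sketch below assumes the standard OLMo-2 layer layout (untied input/output embeddings, four bias-free attention projections, a gated SiLU MLP, and four RMSNorm weight vectors per layer plus a final norm); that breakdown is an inference from the public OLMo-2 architecture, not something stated on this page.

# Recompute the parameter count from the configuration values.
# Layer layout is an assumption (standard OLMo-2 block), not from the page.
vocab, hidden, ffn, layers = 100_352, 5_120, 13_824, 40
embeddings = 2 * vocab * hidden      # input embedding + untied LM head
attention = 4 * hidden * hidden      # Q, K, V, O projections, no bias
mlp = 3 * hidden * ffn               # gate, up, down projections
norms = 4 * hidden                   # per-layer RMSNorm weights
total = embeddings + layers * (attention + mlp + norms) + hidden  # + final norm
print(f"{total:,}")                  # 13,716,198,400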
💾 Memory Requirements (weights only)
FP32 (full precision): 51.10 GB
FP16 (half precision): 25.55 GB
INT8 (quantized): 12.77 GB
INT4 (quantized): 6.39 GB
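These figures are weight storage alone: parameter count times bytes per parameter, converted with 1024³ (so the values are GiB, labelled GB above). A minimal sketch of the arithmetic:

# Weight-only memory at each precision; activations and KV cache are extra.
params = 13_716_198_400
for name, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    print(f"{name}: {params * bytes_per_param / 1024**3:.2f} GiB")
# FP32: 51.10  FP16: 25.55  INT8: 12.77  INT4: 6.39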
🔑 KV Cache (Inference)
Per Token (FP16): 819.20 KB
Max Context (4,096 tokens), FP32: 6.25 GB
Max Context, FP16: 3.12 GB
Max Context, INT8: 1.56 GB
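The KV-cache numbers follow from the attention geometry: each token stores one key and one value vector per layer, each of width kv_heads × head_dim (here 40 × 128 = 5,120). The per-token figure above is in decimal KB, the full-context figures in GiB. A sketch:

# KV cache: 2 tensors (K and V) per layer, each kv_heads * head_dim wide.
layers, kv_heads, head_dim, context = 40, 40, 128, 4_096
per_token_fp16 = 2 * layers * kv_heads * head_dim * 2   # bytes at FP16
print(per_token_fp16 / 1e3, "KB")                        # 819.2 KB
for name, bytes_per_elem in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    full = context * 2 * layers * kv_heads * head_dim * bytes_per_elem
    print(f"{name}: {full / 1024**3:.2f} GiB")           # 6.25 / 3.12 / 1.56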
⚙️ Model Configuration

Core Architecture
Vocabulary Size: 100,352
Hidden Size: 5,120
FFN Intermediate Size: 13,824
Number of Layers: 40
Attention Heads: 40
KV Heads: 40

Context & Position
Max Context Length: 4,096
RoPE Base Frequency: 500,000
RoPE Scaling: Not set

Attention Configuration
Tied Embeddings: No
Attention Bias: No
Attention Dropout: 0%

Activation & Normalization
Activation Function: silu
RMSNorm Epsilon: 1e-06

Special Tokens
BOS Token ID: Not set
Pad Token ID: 100,277
EOS Token ID: 100,257

Data Type
Model Dtype: float32
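Every value in this section can be read directly from the published checkpoint config. A minimal sketch using Hugging Face transformers (assumes a release with OLMo-2 support; the attribute names are the standard config fields):

# Inspect the config shipped with allenai/OLMo-2-1124-13B.
from transformers import AutoConfig
cfg = AutoConfig.from_pretrained("allenai/OLMo-2-1124-13B")
print(cfg.vocab_size, cfg.hidden_size, cfg.intermediate_size)    # 100352 5120 13824
print(cfg.num_hidden_layers, cfg.num_attention_heads, cfg.num_key_value_heads)
print(cfg.max_position_embeddings, cfg.rope_theta, cfg.rms_norm_eps)
print(cfg.tie_word_embeddings, cfg.pad_token_id, cfg.eos_token_id)
print(cfg.torch_dtype)                                           # float32, per the table above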
Layer Types: Attention, MLP/FFN, Normalization, Embedding