Qwen - Alibaba Cloud's multilingual LLM series:
Qwen1.5-0.5B
Qwen1.5-1.8B
Qwen1.5-4B
Qwen1.5-7B
Qwen1.5-14B
Qwen1.5-32B
Qwen1.5-72B
Qwen1.5-110B
Qwen1.5-MoE-A2.7B
Qwen2-0.5B
Qwen2-1.5B
Qwen2-7B
Qwen2-57B-A14B
Qwen2-72B
Qwen2.5-0.5B
Qwen2.5-1.5B
Qwen2.5-3B
Qwen2.5-7B
Qwen2.5-14B
Qwen2.5-32B
Qwen2.5-72B
Qwen3-0.6B
Qwen3-1.7B
Qwen3-4B
Qwen3-8B
Qwen3-14B
Qwen3-32B
Qwen3-30B-A3B
Qwen3-235B-A22B
Qwen/Qwen3-235B-A22B
📊 Model Parameters
Total Parameters: 235,093,634,560
Context Length: 40,960
Hidden Size: 4,096
Layers: 94
Attention Heads: 64
KV Heads: 4
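Every figure in this table comes from the model's published configuration. A minimal sketch for reading it with Hugging Face transformers; the attribute names below follow the standard Qwen3-MoE config, but treat the exact names as assumptions on older transformers versions:

```python
from transformers import AutoConfig

# Pull the published configuration for the model above.
cfg = AutoConfig.from_pretrained("Qwen/Qwen3-235B-A22B")

print(cfg.vocab_size)               # 151936
print(cfg.hidden_size)              # 4096
print(cfg.num_hidden_layers)        # 94
print(cfg.num_attention_heads)      # 64
print(cfg.num_key_value_heads)      # 4
print(cfg.max_position_embeddings)  # 40960
```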
💾 Memory Requirements (weights only)
FP32 (Full): 875.79 GB
FP16 (Half): 437.90 GB
INT8 (Quantized): 218.95 GB
INT4 (Quantized): 109.47 GB
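These figures are straight weights-only arithmetic: total parameters times bytes per parameter, reported in binary gigabytes. Activations, KV cache, and framework overhead come on top. A sketch reproducing the table:

```python
PARAMS = 235_093_634_560  # total parameter count from above

# Bytes per parameter at each precision (INT4 packs two weights per byte).
for name, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{name}: {gib:,.2f} GB")
# FP32: 875.79 GB, FP16: 437.90 GB, INT8: 218.95 GB, INT4: 109.47 GB
```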
🔑 KV Cache (Inference)
Per Token (FP16): 192.51 KB
Max Context FP32: 14.69 GB
Max Context FP16: 7.34 GB
Max Context INT8: 3.67 GB
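These numbers follow from the attention geometry above: every layer caches one key and one value vector per KV head per token, so with 94 layers, 4 KV heads, and head dimension 128 the per-token cost is 2 x 94 x 4 x 128 elements. A sketch of the calculation:

```python
LAYERS, KV_HEADS, HEAD_DIM, CONTEXT = 94, 4, 128, 40_960

def kv_cache_bytes(tokens: int, bytes_per_elem: float) -> float:
    # 2x for keys and values, cached in every layer.
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_elem * tokens

print(kv_cache_bytes(1, 2) / 1e3)          # 192.51 KB per token at FP16
print(kv_cache_bytes(CONTEXT, 4) / 2**30)  # ~14.69 GB, full context at FP32
print(kv_cache_bytes(CONTEXT, 2) / 2**30)  # ~7.34 GB at FP16
print(kv_cache_bytes(CONTEXT, 1) / 2**30)  # ~3.67 GB at INT8
```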
⚙️ Model Configuration

Core Architecture
Vocabulary Size: 151,936
Hidden Size: 4,096
FFN Intermediate Size: 12,288
Number of Layers: 94
Attention Heads: 64
KV Heads: 4
Head Dimension: 128
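With 64 query heads sharing only 4 KV heads, attention here is grouped-query attention (GQA): each KV head serves a group of 16 query heads, which is what cuts the KV cache by 16x versus full multi-head attention. A shape-level PyTorch sketch, illustrative rather than the model's actual code:

```python
import torch
import torch.nn.functional as F

batch, seq, heads, kv_heads, head_dim = 1, 8, 64, 4, 128

q = torch.randn(batch, heads, seq, head_dim)
k = torch.randn(batch, kv_heads, seq, head_dim)  # only 4 KV heads are cached
v = torch.randn(batch, kv_heads, seq, head_dim)

# Each KV head is shared by heads // kv_heads = 16 query heads:
# repeat K/V along the head axis so the shapes line up.
k = k.repeat_interleave(heads // kv_heads, dim=1)  # -> (1, 64, 8, 128)
v = v.repeat_interleave(heads // kv_heads, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 64, 8, 128])
```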
Context & Position
Max Context Length: 40,960
Uses Sliding Window: No
Sliding Window Size: Not set
RoPE Base Frequency: 1,000,000
RoPE Scaling: Not set
Window Attention Layers: 94 (sliding window disabled, so all layers use full attention)
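The RoPE base of 1,000,000 controls how slowly the rotary frequencies decay across the 128 head-dimension channels; a large base stretches the longest wavelengths so positions stay distinguishable across the 40,960-token window. A minimal sketch of the frequency schedule:

```python
import numpy as np

HEAD_DIM, ROPE_BASE = 128, 1_000_000.0

# One inverse frequency per pair of head-dim channels.
inv_freq = 1.0 / ROPE_BASE ** (np.arange(0, HEAD_DIM, 2) / HEAD_DIM)

# The rotation angle for position p on channel pair i is p * inv_freq[i];
# cos/sin of these angles are applied to the Q and K vectors.
positions = np.arange(40_960)
angles = np.outer(positions, inv_freq)  # shape (40960, 64)
print(inv_freq[0], inv_freq[-1])        # 1.0 down to ~1.2e-06
```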
Attention Configuration
Attention Bias: No
Attention Dropout: 0%
Tied Embeddings: No
Mixture of Experts
MoE Layer Frequency: 1 (every layer is an MoE layer)
Expert FFN Size: 1,536
Experts per Token: 8
Number of Experts: 128
Normalize TopK Probabilities: Yes
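Reading the MoE rows together: a linear gate scores all 128 experts for each token, the top 8 are kept, and, because Normalize TopK Probabilities is Yes, their softmax weights are renormalized to sum to 1 before the selected experts' FFN outputs (each with intermediate size 1,536) are mixed. A toy sketch of that gating logic, not the production kernel:

```python
import torch

hidden, n_experts, top_k = 4096, 128, 8
gate = torch.nn.Linear(hidden, n_experts, bias=False)

x = torch.randn(3, hidden)                        # 3 tokens
scores = torch.softmax(gate(x), dim=-1)           # (3, 128) routing probs
weights, expert_ids = scores.topk(top_k, dim=-1)  # keep the best 8 per token

# Normalize TopK Probabilities = Yes: rescale kept weights to sum to 1.
weights = weights / weights.sum(dim=-1, keepdim=True)

# Each token's output is the weighted sum of its 8 experts' FFN outputs.
print(expert_ids[0])     # 8 selected expert ids for the first token
print(weights[0].sum())  # ~1.0
```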
Activation & Normalization
Activation Function: silu
RMSNorm Epsilon: 1e-06
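RMSNorm with eps = 1e-06 rescales each hidden vector by its root-mean-square instead of mean-centering as LayerNorm does, and silu(x) = x * sigmoid(x) is the gate activation inside the FFN. A minimal RMSNorm sketch:

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Scale by the inverse root-mean-square of the last dim; no mean subtraction.
    rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * rms * weight

x = torch.randn(2, 4096)
w = torch.ones(4096)         # the real model learns this per-channel scale
print(rms_norm(x, w).shape)  # torch.Size([2, 4096])
```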
Special Tokens
BOS Token ID: 151,643
Pad Token ID: Not set
EOS Token ID: 151,645

Data Type
Model Dtype: bfloat16
Layer Types: Attention, MLP/FFN, Normalization, Embedding