🔍 Transformer Architecture Visualizer
Interactive visualizations of transformer model architectures
EleutherAI - Non-profit AI research lab focused on open-source LLMs
gpt-j-6b
EleutherAI/gpt-j-6b
✓ Ready
gpt-neo-125m
EleutherAI/gpt-neo-125m
✓ Ready
gpt-neo-1.3B
EleutherAI/gpt-neo-1.3B
✓ Ready
gpt-neo-2.7B
EleutherAI/gpt-neo-2.7B
✓ Ready
gpt-neox-20b
EleutherAI/gpt-neox-20b
✓ Ready
comma-v0.1-2t
common-pile/comma-v0.1-2t
✓ Ready
OpenAI - GPT (Generative Pre-trained Transformer) and Whisper models
whisper-base
openai/whisper-base
✓ Ready
whisper-tiny
openai/whisper-tiny
✓ Ready
whisper-small
openai/whisper-small
✓ Ready
whisper-medium
openai/whisper-medium
✓ Ready
whisper-large-v2
openai/whisper-large-v2
✓ Ready
whisper-large-v3
openai/whisper-large-v3
✓ Ready
whisper-large-v3-turbo
openai/whisper-large-v3-turbo
✓ Ready
openai-gpt
openai-community/openai-gpt
✓ Ready
gpt2
openai-community/gpt2
✓ Ready
gpt2-medium
openai-community/gpt2-medium
✓ Ready
gpt2-large
openai-community/gpt2-large
✓ Ready
gpt2-xl
openai-community/gpt2-xl
✓ Ready
gpt-oss-20b
openai/gpt-oss-20b
✓ Ready
gpt-oss-120b
openai/gpt-oss-120b
✓ Ready
Llama - Meta's large language models (LLaMA: Large Language Model Meta AI)
llama-65b
huggyllama/llama-65b
✓ Ready
llama-30b
huggyllama/llama-30b
✓ Ready
llama-13b
huggyllama/llama-13b
✓ Ready
llama-7b
huggyllama/llama-7b
✓ Ready
Llama-2-7b-hf
meta-llama/Llama-2-7b-hf
✓ Ready
Llama-2-13b-hf
meta-llama/Llama-2-13b-hf
✓ Ready
Llama-2-70b-hf
meta-llama/Llama-2-70b-hf
✓ Ready
Meta-Llama-3-8B
meta-llama/Meta-Llama-3-8B
✓ Ready
Meta-Llama-3-70B
meta-llama/Meta-Llama-3-70B
✓ Ready
Llama-3.1-405B
meta-llama/Llama-3.1-405B
✓ Ready
T5 - Google's Text-to-Text Transfer Transformer
t5-small
google-t5/t5-small
✓ Ready
t5-base
google-t5/t5-base
✓ Ready
t5-large
google-t5/t5-large
✓ Ready
Gemma - Google's lightweight open models built from Gemini research
gemma-2b
google/gemma-2b
✓ Ready
gemma-7b
google/gemma-7b
✓ Ready
gemma-2-2b
google/gemma-2-2b
✓ Ready
gemma-2-9b
google/gemma-2-9b
✓ Ready
gemma-2-27b
google/gemma-2-27b
✓ Ready
gemma-3-270m
google/gemma-3-270m
✓ Ready
gemma-3-1b-pt
google/gemma-3-1b-pt
✓ Ready
gemma-3-4b-pt
google/gemma-3-4b-pt
✓ Ready
gemma-3-12b-pt
google/gemma-3-12b-pt
✓ Ready
gemma-3-27b-pt
google/gemma-3-27b-pt
✓ Ready
Qwen - Alibaba Cloud's multilingual LLM series
Qwen1.5-0.5B
Qwen/Qwen1.5-0.5B
✓ Ready
Qwen1.5-1.8B
Qwen/Qwen1.5-1.8B
✓ Ready
Qwen1.5-4B
Qwen/Qwen1.5-4B
✓ Ready
Qwen1.5-7B
Qwen/Qwen1.5-7B
✓ Ready
Qwen1.5-14B
Qwen/Qwen1.5-14B
✓ Ready
Qwen1.5-32B
Qwen/Qwen1.5-32B
✓ Ready
Qwen1.5-72B
Qwen/Qwen1.5-72B
✓ Ready
Qwen1.5-110B
Qwen/Qwen1.5-110B
✓ Ready
Qwen1.5-MoE-A2.7B
Qwen/Qwen1.5-MoE-A2.7B
✓ Ready
Qwen2-0.5B
Qwen/Qwen2-0.5B
✓ Ready
Qwen2-1.5B
Qwen/Qwen2-1.5B
✓ Ready
Qwen2-7B
Qwen/Qwen2-7B
✓ Ready
Qwen2-57B-A14B
Qwen/Qwen2-57B-A14B
✓ Ready
Qwen2-72B
Qwen/Qwen2-72B
✓ Ready
Qwen2.5-0.5B
Qwen/Qwen2.5-0.5B
✓ Ready
Qwen2.5-1.5B
Qwen/Qwen2.5-1.5B
✓ Ready
Qwen2.5-3B
Qwen/Qwen2.5-3B
✓ Ready
Qwen2.5-7B
Qwen/Qwen2.5-7B
✓ Ready
Qwen2.5-14B
Qwen/Qwen2.5-14B
✓ Ready
Qwen2.5-32B
Qwen/Qwen2.5-32B
✓ Ready
Qwen2.5-72B
Qwen/Qwen2.5-72B
✓ Ready
Qwen3-0.6B
Qwen/Qwen3-0.6B
✓ Ready
Qwen3-1.7B
Qwen/Qwen3-1.7B
✓ Ready
Qwen3-4B
Qwen/Qwen3-4B
✓ Ready
Qwen3-8B
Qwen/Qwen3-8B
✓ Ready
Qwen3-14B
Qwen/Qwen3-14B
✓ Ready
Qwen3-32B
Qwen/Qwen3-32B
✓ Ready
Qwen3-30B-A3B
Qwen/Qwen3-30B-A3B
✓ Ready
Qwen3-235B-A22B
Qwen/Qwen3-235B-A22B
✓ Ready
DeepSeek - Chinese AI lab known for efficient MoE architectures
DeepSeek-V2
deepseek-ai/DeepSeek-V2
✓ Ready
DeepSeek-R1
deepseek-ai/DeepSeek-R1
✓ Ready
Other Chinese/Asian AI Labs
Seed-OSS-36B-Base
ByteDance-Seed/Seed-OSS-36B-Base
✓ Ready
Seed-OSS-36B-Base-woSyn
ByteDance-Seed/Seed-OSS-36B-Base-woSyn
✓ Ready
Kimi-K2-Thinking
moonshotai/Kimi-K2-Thinking
✓ Ready
LongCat-Flash-Chat
meituan-longcat/LongCat-Flash-Chat
✓ Ready
GLM-4.5
zai-org/GLM-4.5
✓ Ready
GLM-4.5-Air
zai-org/GLM-4.5-Air
✓ Ready
GLM-4.6
zai-org/GLM-4.6
✓ Ready
GLM-4.6V
zai-org/GLM-4.6V
✓ Ready
GLM-4.6V-Flash
zai-org/GLM-4.6V-Flash
✓ Ready
dots.llm1.base
rednote-hilab/dots.llm1.base
✓ Ready
Microsoft Phi - Small language models with strong reasoning
phi-1
microsoft/phi-1
✓ Ready
phi-1_5
microsoft/phi-1_5
✓ Ready
phi-2
microsoft/phi-2
✓ Ready
Phi-3-mini-4k-instruct
microsoft/Phi-3-mini-4k-instruct
✓ Ready
Phi-4
microsoft/Phi-4
✓ Ready
Allen AI OLMo - Fully open language models with released training data and code
OLMo-2-1124-7B
allenai/OLMo-2-1124-7B
✓ Ready
OLMo-2-1124-13B
allenai/OLMo-2-1124-13B
✓ Ready
Olmo-3-1025-7B
allenai/Olmo-3-1025-7B
✓ Ready
Olmo-3-1125-32B
allenai/Olmo-3-1125-32B
✓ Ready
Mistral - French AI startup known for efficient open models (founded 2023)
Mistral-7B-v0.1
mistralai/Mistral-7B-v0.1
✓ Ready
Mistral-Nemo-Base-2407
mistralai/Mistral-Nemo-Base-2407
✓ Ready
Codestral-22B-v0.1
mistralai/Codestral-22B-v0.1
✓ Ready
Mistral-Small-24B-Base-2501
mistralai/Mistral-Small-24B-Base-2501
✓ Ready
Ministral-8B-Instruct-2410
mistralai/Ministral-8B-Instruct-2410
✓ Ready
Mistral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411
✓ Ready
Mixtral-8x7B-v0.1
mistralai/Mixtral-8x7B-v0.1
✓ Ready
Mixtral-8x22B-v0.1
mistralai/Mixtral-8x22B-v0.1
✓ Ready
Mamba-Codestral-7B-v0.1
mistralai/Mamba-Codestral-7B-v0.1
✓ Ready
Pixtral-12B-Base-2409
mistralai/Pixtral-12B-Base-2409
✓ Ready
Voxtral-Mini-3B-2507
mistralai/Voxtral-Mini-3B-2507
✓ Ready
Voxtral-Small-24B-2507
mistralai/Voxtral-Small-24B-2507
✓ Ready
Mistral-Small-3.2-24B-Instruct-2506
mistralai/Mistral-Small-3.2-24B-Instruct-2506
✓ Ready
Ministral-3-3B-Base-2512
mistralai/Ministral-3-3B-Base-2512
✓ Ready
Ministral-3-8B-Base-2512
mistralai/Ministral-3-8B-Base-2512
✓ Ready
Ministral-3-14B-Base-2512
mistralai/Ministral-3-14B-Base-2512
✓ Ready
Mistral-Large-3-675B-Base-2512
mistralai/Mistral-Large-3-675B-Base-2512
✓ Ready
Devstral-Small-2-24B-Instruct-2512
mistralai/Devstral-Small-2-24B-Instruct-2512
✓ Ready
Devstral-2-123B-Instruct-2512
mistralai/Devstral-2-123B-Instruct-2512
✓ Ready
Grok - xAI (Elon Musk's AI company)
grok-2
xai-org/grok-2
✓ Ready
GigaChat
GigaChat3-10B-A1.8B
ai-sage/GigaChat3-10B-A1.8B
✓ Ready
📊 Statistics
Model Groups: 13
Total Models: 114
Ready to View: 114