Transformer Architecture Visualizer

EleutherAI - Non-profit AI research lab focused on open-source LLMs

GPT - OpenAI's Generative Pre-trained Transformers

whisper-base

openai/whisper-base

whisper-medium

openai/whisper-medium

✓ Ready

whisper-large-v2

openai/whisper-large-v2

✓ Ready

whisper-large-v3

openai/whisper-large-v3

✓ Ready

whisper-large-v3-turbo

openai/whisper-large-v3-turbo

✓ Ready

openai-gpt

openai-community/openai-gpt

✓ Ready

gpt2

openai-community/gpt2

✓ Ready

gpt2-medium

openai-community/gpt2-medium

✓ Ready

gpt2-large

openai-community/gpt2-large

✓ Ready

gpt2-xl

openai-community/gpt2-xl

✓ Ready

Llama - Meta's Large Language Model Meta AI

Llama-2-7b-hf

meta-llama/Llama-2-7b-hf

✓ Ready

Llama-2-13b-hf

meta-llama/Llama-2-13b-hf

✓ Ready

Llama-2-70b-hf

meta-llama/Llama-2-70b-hf

✓ Ready

Meta-Llama-3-8B

meta-llama/Meta-Llama-3-8B

✓ Ready

Meta-Llama-3-70B

meta-llama/Meta-Llama-3-70B

✓ Ready

Llama-3.1-405B

meta-llama/Llama-3.1-405B

✓ Ready

T5 - Google's Text-to-Text Transfer Transformer

✓ Ready

Gemma - Google's lightweight open models built from Gemini research

gemma-3-12b-pt

google/gemma-3-12b-pt

✓ Ready

gemma-3-27b-pt

google/gemma-3-27b-pt

✓ Ready

translategemma-4b-it

google/translategemma-4b-it

✓ Ready

translategemma-12b-it

google/translategemma-12b-it

✓ Ready

translategemma-27b-it

google/translategemma-27b-it

✓ Ready

Qwen - Alibaba Cloud's multilingual LLM series

Qwen1.5-MoE-A2.7B

Qwen/Qwen1.5-MoE-A2.7B

Qwen3-235B-A22B-Thinking-2507

Qwen/Qwen3-235B-A22B-Thinking-2507

✓ Ready

Qwen3-Next-80B-A3B-Thinking

Qwen/Qwen3-Next-80B-A3B-Thinking

✓ Ready

DeepSeek - Chinese AI lab known for efficient MoE architectures

DeepSeek-V2

deepseek-ai/DeepSeek-V2

✓ Ready

DeepSeek-R1

deepseek-ai/DeepSeek-R1

✓ Ready

Other Chinese/Asian AI Labs

Seed-OSS-36B-Base

ByteDance-Seed/Seed-OSS-36B-Base

✓ Ready

Seed-OSS-36B-Base-woSyn

ByteDance-Seed/Seed-OSS-36B-Base-woSyn

✓ Ready

Kimi-K2-Thinking

moonshotai/Kimi-K2-Thinking

LongCat-Flash-Chat

meituan-longcat/LongCat-Flash-Chat

GLM-4.6V-Flash

zai-org/GLM-4.6V-Flash

GLM-4.7-Flash

zai-org/GLM-4.7-Flash

✓ Ready

MiMo-7B-Base

XiaomiMiMo/MiMo-7B-Base

✓ Ready

dots.llm1.base

rednote-hilab/dots.llm1.base

MiniMax-M2.1

MiniMaxAI/MiniMax-M2.1

✓ Ready

LongCat-Flash-Thinking-2601

meituan-longcat/LongCat-Flash-Thinking-2601

✓ Ready

Microsoft Phi - Small language models with strong reasoning

Phi-3-mini-4k-instruct

microsoft/Phi-3-mini-4k-instruct

✓ Ready

Phi-4

microsoft/Phi-4

✓ Ready

Allen AI OLMo - Fully open language models with training data/code

OLMo-2-1124-7B

allenai/OLMo-2-1124-7B

✓ Ready

OLMo-2-1124-13B

allenai/OLMo-2-1124-13B

✓ Ready

Olmo-3-1025-7B

allenai/Olmo-3-1025-7B

✓ Ready

Olmo-3-1125-32B

allenai/Olmo-3-1125-32B

✓ Ready

Mistral - French AI startup known for efficient open models (founded 2023)

Mistral-7B-v0.1

mistralai/Mistral-7B-v0.1

✓ Ready

Mistral-Nemo-Base-2407

mistralai/Mistral-Nemo-Base-2407

✓ Ready

Codestral-22B-v0.1

mistralai/Codestral-22B-v0.1

✓ Ready

Mistral-Small-24B-Base-2501

mistralai/Mistral-Small-24B-Base-2501

✓ Ready

Ministral-8B-Instruct-2410

mistralai/Ministral-8B-Instruct-2410

✓ Ready

Mistral-Large-Instruct-2411

mistralai/Mistral-Large-Instruct-2411

✓ Ready

Mixtral-8x7B-v0.1

mistralai/Mixtral-8x7B-v0.1

✓ Ready

Mixtral-8x22B-v0.1

mistralai/Mixtral-8x22B-v0.1

✓ Ready

Mamba-Codestral-7B-v0.1

mistralai/Mamba-Codestral-7B-v0.1

✓ Ready

Pixtral-12B-Base-2409

mistralai/Pixtral-12B-Base-2409

✓ Ready

Voxtral-Mini-3B-2507

mistralai/Voxtral-Mini-3B-2507

✓ Ready

Voxtral-Small-24B-2507

mistralai/Voxtral-Small-24B-2507

✓ Ready

Mistral-Small-3.2-24B-Instruct-2506

mistralai/Mistral-Small-3.2-24B-Instruct-2506

✓ Ready

Ministral-3-3B-Base-2512

mistralai/Ministral-3-3B-Base-2512

✓ Ready

Ministral-3-8B-Base-2512

mistralai/Ministral-3-8B-Base-2512

✓ Ready

Ministral-3-14B-Base-2512

mistralai/Ministral-3-14B-Base-2512

✓ Ready

Mistral-Large-3-675B-Base-2512

mistralai/Mistral-Large-3-675B-Base-2512

✓ Ready

Devstral-Small-2-24B-Instruct-2512

mistralai/Devstral-Small-2-24B-Instruct-2512

✓ Ready

Devstral-2-123B-Instruct-2512

mistralai/Devstral-2-123B-Instruct-2512

✓ Ready

Grok - xAI (Elon Musk's AI company)

grok-2

xai-org/grok-2

✓ Ready

GigaChat

GigaChat3-10B-A1.8B

ai-sage/GigaChat3-10B-A1.8B

✓ Ready

architecture

Trinity-Nano-Base

arcee-ai/Trinity-Nano-Base

✓ Ready

Trinity-Mini

arcee-ai/Trinity-Mini

✓ Ready

Trinity-Large-TrueBase

arcee-ai/Trinity-Large-TrueBase

✓ Ready

StepFun

Step-3.5-Flash

stepfun-ai/Step-3.5-Flash

✓ Ready
