Community Project Profile
Show-o
A single-Transformer model for unified multimodal understanding and generation
- Organisation
- NUS Show Lab
- Group
- University / research
- Category
- Multimodal understanding and generation model
- Status
- Active research line
- Started
- 2024-08
- Language / Form
- Python / Models
- License
- Apache-2.0
- GitHub Stars
- 1,923
- Updated
- 2026-05-04
Show-o is a multimodal foundation-model line from NUS Show Lab: one Transformer handles both image understanding and image generation instead of splitting the two capabilities into separate systems.
What It Is
Show-o aims to unify multimodal understanding and generation. It places visual understanding, text-to-image generation, and related capabilities inside one model framework, reducing the split between "understanding models" and "generation models."
Show Lab later continued this line with Show-o2, extending the approach toward stronger generation and understanding.
AI Relevance
Multimodal models are moving from stitched systems toward unified architectures. Show-o’s question is direct: if one model can both understand and generate images, many interactive design, editing, visual QA, and content-production workflows become more natural.
That makes it an important direction in open multimodal research.
Singapore Relevance
Show-o places NUS Show Lab on the global map of open multimodal research. For Singapore, it is an example of a university lab exporting frontier models, not a government programme or enterprise application.
Future tracking should cover Show Lab’s model series, paper acceptances, Hugging Face usage, and whether the work turns into production tools.
Milestones
- 2024-08: Show-o repository created
- 2025-01: Show-o accepted to ICLR 2025
- 2025-09: Show-o2 accepted to NeurIPS 2025