ShowUI

A vision-language-action model for GUI agents and computer use

GitHub stars: 1.8k+
Paper: CVPR 2025
Direction: Computer Use

Institution: NUS Show Lab
Group: University / research
Category: GUI agent model
Status: Active research line
Launch: 2024-10
Language / Form: Python / Models
License: Apache-2.0
GitHub Stars: 1,869
Last updated: 2026-05-04

ShowUI is an open model for GUI agents: it reads an interface from a screenshot and outputs click coordinates or actions.

Description

ShowUI focuses on the software interfaces people actually use every day: webpages, app windows, buttons, input boxes, and menus. It locates action targets directly in the rendered interface, which supports computer use and GUI automation.

This differs from text-only agents, which depend on structured inputs. Many real applications lack clean APIs, complete DOMs, or accessibility trees, so ShowUI works out what to do directly from the screen.
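
To illustrate what outputting clickable coordinates can involve, the sketch below converts a predicted click point into pixel coordinates for a given screenshot size. It assumes the model returns a normalized `[x, y]` pair in the range 0 to 1 as a string. That output format and the helper names `parse_point` and `to_pixels` are assumptions made for this example, not a description of ShowUI's actual interface.

```python
import ast
from typing import Tuple


def parse_point(model_output: str) -> Tuple[float, float]:
    """Parse a string such as "[0.5, 0.25]" into a normalized (x, y) pair.

    Assumption: the model emits a two-element list of floats in [0, 1].
    """
    value = ast.literal_eval(model_output.strip())
    if not (isinstance(value, (list, tuple)) and len(value) == 2):
        raise ValueError(f"expected a two-element point, got {value!r}")
    x, y = float(value[0]), float(value[1])
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError(f"point {value!r} is outside the normalized range")
    return x, y


def to_pixels(point: Tuple[float, float], width: int, height: int) -> Tuple[int, int]:
    """Scale a normalized point to pixel coordinates on a width x height screenshot."""
    x, y = point
    # Clamp so that a coordinate of exactly 1.0 maps to the last valid pixel.
    px = min(int(x * width), width - 1)
    py = min(int(y * height), height - 1)
    return px, py


if __name__ == "__main__":
    # Hypothetical model output for a 1000 x 800 screenshot.
    print(to_pixels(parse_point("[0.5, 0.25]"), 1000, 800))  # (500, 200)
```

Keeping the prediction normalized until the last moment means the same output can be applied to screenshots of different resolutions.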

Relationship to AI

One bottleneck for agent deployment is interface operation. Having a model write a plan is not the hard part; clicking the right place, understanding state changes, and recovering from failure inside complex software is harder.

ShowUI turns GUI visual understanding into a model task, making it a key path from chat agents toward real computer operation.
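
The three difficulties named above (clicking the right place, noticing state changes, and recovering from failure) can be pictured as an observe, act, verify loop. The sketch below is a generic illustration only: `Action`, `run_step`, and the four callables passed into it are hypothetical names, and nothing here is taken from ShowUI's code.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Action:
    """A hypothetical GUI action: a kind such as "CLICK" and a normalized position."""
    kind: str
    x: float
    y: float


def run_step(
    observe: Callable[[], Any],
    predict: Callable[[Any], Action],
    execute: Callable[[Action], None],
    changed: Callable[[Any, Any], bool],
    max_attempts: int = 3,
) -> bool:
    """Try one GUI step, retrying when the screen does not change.

    observe  -- returns the current screen state (for example a screenshot)
    predict  -- maps a screen state to the next Action (the model's job)
    execute  -- performs the Action on the real interface
    changed  -- reports whether the interface reacted to the Action
    """
    for _ in range(max_attempts):
        before = observe()
        action = predict(before)
        execute(action)
        after = observe()
        if changed(before, after):
            return True  # the interface reacted, so the step counts as done
        # Otherwise fall through and retry from a fresh observation.
    return False
```

In this framing the model supplies only `predict`; the surrounding loop is where state checking and recovery would live.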

Relationship to Singapore

ShowUI matters for Singapore because it bears directly on enterprise automation and agent tooling. Much AI deployment in Singapore happens inside finance, government, healthcare, and logistics systems, where many workflows still pass through legacy interfaces.

If GUI agents become a general capability, work such as ShowUI becomes a base module connecting models to real software workflows.

Key milestones

2024-10: ShowUI repository created
2025-02: ShowUI accepted to CVPR 2025

Resources

ShowUI on GitHub
ShowUI paper
Show Lab on Hugging Face

Other industry, academia, and research projects

NUS HPC-AI Lab

Colossal-AI

OpenMMLab

NExT-GPT

Show-o