
Project information

LAVIS / BLIP

Vision-language foundation models and a one-stop library; a cornerstone of global image-text AI

GitHub stars
11.2k+
Flagship models
BLIP / BLIP-2
Direction
image-text AI
Institution
Salesforce AI Research Singapore
Group
International corporate lab
Category
Vision-language foundation models
Status
Classic open-source asset
Launched
2022-08
Language / Format
Python / Jupyter Notebook
License
BSD-3-Clause
GitHub Stars
11,214
Last updated
2026-05-04

LAVIS / BLIP is a major contribution from Salesforce’s Singapore research team to global vision-language AI, making image-text understanding, captioning, VQA, and multimodal pretraining more reusable through open source.

Description

LAVIS stands for Library for Language-Vision Intelligence, a unified library for vision-language research and applications. BLIP and BLIP-2 are the most influential model lines within that family.

Developers can use it to load pretrained models for captioning, visual question answering, image-text retrieval, multimodal alignment, and related tasks.

Relationship to AI

The BLIP family is a foundational building block of multimodal AI. Many later vision-language models, data-generation pipelines, and image-text alignment studies are directly or indirectly influenced by it.

Its value lies not only in high citation counts but in reusable code and models that lower the barrier to entry for subsequent research.

Relationship to Singapore

Salesforce’s Singapore lab demonstrates that international corporate research teams in Singapore are not merely sales or regional offices; they can produce globally significant foundational research.

This is an important but often underestimated layer of Singapore’s AI ecosystem: multinational labs connect local talent, global research networks, and open-source influence.

Key milestones

  1. 2022
    BLIP paper published at ICML 2022
  2. 2023
    BLIP-2 paper published at ICML 2023
