Community Project Profile
CodeGen
Program synthesis model from the early open code-generation wave
- Organisation: Salesforce AI Research
- Group: International corporate lab
- Category: Code generation model
- Status: Classic open model
- Started: 2022-03
- Language / Form: Python
- License: Apache-2.0
- GitHub Stars: 5,176
- Updated: 2026-05-04
CodeGen is a notable example from the early wave of open code-generation models, released around the time OpenAI Codex popularized program synthesis.
What It Is
CodeGen is a family of open program-synthesis models released by Salesforce. It targets natural-language-to-code, code completion, and program generation tasks, with training and evaluation centered on code corpora.
Although today’s coding models have evolved into more capable agent-based and IDE-integrated forms, CodeGen remains an early representative of the open code-generation path.
AI Relevance
Code generation is one of the large-model application areas that has been productized fastest. CodeGen’s historical value is that it turned code models from a closed capability into research artifacts that anyone can download, evaluate, and modify.
It is also an early reference point for understanding how AI coding assistants evolved.
Singapore Relevance
CodeGen is not a Singapore-only project, but it is part of Salesforce AI Research’s global open-source portfolio and connects to the talent and research network around its Singapore lab.
On this site, it is a reminder that Singapore’s AI influence comes not only from government and universities but also from global projects connected to multinational research teams based here.
Milestones
- 2022-03: CodeGen repository created