AI Industry & Applications · 2025-07-01 · 03:00

Semantic-aware multimodal multilingual deep learning systems for e-commerce

Speaker
AI Singapore
AI research and talent-development organisation
Type
Academic

In Brief

An AISG 100E project tackles multilingual, multimodal e-commerce settings, addressing two key challenges: scarcity of labelled data for low-resource languages and complex semantic learning.

Key Takeaways

  • An AISG 100E project with Shopee replaced manual task flows with fine-tuned LLMs, boosting Taskbot completion and correction rates.
  • The upgraded Shopee chatbot incorporates negative feedback into fine-tuning, lifting customer satisfaction and cutting failed interactions.
  • NUS trained a Southeast Asia multilingual LLM whose win rate against the GPT baseline jumped from 18% to 50%, and to 56% with on-policy sampling.
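The win-rate figures above can be read as the share of pairwise comparisons in which the regional model's answer was preferred over the GPT baseline's. The sketch below shows that arithmetic only. The function name, the judgment labels, the tie handling, and the sample counts are all assumptions for illustration; the talk does not describe its evaluation protocol.

```python
from collections import Counter


def win_rate(judgments, count_ties_as_half=False):
    """Fraction of pairwise comparisons won by the candidate model.

    judgments: iterable of "win", "loss", or "tie", from the candidate's
    point of view. Tie handling is an assumption, not taken from the talk.
    """
    counts = Counter(judgments)
    total = counts["win"] + counts["loss"] + counts["tie"]
    if total == 0:
        raise ValueError("no judgments supplied")
    wins = counts["win"]
    if count_ties_as_half:
        wins += 0.5 * counts["tie"]
    return wins / total


# Hypothetical judgment sets, sized only to reproduce the quoted percentages.
before_alignment = ["win"] * 18 + ["loss"] * 82
after_alignment = ["win"] * 50 + ["loss"] * 50
with_on_policy = ["win"] * 56 + ["loss"] * 44

for name, js in [("before", before_alignment),
                 ("after", after_alignment),
                 ("on-policy", with_on_policy)]:
    print(f"{name}: {win_rate(js):.0%}")
```

A 50% win rate means the model is preferred as often as the baseline, so the reported move from 18% to 50% amounts to reaching parity under whatever judging scheme was used.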

Summary

AISG and Shopee built advanced multilingual, multimodal deep learning models to handle the semantic complexity of regional e-commerce. Previously Shopee leaned on manually constructed task flows that were expensive and narrow in coverage. The team rebuilt Taskbot with fine-tuned large language models and reports strong completion and correction rates, though the talk gives no figures.

A second application reworked the Shopee chatbot. The earlier system used traditional machine learning fine-tuned only on successful cases. The new pipeline trains on both positive and negative feedback, raising customer satisfaction and dropping the failure rate. NUS also trained its own Southeast Asia LLM from scratch. Its core innovation, a bidirectional negative-feedback loss, enables stable preference alignment under sparse supervision. Win rate against the GPT baseline jumped from 18% to 50%, and on-policy sampling pushed it to 56%. Findings have been published at leading conferences, with selected models slated for Shopee production once mature.
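The talk does not give the formula for the bidirectional negative-feedback loss, so it is not reproduced here. As a general illustration of training on both preferred and rejected responses, the sketch below implements the standard Direct Preference Optimization (DPO) loss for a single preference pair. This is a swapped-in, well-known technique and not the project's method. The function name, argument names, and the default `beta` are assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained or under a frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities: the policy has moved toward the chosen
# response and away from the rejected one, relative to the reference.
improved = dpo_loss(-1.0, -5.0, -2.0, -2.0)
# The reverse case: the policy now favours the rejected response.
regressed = dpo_loss(-5.0, -1.0, -2.0, -2.0)
print(improved < regressed)
```

The rejected response enters the loss directly, which is the sense in which failed interactions contribute a training signal instead of being discarded.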

Full transcript

Caption language: en · Fetched: 2026-05-02

E-commerce is experiencing exponential growth, playing an increasingly vital role in the global economy. In recent years, deep learning has been increasingly adopted across a wide range of e-commerce applications. However, existing deep learning models struggle to effectively understand and process information when it's presented in multiple languages and different modalities. In collaboration with Shopee, we have developed advanced multilingual and multimodal models, published our findings in leading conferences, and successfully validated our models on Shopee's platform. Previously, Shopee relied heavily on manually constructed task flows, which led to high costs and limited coverage. But now, we've transformed customer interactions by harnessing the power of fine-tuned large language models.

Today, Taskbot surpasses every expectation, achieving an impressive completion rate and correction rate. The result: customer interactions are now smarter, faster, and more reliable than ever before. In another application, the Shopee chatbot originally used traditional machine learning, fine-tuning only on good cases while ignoring the failures. Our solution changes that, incorporating both positive and negative feedback into LLM fine-tuning. This enhanced approach has driven a significant boost in customer satisfaction and a notable drop in failed interactions. We also trained our own large language model from scratch. Multilingual LLMs for Southeast Asia face unique challenges, especially due to limited data for low-resource languages. At NTUNC, we're building a model specifically tailored to this region.

Our key innovation, a bidirectional negative feedback loss, enables stable preference alignment even with scarce supervision. This led to a significant improvement. Win rates jumped from 18% to 50% against the GPT baseline. With on-policy sampling, we pushed that further to 56%. For Shopee users, this translates directly to a much smarter LLM that truly understands their needs, leading to more helpful and accurate interactions every single time they use the app.
