Tags

11 pages

Large Model

Alibaba Large Model Strategy

Alibaba (Ali) has released numerous large models, not simply a matter of “volume chasing,” but a carefully planned “Model-as-a-Service” (MaaS) ecosystem strategy. There are multiple considerations behind this, which can be summarized as “internal empowerment and external ecosystem building.”

Internal Business Driven (Inward Empowerment)

Alibaba possesses an extremely vast and diverse business landscape, including e-commerce (Taobao & Tmall), finance (Ant Financial), logistics (Cainiao), cloud computing (Aliyun), and entertainment (Youku), among others.

Recent Usage Experiences of Large Models

Currently, no large model stands out as particularly superior; each company has its own strengths and preferred use cases.

Technical Documentation

For feeding code or asking IT technical questions: ChatGPT and Gemini

Write Code

Gather requirements and request code modifications: Claude

Blog Translation Project Musings: Historical Conversations

The initial design of the blog translation project was overly complex – first parsing Markdown format, then using placeholders to protect the content, and finally sending it to a large model for translation. This was entirely unnecessary; large models inherently possess the ability to recognize Markdown syntax and can directly process the original content while maintaining formatting during translation.

Our work shifted from debugging code to debugging the prompting of the model. Model: google/gemma-3-4b Hardware: Nvidia 3060 12GB Indeed, we chose a non-thinking model – thinking models were inefficient when executing translation tasks. We compared the performance of 4b and 12b parameters, and for translation purposes, gemma3’s 4b parameter was sufficient; there was no significant advantage in terms of 12b parameters. 12b parameter speed: 11.32 tok/sec , 4b parameter speed: 75.21 tok/sec.