The current wave of model competition has escalated to pricing and chips.

Scrolling through the model updates tonight was genuinely mind-boggling.

My current judgment is straightforward: this round is no longer merely a wave of model releases. It involves three fronts working simultaneously—model capabilities, API pricing, and chip stack ownership. Anyone who focuses on only one of these aspects will likely have a biased view. And it is precisely because these three dimensions are intertwining that the large model sector appears so intensely competitive.

OpenAI raised high-end pricing first

What is most striking about this isn’t the release of GPT-5.5 itself, but rather its pricing signal.

OpenAI’s official pricing page dated 2026-04-23 states that the GPT-5.5 standard tier costs $5 / 1M tokens for input and $30 / 1M tokens for output, while GPT-5.4 is $2.5 / 1M and $15 / 1M. That is not “nearly doubled”; it is exactly doubled.
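To make the gap concrete, here is a minimal sketch of what the doubling means per request. The per-1M-token rates are the ones quoted above; the request’s token counts are purely hypothetical.

```python
# Per-token prices quoted above (USD per 1M tokens).
PRICES = {
    "gpt-5.5": {"input": 5.00, "output": 30.00},
    "gpt-5.4": {"input": 2.50, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the listed per-1M-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical agentic-coding request: 80k tokens in, 8k tokens out.
old = request_cost("gpt-5.4", 80_000, 8_000)
new = request_cost("gpt-5.5", 80_000, 8_000)
print(f"GPT-5.4: ${old:.2f}  GPT-5.5: ${new:.2f}  ratio: {new / old:.1f}x")
```

Because both the input and output rates doubled, the ratio is exactly 2x regardless of the input/output mix.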

In my view, this action is not simply about raising prices. It is more like openly declaring something: GPT-5.5 does not want to occupy the position of “something everyone can casually try,” but rather the premium tier for high-value, high cost-of-failure, and highly complex tasks. By placing the emphasis on agentic coding, computer use, knowledge work, and early scientific research on its release page, OpenAI is essentially saying: if you genuinely want to integrate the model into the most critical layer of your workflow, you must accept a higher unit price.

To be honest, this is a bold, aggressive move.

Because once leading vendors proactively raise high-end pricing, the market is no longer simply about “who is stronger,” but quickly shifts to being about “whose strength justifies this price.” This gives other players a very practical angle of attack: you don’t necessarily have to comprehensively surpass OpenAI; just delivering strong cost-performance in high-frequency scenarios like coding, Agents, and long task stability will be enough to pose a significant challenge.

This wave of domestic models is no longer simply catching up

If you read the recent developments merely as “everyone following OpenAI with new model releases,” you are underestimating the intensity of this wave.

On the official website and technical blog for Kimi K2.6, the focus is no longer simple chatting but long-context coding, Agent Swarm, and continuous execution. In Moonshot’s own examples, the model can run more than 4,000 tool calls and execute continuously for over 12 hours; the prices listed on the API platform homepage are cache $0.16 / MTok, input $0.95 / MTok, and output $4.00 / MTok. You cannot take vendor self-test results at face value, but the pricing posture is very clear: Kimi is not trying to win on premium; it is aiming straight at the agentic coding layer.
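A rough sketch of why the separate cache tier matters in long agent loops, using the list prices above. In such loops the same prompt prefix is resent on every step, so a high cache-hit rate is plausible, but the rate and token counts here are my assumptions, not Moonshot’s numbers.

```python
# Kimi K2.6 list prices quoted above (USD per 1M tokens).
CACHE_HIT = 0.16   # cached input tokens
INPUT = 0.95       # uncached input tokens
OUTPUT = 4.00

def blended_cost(input_tokens: int, output_tokens: int, cache_hit_rate: float) -> float:
    """USD cost of one step when a fraction of input tokens hits the prompt cache."""
    cached = input_tokens * cache_hit_rate
    fresh = input_tokens - cached
    return (cached * CACHE_HIT + fresh * INPUT + output_tokens * OUTPUT) / 1_000_000

# Hypothetical agent step: 200k input tokens, 90% cache hits, 5k output tokens.
print(f"${blended_cost(200_000, 5_000, 0.9):.4f} per step")
```

At these assumed numbers the cached tier cuts the input-side cost by roughly a factor of five, which is exactly the kind of workload the cache price seems designed for.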

Xiaomi is similar. MiMo-V2.5 and MiMo-V2.5-Pro, released on 2026-04-22, emphasize not conversational feel but agentic capability, multimodality, and long-horizon coherence. MiMo-V2.5 puts 1M context, multimodality, and Agents right on the first screen of its homepage; V2.5-Pro launched as a public beta at unchanged prices while continuing to raise the bar for complex software engineering and long-task consistency. This signal matters: the default battleground for new models is now “can it be integrated into a toolchain, can it run long tasks, and can it actually get work done.”

DeepSeek is even more direct. The V4 Preview on 2026-04-24 showcased 1M context, V4-Pro, V4-Flash, and day-one API availability. The pricing is what matters most: in the official documentation, deepseek-v4-pro costs $1.74 / 1M for input and $3.48 / 1M for output; deepseek-v4-flash is lower still at $0.14 / 1M input and $0.28 / 1M output. Compare these numbers with GPT-5.5 and the impression is completely different.
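Putting the quoted list prices side by side for one identical, hypothetical workload makes the positioning obvious. This deliberately ignores capability differences and Kimi’s cache discount, so treat it as a naive comparison only.

```python
# List prices quoted in this post (USD per 1M tokens); capability differences ignored.
MODELS = {
    "gpt-5.5":           {"input": 5.00, "output": 30.00},
    "kimi-k2.6":         {"input": 0.95, "output": 4.00},   # cache tier ignored
    "deepseek-v4-pro":   {"input": 1.74, "output": 3.48},
    "deepseek-v4-flash": {"input": 0.14, "output": 0.28},
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a workload at the listed per-1M-token rates."""
    p = MODELS[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Same hypothetical workload for every model: 500k input / 50k output tokens.
for name in MODELS:
    print(f"{name:18s} ${cost(name, 500_000, 50_000):.3f}")
```

On this made-up workload the spread runs from about $4.00 down to under $0.10, a gap of nearly 50x between the premium tier and the cheapest tier.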

This is no longer about “you having state-of-the-art models, and me having one too.”

This is:

  • Some are pushing up the high-end price point;
  • Others are positioning for the mid-to-high end with open capabilities, long tasks, and tool calling;
  • And some are setting prices at a level you cannot ignore.

The DeepSeek and Huawei Story Is Where the Rumors Go Off Track

Another point, and the one most likely to be misreported tonight: has DeepSeek V4 actually been adapted for Huawei chips, or do we have to wait until the second half of the year?

Based on the public information I could gather, the more accurate statement is: it is supported already; “the second half of the year” refers to costs continuing to fall after larger-scale deployment.

DeepSeek’s official news page on 2026-04-24 states that V4 Preview was already launched and the API was available that day. On the same day, Huawei’s related public announcements also clearly stated that the entire Ascend supernode product line supports DeepSeek V4. Therefore, claiming that “full adaptation can only happen in the second half of the year” is at least inconsistent with the public information available today.

The actual meaning related to “the second half of the year” lies in another dimension: Huawei mentioned that after Ascend 950 supernodes undergo large-scale mass production and deployment in the second half of 2026, the price of V4-Pro will drop significantly. In other words, the second half is not about “just starting to function,” but rather that “the capability to function has already been established; what we need to watch next is whether the price can be cut even further after scaling up.”

The difference is significant.

The former relates to whether the technology is feasible, while the latter relates to whether commercialization can be scaled up.

I actually think that this is the signal most worth watching tonight. Because once a model of DeepSeek’s caliber genuinely begins to stably integrate with the Huawei Ascend ecosystem, the large model competition will no longer just be confined to model companies; it will become an interconnected link between training, inference, chips, clusters, and pricing structures. Whoever can make the entire pipeline run smoothly will gain greater leverage.

Why I Say the Competition Is White-Hot

In the past, when people discussed model wars, they often still focused on leaderboards, parameters, benchmark scores, and a simple statement that “progress has been made.” Things are quite different now.

The current competition happens at least simultaneously on three layers:

First, the model itself must be capable, especially at coding, Agents, and long context: capabilities that can genuinely integrate into workflows.

Second, the pricing must be justifiable, because once the model handles actual production workloads, the token cost is no longer an abstract figure but a month-end bill.
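To illustrate how per-token prices turn into a month-end bill, a small sketch using the GPT-5.5 rates quoted earlier; the request volume and token counts are invented for illustration.

```python
# Turn per-token prices into a month-end bill. The $5 / $30 per-1M-token rates
# are GPT-5.5's quoted prices; all workload numbers below are hypothetical.
INPUT_PRICE, OUTPUT_PRICE = 5.00, 30.00   # USD per 1M tokens

def monthly_bill(requests_per_day: int, in_tok: int, out_tok: int, days: int = 30) -> float:
    """Projected monthly USD spend for a steady daily request volume."""
    per_request = (in_tok * INPUT_PRICE + out_tok * OUTPUT_PRICE) / 1_000_000
    return per_request * requests_per_day * days

# A small team running 2,000 agent requests/day at 60k in / 6k out tokens each.
print(f"${monthly_bill(2_000, 60_000, 6_000):,.0f} / month")
```

At these assumed volumes the bill lands in the tens of thousands of dollars per month, which is exactly why unit price stops being abstract once real workloads arrive.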

Third, which chips, clusters, and software stack the model runs on determines whether it can keep lowering prices, ensure stable supply, and route around geopolitical and supply-chain constraints.

When all three layers compete at once, the contest truly becomes white-hot.

So I no longer want to ask which of these companies is number one. That question is becoming increasingly uninteresting.

The more useful questions are:

  • Are the tasks at hand worth paying GPT-5.5’s premium prices for?
  • Can your long-context coding and Agent workflows be entrusted to cheaper but far more aggressive challengers like Kimi K2.6, MiMo-V2.5, or DeepSeek V4?
  • Does your business have hard constraints around domestic computing power, compliance, local deployment, or supply-chain stability?

This is the selection process in the real world.
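The three questions above can be sketched as a naive routing rule. The branch order and model labels are illustrative only, not a recommendation from any vendor.

```python
def pick_tier(high_stakes: bool, long_agent_workflows: bool, domestic_constraints: bool) -> str:
    """Naive routing rule mirroring the three selection questions; labels illustrative."""
    if domestic_constraints:
        # Hard constraints (compliance, local deployment, supply chain) dominate.
        return "domestic stack (e.g. DeepSeek V4 on Ascend)"
    if high_stakes:
        # High-value, high cost-of-failure tasks can justify premium pricing.
        return "premium tier (e.g. GPT-5.5)"
    if long_agent_workflows:
        # Long-running coding/agent loops favor cost-performance challengers.
        return "cost-performance tier (e.g. Kimi K2.6 / MiMo-V2.5)"
    return "cheap tier (e.g. deepseek-v4-flash)"

print(pick_tier(high_stakes=False, long_agent_workflows=True, domestic_constraints=False))
```

Real selection is messier, of course, but the point stands: the constraint layer is checked first, price-per-capability second.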

Ultimately, the market is heating up not simply because there are more product launches. The real signal is that the leader dares to raise prices, challengers attack head-on, and even chip manufacturers have entered the fray by aligning with model roadmaps. Once models, pricing, and computing power are bound together, this industry will find it hard to return to the spectator stage of two years ago, when watching benchmark leaderboards was enough.

In the second half of the year, what I care about is not who can keep posting higher scores, but who can balance capability, cost, and supply chain at the same time.

Whoever manages that is not just a flash in the pan.


Notes on Writing

Original Prompt

In the past two days, various companies have released many new models: ChatGPT 5.5, DeepSeek V4, Xiaomi 2.5, Kimi 2.6; ChatGPT 5.5 even raised its price, nearly doubling it. Rumors circulated that DeepSeek V4 is compatible with Huawei chips, but tonight’s latest news reports that full compatibility will only arrive in the second half of the year. The large model sector has fully entered a white-hot phase.

Summary of Writing Ideas

  • Focus the core narrative on how “the competition has expanded beyond model scores to include pricing and the underlying chip/hardware stack,” rather than simply writing a chronological news report.
  • Use the continuous release timeline from 2026-04-20 to 2026-04-24 to ground the feeling of buzz in concrete dates and structure.
  • Isolate and