AI Inspiration Hub

What is this process of layering folders and then wrapping them in a namespace called?

When I was recently writing the algorithm service, this old problem popped up again the moment I implemented modules like twap and vwap.

If we rely on class names to carry semantics in C++, naming quickly spirals out of control. Names like TwapOrderManager, VwapOrderManager, and AlgoOrderManager all say, in effect, “I know my structure isn’t contained, but at least I’ll add a prefix.” Frankly speaking, organizing code into folders and then wrapping each folder in a namespace isn’t code snobbery; it fills the gap left by C++’s lack of a native package system like Java’s.
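
A minimal sketch of that folder-plus-namespace layout; all paths and names here are hypothetical illustrations of the pattern, not the actual service code:

```cpp
// Folder layout mirrors the namespace layout; one module per directory.
// Paths and class names below are hypothetical.

// algo/twap/order_manager.h
namespace algo::twap {
class OrderManager { /* TWAP slicing logic */ };
}  // namespace algo::twap

// algo/vwap/order_manager.h
namespace algo::vwap {
class OrderManager { /* VWAP tracking logic */ };
}  // namespace algo::vwap

// A call site names the module instead of relying on a class-name prefix:
// algo::twap::OrderManager om;   // instead of: TwapOrderManager om;
```

This is exactly the slot a Java package path like com.example.algo.twap fills natively.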

Google has released Gemma 4 (Part 3)

While browsing the forums this time, what struck me most wasn’t which company had topped another leaderboard, but a very basic statement: “Without enough VRAM, no matter how large the parameter count, it’s useless.”

I used to understand “the model is slow” purely as a compute problem. The more I read, though, the clearer it became that often the GPU can compute it just fine; the real problem is that the data cannot live in the right place. Change nothing but the memory path the weights travel, and token speed doesn’t merely slow down; it falls off a cliff.
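
A back-of-envelope way to see this: during decoding, every token has to stream the active weights through whatever bus they sit behind, so token speed is roughly bandwidth divided by bytes read per token. The figures below (a ~4B-active model at ~4-bit, rough bandwidth numbers for each tier) are illustrative assumptions, not measurements:

```cpp
// Back-of-envelope: decode speed is roughly memory-bandwidth-bound.
// All figures below are illustrative assumptions, not measurements.
#include <cstdio>

int main() {
    const double active_params   = 4e9;   // e.g. ~4B parameters read per token
    const double bytes_per_param = 0.5;   // ~4-bit quantization
    const double bytes_per_token = active_params * bytes_per_param;

    // Rough bandwidth of each place the weights could live (bytes/s).
    const double vram_bw = 360e9;  // on-GPU VRAM (RTX 3060 class)
    const double ddr_bw  = 50e9;   // system RAM via CPU offload
    const double pcie_bw = 16e9;   // weights streamed over PCIe 4.0 x16

    std::printf("weights in VRAM:    ~%.0f tok/s\n", vram_bw / bytes_per_token);
    std::printf("weights in DDR:     ~%.0f tok/s\n", ddr_bw  / bytes_per_token);
    std::printf("streamed over PCIe: ~%.0f tok/s\n", pcie_bw / bytes_per_token);
}
```

The formula is identical in every row; only the memory path changes, and that alone is an order-of-magnitude swing.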

Google has released Gemma 4 (Part 2)

If you only look at the leaderboard, 31B is clearly the most eye-catching. But pull out the actual machine, a never-upgraded RTX 3060 12GB, and the judgment flips immediately. How should I put it? For local deployment, what matters in the end isn’t which model looks the fanciest but which one you can live with long-term. For me, the one truly worth running first this time isn’t 31B; it’s 26B A4B.
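
Reading “26B A4B” as 26B total parameters with 4B active per token (my interpretation of the name), a rough estimate shows why. A 31B dense model at ~4-bit is roughly 15.5 GB of weights, so on a 12 GB card some of every token’s weight traffic is forced onto the slow path, while the MoE only touches ~4B parameters per token. The splits and bandwidth figures below are illustrative assumptions:

```cpp
// Why a 4B-active MoE can be livable on a 12 GB card while a 31B dense
// model is not: compare the weight bytes each must stream per token.
// The VRAM/DDR splits and bandwidth figures are illustrative assumptions.
#include <cstdio>

int main() {
    const double q = 0.5;          // ~4-bit quantization, bytes per parameter
    const double vram_bw = 360e9;  // RTX 3060 VRAM bandwidth, bytes/s
    const double ddr_bw  = 50e9;   // system RAM reached via CPU offload

    // 31B dense: say ~20B params fit in VRAM (leaving room for KV cache),
    // the remaining ~11B spill to DDR, and every token reads all of them.
    double dense_s = 20e9 * q / vram_bw + 11e9 * q / ddr_bw;

    // 26B A4B: only ~4B params are read per token; assume half of those
    // reads hit VRAM-resident shared weights and half hit cold experts
    // parked in system RAM.
    double moe_s = 2e9 * q / vram_bw + 2e9 * q / ddr_bw;

    std::printf("31B dense (20B/11B split): ~%.1f tok/s\n", 1.0 / dense_s);
    std::printf("26B A4B  (2B/2B active):   ~%.1f tok/s\n", 1.0 / moe_s);
}
```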

Google has released Gemma 4 (Part 1)

On release day, what I wanted to do was simple: find the upgraded counterpart to Gemma 3 and download it to run. But after looking around, I was a bit stunned. The familiar 4B / 12B / 27B naming is gone; in its place are E4B, 26B A4B, and 31B. How should I put it? What Google truly changed this time wasn’t just the model sizes; it changed how you’re supposed to understand this batch of models in the first place.

零食很忙 (Snack Busy) opening in Songjiang University Town is no accident

I stay home most of the time, so going out during the 2026 Qingming holiday was a rare outing. Wandering onto Wenhui Road in Songjiang University Town, my first thought wasn’t the scenery but the shops.

Snack stores are nothing new; they’ve sprung up all over Shanghai in recent years. What really caught me off guard was seeing 零食很忙 (Snack Busy) here. I used to see this brand mostly back in my hometown and had always assumed it kept some distance from Shanghai. It turns out one street, Wenhui Road, completely shattered that stereotype of mine.

My current assessment is quite clear: a store like 零食很忙 can open in Songjiang not because Shanghai has suddenly “slipped” down a tier, but because Songjiang was never just the peripheral edge of Shanghai that many people imagine. Even treated as a suburb, the area has plenty of foot traffic, a young enough customer base, and long enough dwell times; even dismissed as a dormitory town, it is backed by the historical foundation of Songjiang Prefecture, the innovation resources of the university town, and its new positioning as Shanghai’s southwest gateway.

Codex defaults to medium, but I later switched to high.

During my time with Codex, one thing always felt a bit awkward: the default thinking level is medium, yet everyone chatting online about GPT-5.4 uses a very confident tone. When it comes down to actual use, what exactly separates medium, high, and xhigh? The official documentation doesn’t offer a particularly straightforward comparison. My current conclusion is simple: for daily coding, I start directly at high. Medium isn’t unusable; it’s fine for quick tasks, minor tweaks, or exploring a direction. But for multi-file modifications, ambiguous requirements, or anything that requires judging while reading code, medium too easily spends its reasoning budget in the wrong places. I rarely use xhigh; I save it for genuinely hard problems where I get stuck.

Getting AI to write the blog eventually has to be turned into engineering (Part 3)

After going through all the configurations in the repository, I am even more certain about one thing: what matters in the end is not how strong any single model is, but rather who should bear the cost at each layer.

The most obvious signal: the currently active published.runtime.json is still the one generated on April 2, 2026 for minimax-m2, yet the configuration entry from April 3, 2026 at 16:38, commit 5f17088, has already switched the default provider for blog-style-suite to the local gemma-4-26b-a4b in LM Studio. That may look inconsistent, but it isn’t; it’s exactly what a pipeline starting to specialize looks like.
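
The idea underneath is just a per-suite routing table: each layer of the pipeline declares which provider bears its cost, and the published snapshot is allowed to lag the newest entry. Here is a minimal sketch of that idea, not the actual file format; the second suite name and the exact provider strings are hypothetical:

```cpp
// Per-suite default providers: a cheap local model for the high-frequency
// style layer, a hosted model kept for a sparser layer. Sketch only; the
// real mapping lives in published.runtime.json, whose format isn't shown here.
#include <iostream>
#include <map>
#include <string>

int main() {
    std::map<std::string, std::string> provider_defaults = {
        {"blog-style-suite", "lmstudio/gemma-4-26b-a4b"},  // local, runs often
        {"publish-suite",    "minimax-m2"},                // hosted; suite name hypothetical
    };
    for (const auto& [suite, provider] : provider_defaults)
        std::cout << suite << " -> " << provider << '\n';
}
```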

Making the "AI writes blog" thing into an engineering project later (Part II)

If the token budget is large enough, the lowest-effort method is actually quite crude: just feed the model the historical articles and let it learn on its own. The problem is that this only suits occasional writing, not continuous production. Treat blogging as a long-term workflow, and relying on raw historical articles alone quickly goes from “simple and direct” to “expensive and messy.”
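
A rough cost comparison makes the point; every figure below (article count, average length, price per million input tokens, call volume) is an illustrative assumption, not a real quote. Resending the raw archive pays for the same tokens on every call, while a distilled style guide is paid for once and stays small:

```cpp
// Back-of-envelope: raw archive vs a distilled style guide as context.
// Every figure below is an illustrative assumption, not a real price.
#include <cstdio>

int main() {
    const double articles     = 200;   // historical posts in the archive
    const double tok_per_post = 1500;  // average post length in tokens
    const double usd_per_mtok = 2.0;   // hypothetical input price per 1M tokens
    const double calls_per_mo = 20;    // generations per month

    const double raw_ctx   = articles * tok_per_post;  // resent on every call
    const double guide_ctx = 3000;                     // distilled style guide

    // ~300k tokens per call also brushes against context limits -- the
    // "messy" half of "expensive and messy".
    std::printf("raw archive: %.0fk tokens/call, ~$%.2f/month\n",
                raw_ctx / 1e3, raw_ctx * calls_per_mo * usd_per_mtok / 1e6);
    std::printf("style guide: %.0fk tokens/call, ~$%.2f/month\n",
                guide_ctx / 1e3, guide_ctx * calls_per_mo * usd_per_mtok / 1e6);
}
```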