With a single loop thread, processing time per message was already at the microsecond level; after switching servers, the backlog fell from as many as 60,000 packets to almost none.
When a single-threaded loop processes data, CPU performance depends on factors such as clock frequency, cache size, and microarchitecture. In general, CPUs with higher clock frequencies, larger caches, and newer microarchitectures perform better at single-threaded data processing.
Single-Threaded
Performance improvements aren’t always achieved by adding threads; there is no need to overcomplicate things. Review the project’s workflow, identify the time-consuming bottlenecks, and determine whether a single thread can meet the requirements. A single-threaded approach reduces complexity and minimizes potential issues.
Jumping straight to threading as the answer is often misguided.
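One way to look for the bottleneck before reaching for threads is to time each stage of the loop on a single thread. The following is only a minimal sketch: the stage names (`parse`, `convert`, `store`) and the comma-separated message format are assumptions made for illustration, not the project's actual pipeline.

```python
import time
from collections import defaultdict


def profile_stages(messages, stages):
    """Run every message through the stages on one thread, timing each stage.

    `stages` is a list of (name, function) pairs; the names are hypothetical.
    Returns (results, total nanoseconds spent per stage).
    """
    totals = defaultdict(int)
    results = []
    for msg in messages:
        value = msg
        for name, fn in stages:
            start = time.perf_counter_ns()
            value = fn(value)
            totals[name] += time.perf_counter_ns() - start
        results.append(value)
    return results, dict(totals)


def slowest_stage(totals):
    """Return the name of the stage with the largest accumulated time."""
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Hypothetical three-stage pipeline over an illustrative text feed.
    stages = [
        ("parse", lambda raw: raw.split(",")),
        ("convert", lambda fields: (fields[0], float(fields[1]))),
        ("store", lambda tick: tick),
    ]
    feed = ["ABC,10.5"] * 100_000
    _, totals = profile_stages(feed, stages)
    for name, ns in totals.items():
        print(f"{name}: {ns / len(feed) / 1000:.3f} us per message")
    print("bottleneck:", slowest_stage(totals))
```

If the slowest stage already fits within the per-message time budget, adding threads is unlikely to be the first thing worth trying.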
Events
Everything being processed is market data, which is latency sensitive.
I worked late into the night to release a new optimized version. In local testing with the API removed, the speed was acceptable: 42,000 tps.
After deployment to the server, tps dropped sharply to 21,000. I went home and tried it on a desktop: 79,000 tps. I started to suspect that something was wrong with the internal service virtual machines. My first guess was a frequency-related problem, since clock frequency is the biggest difference between the home desktop’s CPU and the server’s.
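To put these figures in perspective, throughput can be converted into an average time per message and compared against the server result. This is plain arithmetic on the three tps numbers quoted above; the machine labels are just dictionary keys chosen for the example.

```python
def per_message_us(tps):
    """Average time spent per message, in microseconds, at a given throughput."""
    return 1_000_000 / tps


# Throughput figures quoted in the text.
measured_tps = {"local test": 42_000, "server": 21_000, "home desktop": 79_000}

for machine, tps in measured_tps.items():
    ratio = tps / measured_tps["server"]
    print(f"{machine}: {per_message_us(tps):.1f} us per message, "
          f"{ratio:.2f}x the server")
```

That works out to roughly 24, 48, and 13 microseconds per message respectively, so the desktop is about 3.8 times as fast as the server on the same code.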
First, core count and clock frequency. The Intel(R) Xeon(R) CPU E7-4807 @ 1.87GHz has 6 physical cores and 12 logical cores with a clock speed of 1.87GHz, while the Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz has 8 physical cores and 16 logical cores with a clock speed of 2.60GHz. Extra cores do not help a single loop thread, so what matters here is the clock: on that measure the E5-2640 v3 has the advantage over the E7-4807 in single-threaded data processing.
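The nominal clock is embedded in the model strings quoted above, so a quick comparison can be scripted. This sketch assumes the string contains an `@ <number>GHz` suffix, as both of these do; the helper name `nominal_ghz` is made up for the example. The resulting ratio is only a first-order estimate, since it ignores turbo behaviour, cache, and per-clock efficiency.

```python
import re


def nominal_ghz(model_name):
    """Extract the nominal clock in GHz from a CPU model string.

    Expects a suffix such as '@ 1.87GHz'; returns None if it is absent.
    """
    match = re.search(r"@\s*([0-9.]+)\s*GHz", model_name)
    return float(match.group(1)) if match else None


e7 = "Intel(R) Xeon(R) CPU E7-4807 @ 1.87GHz"
e5 = "Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz"

ratio = nominal_ghz(e5) / nominal_ghz(e7)
print(f"nominal clock ratio E5/E7: {ratio:.2f}")
```

The nominal clocks alone give the E5-2640 v3 an advantage of about 1.39 times.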
Second, cache size. The E7-4807 has 18MB of L3 cache, while the E5-2640 v3 has 20MB. The E5-2640 v3 therefore has somewhat more cache, which can improve the cache hit rate and data access speed.
Finally, architecture. The E5-2640 v3 uses the newer Haswell architecture, while the E7-4807 is built on the older Westmere-EX design from the Nehalem family. Haswell does more work per clock cycle than the Nehalem generation, which should also benefit the E5-2640 v3 in single-threaded data processing.
Taken together, the E5-2640 v3 should outperform the E7-4807 when a single-threaded loop processes data. The actual difference still depends on factors such as the data processing algorithm, memory bandwidth, and system load, so it has to be measured case by case.
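Since the real difference has to be measured, the same small benchmark can be run on each machine. This is a hedged sketch rather than the author's test harness: `handle_tick` and the message format are hypothetical stand-ins for the real handler, and the median over several runs is used to damp noise from system load.

```python
import statistics
import time


def loop_tps(handler, messages):
    """Process all messages on the calling thread; return messages per second."""
    start = time.perf_counter()
    for msg in messages:
        handler(msg)
    elapsed = time.perf_counter() - start
    return len(messages) / elapsed


def median_tps(handler, messages, runs=5):
    """Repeat the measurement and return the median throughput."""
    return statistics.median([loop_tps(handler, messages) for _ in range(runs)])


def handle_tick(raw):
    """Hypothetical handler: parse a 'SYMBOL,PRICE' line into a tuple."""
    symbol, price = raw.split(",")
    return symbol, float(price)


if __name__ == "__main__":
    feed = ["ABC,10.5"] * 200_000
    print(f"median tps: {median_tps(handle_tick, feed):,.0f}")
```

Running the identical script on the server, the virtual machine, and the desktop gives numbers that can be compared directly, without relying on specification sheets.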