In a legacy C++ project, we communicate over a custom protocol whose payload is laid out as a two-dimensional array. When large volumes of data are processed, the protocol layer has to traverse the array and serialize it to produce logs. That serialization was inefficient, so under high load the system stuttered noticeably, and the business side reported the lag.
Problem Diagnosis
To track the problem down, we first profiled the system and saw that CPU usage climbed sharply and response times grew whenever large amounts of data were processed. The logs revealed a large number of serialization operations; these were slow when walking the two-dimensional array and dragged overall performance down.
Capturing the service's thread stacks with the pstack tool, we found that the logging thread was spending most of its time concatenating strings.
This is the focus of today's post: different ways of accumulating a string differ enormously in efficiency. The legacy code used the + operator, which keeps creating temporary objects and is therefore very slow. The kind of slow where you know it is bad, but you don't know just how bad.
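To make the difference concrete, here is a minimal sketch (not the project code) of what a single loop iteration does in each of the two styles:

#include <string>
// Minimal sketch: the two accumulation styles compared in this post.
void concat_with_plus(std::string& s, const std::string& piece)
{
    // Builds a temporary that copies everything accumulated so far, then assigns it back;
    // repeated in a loop, the total work grows roughly quadratically.
    s = s + piece;
}
void concat_with_append(std::string& s, const std::string& piece)
{
    // Appends in place, reusing the existing buffer; reallocations are amortized,
    // so the total work stays roughly linear.
    s += piece;
}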
Demo Verification
Starting from the project code, we stripped out the business logic and wrote a small demo to verify the string-concatenation issue. It was built and run in Release mode with the VS2022 compiler on Windows and the GCC 8.5 compiler on Linux, and the timings were compared.
Key Points
The project uses method 4. Before looking at the numbers, readers may want to guess which method is fastest and which is slowest; I was still surprised when I saw the results.
- Method 1 (+= concatenation): append each field directly to the string with +=.
- Method 2 (std::ostringstream concatenation): use a stream (std::ostringstream) to concatenate each field; this is more efficient, especially when concatenating large amounts of data.
- Method 3 (+= concatenation with pre-allocated memory): call reserve up front so the string has enough memory, which cuts down on reallocations and improves performance (a standalone sketch of this effect follows the list).
- Method 4 (bodys = bodys + body + "\n"): every concatenation creates a new temporary string object, which hurts performance badly at scale, because each concatenation involves a fresh allocation and a copy.
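The effect behind method 3 can be observed directly by watching capacity(): each time the capacity changes during appending, a reallocation (and copy) has happened. A minimal, standalone sketch, with made-up sizes that are unrelated to the benchmark data:

#include <iostream>
#include <string>
// Minimal, standalone sketch: count reallocations with and without reserve().
// The sizes here are made up and unrelated to the benchmark data.
int main()
{
    std::string without_reserve;
    std::string with_reserve;
    with_reserve.reserve(10000 * 10); // assume the final size can be estimated up front
    size_t reallocations = 0;
    size_t last_capacity = without_reserve.capacity();
    for (int i = 0; i < 10000; ++i)
    {
        without_reserve += "0123456789";
        if (without_reserve.capacity() != last_capacity) // capacity change == a reallocation happened
        {
            ++reallocations;
            last_capacity = without_reserve.capacity();
        }
        with_reserve += "0123456789"; // never outgrows the reserved capacity, so no reallocation
    }
    std::cout << "Reallocations without reserve: " << reallocations << "\n";
    std::cout << "Final capacity with reserve:   " << with_reserve.capacity() << "\n";
    return 0;
}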
As the results below show, the project happened to have picked the slowest method of all.
Going a step further, let us look at how well the compilers on the two platforms optimize this code. Microsoft's Visual Studio is as good as ever and optimizes the string handling very effectively, while GCC does noticeably worse here. Since the code ran on different machines, the two sets of numbers are not directly comparable; instead, compare the differences between the concatenation methods within each run.
VS2022 compiler on Windows
----------------------------------------
Data Generation Time: 0.054 seconds.
----------------------------------------
----------------------------------------
Data Merging Performance:
----------------------------------------
+ Data merging (+=) took: 0.053 seconds.
+ ostringstream Data merging took: 0.054 seconds.
+ Pre-reserved Data merging took: 0.045 seconds.
+ Data merging (bodys = bodys + body + "\n") took: 16.108 seconds.
----------------------------------------
Data Merging Complete.
----------------------------------------
Program finished.
GCC 8.5 compiler on Linux
----------------------------------------
Data Generation Time: 0.108 seconds.
----------------------------------------
----------------------------------------
Data Merging Performance:
----------------------------------------
+ Data merging (+=) took: 0.100 seconds.
+ ostringstream Data merging took: 0.083 seconds.
+ Pre-reserved Data merging took: 0.057 seconds.
+ Data merging (bodys = bodys + body + "\n") took: 29.298 seconds.
----------------------------------------
Data Merging Complete.
----------------------------------------
Program finished.
Complete Code
#include <iostream>
#include <string>
#include <vector>
#include <random>
#include <chrono>
#include <sstream>
#include <iomanip>
typedef std::vector<std::string> DataRow;
typedef std::vector<DataRow> DataGroup;
struct ResponsePackage
{
std::string ErrorInfo;
DataRow Head;
std::string ClientId;
std::string UUID;
std::string MsgID;
std::string SessionID;
std::string ExtraInfo1;
std::string ExtraInfo2;
DataGroup DataBody;
};
// Generate specified length of random string
std::string generateRandomString(size_t length)
{
const char charset[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
const size_t max_index = sizeof(charset) - 2; // last valid index; excludes the terminating '\0'
std::string random_string;
random_string.reserve(length);
std::random_device rd;
std::mt19937 generator(rd());
std::uniform_int_distribution<> distribution(0, max_index);
for (size_t i = 0; i < length; ++i)
{
random_string += charset[distribution(generator)];
}
return random_string;
}
void create_large_string()
{
// Example request package with 50 fields
ResponsePackage requestPackage;
requestPackage.Head = {
"Field1", "Field2", "Field3", "Field4", "Field5",
"Field6", "Field7", "Field8", "Field9", "Field10",
"Field11", "Field12", "Field13", "Field14", "Field15",
"Field16", "Field17", "Field18", "Field19", "Field20",
"Field21", "Field22", "Field23", "Field24", "Field25",
"Field26", "Field27", "Field28", "Field29", "Field30",
"Field31", "Field32", "Field33", "Field34", "Field35",
"Field36", "Field37", "Field38", "Field39", "Field40",
"Field41", "Field42", "Field43", "Field44", "Field45",
"Field46", "Field47", "Field48", "Field49", "Field50"
};
requestPackage.ClientId = "ClientID";
requestPackage.UUID = "UUID";
requestPackage.MsgID = "MsgID";
requestPackage.SessionID = "SessionID";
requestPackage.ExtraInfo1 = "ExtraInfo1";
requestPackage.ExtraInfo2 = "ExtraInfo2";
// Start timing for data generation
auto start_gen = std::chrono::high_resolution_clock::now();
// Generate 10,000 rows of data, each with 50 fields
for (size_t i = 0; i < 10000; ++i)
{
DataRow dataRow(50, "This is a test string");
requestPackage.DataBody.push_back(dataRow);
}
// End timing for data generation
auto end_gen = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_gen = end_gen - start_gen;
// Display result generation time
std::cout << "\n----------------------------------------\n";
std::cout << "Data Generation Time: " << std::fixed << std::setprecision(3) << duration_gen.count() << " seconds.\n";
std::cout << "----------------------------------------\n";
// Data merging using different methods
std::cout << "\n----------------------------------------\n";
std::cout << "Data Merging Performance:\n";
std::cout << "----------------------------------------\n";
{
// Method 1: Using '+=' string concatenation
auto start_merge = std::chrono::high_resolution_clock::now();
std::string bodys("");
for (auto& vec : requestPackage.DataBody)
{
std::string body("This is a test string");
for (auto& item : vec)
{
body += item + " ";
}
bodys += body + "\n";
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ Data merging (+=) took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
{
// Method 2: Using ostringstream
auto start_merge = std::chrono::high_resolution_clock::now();
std::ostringstream bodys;
for (auto& vec : requestPackage.DataBody)
{
std::ostringstream body;
body << "This is a test string";
for (auto& item : vec)
{
body << item << " ";
}
bodys << body.str() << "\n";
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ ostringstream Data merging took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
{
// Method 3: Pre-allocated memory
auto start_merge = std::chrono::high_resolution_clock::now();
std::string bodys;
bodys.reserve(10000 * 50 * 22); // Pre-allocate enough memory for 10,000 rows of 50 fields (~22 chars each)
for (auto& vec : requestPackage.DataBody)
{
std::string body("This is a test string");
body.reserve(50 * 20); // Pre-allocate memory for each row
for (auto& item : vec)
{
body += item + " ";
}
bodys += body + "\n";
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ Pre-reserved Data merging took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
{
// Method 4: Using 'bodys = bodys + body + "\n"'
auto start_merge = std::chrono::high_resolution_clock::now();
std::string bodys("");
for (auto& vec : requestPackage.DataBody)
{
std::string body("This is a test string");
for (auto& item : vec)
{
body = body + item + " "; // Note the use of 'body = body + item'
}
bodys = bodys + body + "\n"; // Again, using 'bodys = bodys + body'
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ Data merging (bodys = bodys + body + \"\\n\") took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
std::cout << "\n----------------------------------------\n";
std::cout << "Data Merging Complete.\n";
std::cout << "----------------------------------------\n";
}
int main()
{
try
{
create_large_string();
}
catch (const std::exception& e)
{
std::cerr << "Caught exception: " << e.what() << std::endl;
}
std::cout << "\nProgram finished.\n";
return 0;
}