In one of our C++ projects, we used a custom communication protocol built around a two-dimensional array layout. When processing large volumes of data, the protocol iterated through the arrays and serialized them to generate logs. This was inefficient enough that, under heavy load, the business departments reported noticeable lag and stuttering in the system.
Problem Identification
When troubleshooting the issue, we first performed a performance analysis of the system and discovered that CPU utilization increased significantly when processing large amounts of data, and system response times became longer. By analyzing the system logs, we identified numerous serialization operations, which were inefficient when handling two-dimensional arrays, leading to a decline in system performance.
The `pstack` tool captured thread stacks for the service, pinpointing that the log threads spent most of their time on string concatenation.
This is today’s focus: different concatenation methods differ greatly in efficiency. Historically, the code used the `+` operator, which creates a temporary object on every concatenation and is very inefficient. You know it’s bad, but you don’t know how bad it is.
Demo Verification
Based on the project code, we extracted the business logic into a simple demo to verify the efficiency of string concatenation. We compiled and ran it in Release mode with the VS2022 compiler on Windows and the GCC 8.5 compiler on Linux, and compared the results.
Key Point Explanation
The project used Method 4. Before looking at the test data, consider for yourself which method should be fastest and which slowest; the results surprised me.
- Method 1 (`+=` concatenation): Directly concatenates each field into a string using `+=`.
- Method 2 (`std::ostringstream` concatenation): Uses a stream (`std::ostringstream`) to concatenate fields, which holds up well when dealing with large amounts of data.
- Method 3 (pre-allocated memory + `+=` concatenation): Pre-allocates enough memory for the string using `reserve`, reducing the overhead of memory reallocation and improving performance.
- Method 4 (`bodys = bodys + body + "\n"`): Creates a new temporary string object on every concatenation, so repeated memory allocation and copying makes performance collapse on large-scale concatenation.
The results show that the project had inadvertently chosen the least efficient method of all.
Furthermore, comparing the optimization behavior of the two platforms and compilers: Microsoft’s Visual Studio consistently performs well on string optimization here, while GCC’s optimization in this area is somewhat weaker.
When running the code on different machines, direct comparison between the two datasets is meaningless; instead, we can compare the differences between the various concatenation methods.
Test Results
Windows platform, VS2022 compiler:

```
----------------------------------------
Data Generation Time: 0.054 seconds.
----------------------------------------
----------------------------------------
Data Merging Performance:
----------------------------------------
+ Data merging (+=) took: 0.053 seconds.
+ ostringstream Data merging took: 0.054 seconds.
+ Pre-reserved Data merging took: 0.045 seconds.
+ Data merging (bodys = bodys + body + "\n") took: 16.108 seconds.
----------------------------------------
Data Merging Complete.
----------------------------------------
Program finished.
```
Linux platform, GCC 8.5 compiler:

```
----------------------------------------
Data Generation Time: 0.108 seconds.
----------------------------------------
----------------------------------------
Data Merging Performance:
----------------------------------------
+ Data merging (+=) took: 0.100 seconds.
+ ostringstream Data merging took: 0.083 seconds.
+ Pre-reserved Data merging took: 0.057 seconds.
+ Data merging (bodys = bodys + body + "\n") took: 29.298 seconds.
----------------------------------------
Data Merging Complete.
----------------------------------------
Program finished.
```
Complete Code
#include <iostream>
#include <string>
#include <vector>
#include <random>
#include <chrono>
#include <sstream>
#include <iomanip>
typedef std::vector<std::string> DataRow;
typedef std::vector<DataRow> DataGroup;
struct ResponsePackage
{
std::string ErrorInfo;
DataRow Head;
std::string ClientId;
std::string UUID;
std::string MsgID;
std::string SessionID;
std::string ExtraInfo1;
std::string ExtraInfo2;
DataGroup DataBody;
};
// Generate specified length of random string
std::string generateRandomString(size_t length)
{
const char charset[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
const size_t max_index = sizeof(charset) - 1;
std::string random_string;
random_string.reserve(length);
std::random_device rd;
std::mt19937 generator(rd());
std::uniform_int_distribution<size_t> distribution(0, max_index - 1); // max_index is the trailing '\0'; never index it
for (size_t i = 0; i < length; ++i)
{
random_string += charset[distribution(generator)];
}
return random_string;
}
void create_large_string()
{
// Example request package with 50 fields
ResponsePackage requestPackage;
requestPackage.Head = {
"Field1", "Field2", "Field3", "Field4", "Field5",
"Field6", "Field7", "Field8", "Field9", "Field10",
"Field11", "Field12", "Field13", "Field14", "Field15",
"Field16", "Field17", "Field18", "Field19", "Field20",
"Field21", "Field22", "Field23", "Field24", "Field25",
"Field26", "Field27", "Field28", "Field29", "Field30",
"Field31", "Field32", "Field33", "Field34", "Field35",
"Field36", "Field37", "Field38", "Field39", "Field40",
"Field41", "Field42", "Field43", "Field44", "Field45",
"Field46", "Field47", "Field48", "Field49", "Field50"
};
requestPackage.ClientId = "ClientID";
requestPackage.UUID = "UUID";
requestPackage.MsgID = "MsgID";
requestPackage.SessionID = "SessionID";
requestPackage.ExtraInfo1 = "ExtraInfo1";
requestPackage.ExtraInfo2 = "ExtraInfo2";
// Start timing for data generation
auto start_gen = std::chrono::high_resolution_clock::now();
// Generate 10,000 rows of data, each with 50 fields
for (size_t i = 0; i < 10000; ++i)
{
DataRow dataRow(50, "This is a test string");
requestPackage.DataBody.push_back(dataRow);
}
// End timing for data generation
auto end_gen = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_gen = end_gen - start_gen;
// Display result generation time
std::cout << "\n----------------------------------------\n";
std::cout << "Data Generation Time: " << std::fixed << std::setprecision(3) << duration_gen.count() << " seconds.\n";
std::cout << "----------------------------------------\n";
// Data merging using different methods
std::cout << "\n----------------------------------------\n";
std::cout << "Data Merging Performance:\n";
std::cout << "----------------------------------------\n";
{
// Method 1: Using '+=' string concatenation
auto start_merge = std::chrono::high_resolution_clock::now();
std::string bodys;
for (auto& vec : requestPackage.DataBody)
{
std::string body("This is a test string");
for (auto& item : vec)
{
body += item + " ";
}
bodys += body + "\n";
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ Data merging (+=) took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
{
// Method 2: Using ostringstream
auto start_merge = std::chrono::high_resolution_clock::now();
std::ostringstream bodys;
for (auto& vec : requestPackage.DataBody)
{
std::ostringstream body;
body << "This is a test string";
for (auto& item : vec)
{
body << item << " ";
}
bodys << body.str() << "\n";
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ ostringstream Data merging took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
{
// Method 3: Pre-allocated memory
auto start_merge = std::chrono::high_resolution_clock::now();
std::string bodys;
bodys.reserve(10000 * 50 * 22); // Pre-allocate roughly the final size: 10,000 rows x 50 fields x ~22 bytes each
for (auto& vec : requestPackage.DataBody)
{
std::string body("This is a test string");
body.reserve(50 * 23); // Pre-allocate slightly more than one row's worth of fields
for (auto& item : vec)
{
body += item + " ";
}
bodys += body + "\n";
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ Pre-reserved Data merging took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
{
// Method 4: Using 'bodys = bodys + body + "\n"'
auto start_merge = std::chrono::high_resolution_clock::now();
std::string bodys("");
for (auto& vec : requestPackage.DataBody)
{
std::string body("This is a test string");
for (auto& item : vec)
{
body = body + item + " "; // Note the use of 'body = body + item'
}
bodys = bodys + body + "\n"; // Again, using 'bodys = bodys + body'
}
auto end_merge = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> duration_merge = end_merge - start_merge;
std::cout << "+ Data merging (bodys = bodys + body + \"\\n\") took: " << std::fixed << std::setprecision(3) << duration_merge.count() << " seconds.\n";
}
std::cout << "\n----------------------------------------\n";
std::cout << "Data Merging Complete.\n";
std::cout << "----------------------------------------\n";
}
int main()
{
try
{
create_large_string();
}
catch (const std::exception& e)
{
std::cerr << "Caught exception: " << e.what() << std::endl;
}
std::cout << "\nProgram finished.\n";
return 0;
}