Troubleshooting TCP Communication Abnormalities in Backend Services

In our setup, backend services connect to the group's market gateway over TCP. Each connection first sends an authorization request, then sends heartbeat packets continuously to keep the connection alive. One day we received an alert indicating a service disconnection. The logs showed that the backend service kept sending heartbeat packets and got no response from the other party, yet the connection was never dropped.

Brief description of the scene

I was working overtime at the company to push a project forward when an alarm message popped up in the work group. At first glance I assumed it was the usual issue: a network timeout causing heartbeat failures and then a disconnection. The logs told a different story. The backend had sent the authorization (login) message but received no response, and heartbeats kept going out while the other party never replied with any heartbeat data. A closer look at the logs revealed the following key points:

  1. The authorization message received no response: most likely the other party's system was restarting, so the authorization message was not processed in time
  2. Heartbeats were sent even though authorization had not succeeded: the heartbeat-sending function had a logic flaw; it checked only the connection status and never checked the authorization status
  3. Had the connection been dropped, the reconnection mechanism would have been triggered and the authorization message resent
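The logic flaw in point 2 can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical, since the post does not show the service's actual code; the only point is that the heartbeat guard should test both states.

```python
def should_send_heartbeat(connected: bool, authorized: bool) -> bool:
    """Hypothetical guard for the heartbeat sender (names are assumptions).

    The flawed version checked only `connected`; heartbeats make sense only
    after the gateway has accepted the authorization request.
    """
    return connected and authorized


if __name__ == "__main__":
    # Connected, but the authorization reply never arrived: stay silent.
    print(should_send_heartbeat(connected=True, authorized=False))
    # Connected and authorized: heartbeats may be sent.
    print(should_send_heartbeat(connected=True, authorized=True))
```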

One urgent question remains: why was the connection never dropped? Answering it requires more detailed troubleshooting.

Analyzing network packets

tcpdump is a powerful packet capture tool. Capturing the packets exchanged between the backend service and the gateway and analyzing them gives a much more direct view of what is actually happening on the wire.
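The post does not record the exact command that was run. As a rough sketch, a capture limited to one peer and one TCP port could be assembled like this; the address 192.0.2.10 and port 9000 are placeholders, not the real gateway, and the helper name is made up for illustration.

```python
import shlex
from typing import List


def build_tcpdump_command(host: str, port: int, interface: str = "any") -> List[str]:
    """Build a tcpdump argument list for one peer and one TCP port."""
    return [
        "tcpdump",
        "-i", interface,  # capture interface; "any" is available on Linux
        "-nn",            # print numeric addresses and ports, no name lookups
        "host", host, "and", "tcp", "port", str(port),
    ]


if __name__ == "__main__":
    # Placeholder peer; substitute the real gateway address and port.
    print(shlex.join(build_tcpdump_command("192.0.2.10", 9000)))
```

The resulting list can be passed to `subprocess.run`, usually with root privileges, or printed and pasted into a shell.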

[Figure: tcpdump capture]

The capture shows that heartbeats are sent consistently. The other server never replies with any application data, but it does return an ACK for each heartbeat. Because every segment is acknowledged at the TCP level, the connection never disconnects on its own.
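Since TCP-level ACKs keep the connection looking healthy, one common remedy is for the application to track heartbeat replies itself and close the socket when they stop, which would then trigger reconnection. The sketch below is an assumption about how that could look, not the fix actually applied to the service; the class name, the interval, and the miss threshold are all hypothetical.

```python
import time


class HeartbeatWatchdog:
    """Tracks application-level replies, independent of TCP ACKs."""

    def __init__(self, interval_s: float, max_missed: int, now=time.monotonic):
        self.interval_s = interval_s    # heartbeat period in seconds
        self.max_missed = max_missed    # tolerated intervals without a reply
        self._now = now                 # injectable clock, handy for testing
        self._last_reply = now()

    def on_reply(self) -> None:
        # Call whenever a heartbeat reply (application data) arrives.
        self._last_reply = self._now()

    def is_peer_dead(self) -> bool:
        # True once more than `max_missed` intervals pass with no reply.
        silence = self._now() - self._last_reply
        return silence > self.interval_s * self.max_missed


if __name__ == "__main__":
    fake_time = [0.0]
    watchdog = HeartbeatWatchdog(5.0, 3, now=lambda: fake_time[0])
    fake_time[0] = 16.0  # 16 s of silence exceeds 3 intervals of 5 s
    if watchdog.is_peer_dead():
        print("no heartbeat reply: close the socket and reconnect")
```

The heartbeat loop would call `is_peer_dead()` before each send and close the connection when it returns true.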

Common Flag Explanations

In the TCP protocol, PSH (Push) and ACK (Acknowledgment) are two important flags that control data delivery and acknowledgment. Their functions are as follows:


1. PSH (Push Flag)

  • Function: The PSH flag asks the receiver to push buffered data up to the application immediately instead of waiting for the buffer to fill. Once a segment with PSH set arrives, the receiver passes the data to the application as soon as possible rather than holding it in the operating system buffer.

  • Typical Scenarios

    • HTTP/HTTPS requests: When a client sends a request (such as GET /index.html), the segment carrying it has PSH set so that the server application receives the request and can respond right away
    • The SSH protocol: Each keyboard input triggers a PSH, ensuring that input characters are transmitted in real-time
    • Real-time communication: Low-latency scenarios such as video streams and online games may use PSH to reduce latency
  • Note:

    • PSH is not mandatory; the receiving party can choose to ignore this flag (but still needs to process the data normally)
    • The sender may not set the PSH, in which case the receiver will decide when to push data based on its own buffering strategy

2. ACK (Acknowledgment Flag)

  • Function: The ACK flag indicates that data has been received correctly. Each ACK carries an acknowledgment number, which is the sequence number of the next byte the receiver expects. It is the core mechanism for reliable transmission in TCP.

  • Working principle:

    • Each data segment the sender transmits carries a sequence number; the ACK the sender expects back is that sequence number plus the data length
    • Upon receiving data, the receiver replies with an ACK segment confirming what it has received
    • The sender retransmits data that remains unacknowledged, that is, when the corresponding ACK does not arrive before the retransmission timeout
  • Example

    • If the sender sends a data segment with sequence number 100~199, the expected ACK from the receiver should be 200
    • If the receiver gets only part of that range, for example bytes 100~149, it replies with ACK=150, which tells the sender that everything from byte 150 onward still needs to be sent again
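The two examples above follow from how a cumulative acknowledgment number is computed: it is the first byte that has not yet arrived in order. A small sketch of that rule follows; the function name is made up for illustration, and real TCP stacks also handle sequence-number wraparound and selective acknowledgments, which are ignored here.

```python
from typing import List, Tuple


def cumulative_ack(first_seq: int, received: List[Tuple[int, int]]) -> int:
    """Return the ACK number: the next in-order byte the receiver expects.

    `received` holds inclusive (start, end) byte ranges that have arrived.
    """
    expected = first_seq
    for start, end in sorted(received):
        if start > expected:
            break  # gap: bytes from `expected` onward are still missing
        expected = max(expected, end + 1)
    return expected


if __name__ == "__main__":
    print(cumulative_ack(100, [(100, 199)]))              # whole segment arrived
    print(cumulative_ack(100, [(100, 149)]))              # only the first half
    print(cumulative_ack(100, [(100, 149), (170, 199)]))  # hole at 150~169
```

Note that data received after a gap does not advance the acknowledgment number; the receiver keeps answering with the first missing byte.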

The combination of PSH and ACK

In TCP packets, PSH and ACK can appear simultaneously, commonly seen in the following scenarios:

  • HTTP request and response: When the client sends a POST request (with data), the segment has both PSH and ACK set (the ACK acknowledges what the client has received so far)

    Client → Server: ACK=1 → connection established (final step of the handshake)
    Client → Server: PSH, ACK=1, data → send request data
    Server → Client: PSH, ACK=data length+1 → return response
    
  • Commands sent after the SSH handshake: After the user enters a command, the client sends a data segment with PSH and ACK set so that the command is transmitted immediately and processed by the server


Other related flags

Flag  Name             Brief Description
SYN   Synchronization  Initiates a connection (three-way handshake)
FIN   Finish           Gracefully closes the connection
RST   Reset            Forcibly terminates the connection (abnormal situations)
URG   Urgent           Marks the urgent pointer as valid (rarely used)
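Each of these flags is a single bit in the flags byte of the TCP header, so a raw value from a capture can be decoded with simple bit tests. The sketch below covers the six classic flags discussed here; the helper name is made up for illustration. In tcpdump output, a PSH+ACK segment such as a heartbeat is printed as Flags [P.], where the dot stands for ACK.

```python
from typing import List

# Bit positions of the six classic flags in the TCP header flags byte.
TCP_FLAGS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
]


def decode_tcp_flags(flags_byte: int) -> List[str]:
    """Return the names of the flags set in `flags_byte`, lowest bit first."""
    return [name for bit, name in TCP_FLAGS if flags_byte & bit]


if __name__ == "__main__":
    print(decode_tcp_flags(0x18))  # a data segment such as a heartbeat
    print(decode_tcp_flags(0x10))  # a bare acknowledgment
    print(decode_tcp_flags(0x02))  # the first handshake segment
```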

Summary

  • PSH focuses on getting data to the application layer as quickly as possible, reducing latency
  • ACK focuses on reliable data transmission, so that lost data is detected and retransmitted

Together they balance the efficiency and reliability of the TCP protocol.
