Review on TCP performance over HSR
This review integrates several papers published in SIGCOMM, MobiCom, INFOCOM, etc.
Background
While high-speed rail (HSR) systems provide an efficient way of door-to-door transportation, they also pose unprecedented challenges for passengers' Internet service: throughput degrades severely in high-speed mobility scenarios. Researchers have therefore made efforts to measure the performance of mobile data networks on HSR lines crossing different geographic areas in order to pinpoint the throughput bottleneck, demystify TCP behavior in high-mobility environments and, most importantly, remedy the situation.
Methodology
Infrastructure-based measurements are unreliable 1. Several device-side tools have therefore been proposed, such as MobiPerf 2, MobiNet 3, and MobileInsight 4. MobiPerf is an open-source application for measuring throughput and latency on mobile platforms. Compared with MobiPerf, MobiNet is more specialized: besides network usage (i.e., cellular network performance), it captures RSS, cell tower information and GPS information. However, I cannot find an open-source copy, and the app has been removed from the market. While MobiNet is custom-made, MobileInsight is more general-purpose. It collects, analyzes and exploits runtime network information from operational cellular networks. To extract cellular operations from signaling messages without extra help, the app leverages a side channel inside smartphones: the control messages that regulate utility functions of radio access, service quality, etc. It has supported a lot of research on wireless networks. In the previously mentioned 1 and 4, a machine learning-based local analysis is proposed to predict link actions, which points to a powerful solution to the problems discussed later. Peng et al. are still maintaining MobileInsight.
It is worth describing the tools used by different works because they reflect different research emphases. Although network performance is an end-to-end concept at the transport layer, some researchers start their analysis from the cellular system because high speed has a negative impact on channel quality. Yuanjie Li et al. 5 and Gong et al. 6 are examples. Gong et al. acknowledge millimeter wave’s bandwidth edge over conventional LTE communication, discuss several challenges brought by high mobility and propose a method to compensate for the Doppler effect to improve the PHY rate, which applies only to the link between the cell tower and the UE.
With regard to experiments, researchers usually select a specific rail line to collect traces. For rigor, they collect a large amount of data, including throughput, packet loss and RTT, and conduct experiments to show how HSR influences these metrics and how well TCP adapts to high mobility. In particular, congestion control is widely discussed. After exposing the problems, researchers then propose their solutions. Although they start from different angles, they reach a consensus that handover is the biggest killer. Next, I will talk about how this notion takes shape in different works.
Analysis
In general, the LTE network on HSR shows several symptoms of deterioration, including packet loss and RTT spikes, which result in throughput variation.
Intuitively, increasing mobility causes issues such as Doppler frequency offset, fast multi-path fading, and inter-channel/inter-carrier interference, all of which are believed to degrade channel quality. Xiao et al. 3 plot RSRP diagrams to show that signal strength can be weak and fluctuating, and that there are even coverage holes. However, their results show that signal strength is between good and average most of the time, even though weak signal does lead to low throughput, suggesting that channel quality is not the main contributor to the worsened network condition. Similarly, Li et al. 7 quantitatively compare the impact of speed and find it rather minor.
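To get a feel for the size of the Doppler frequency offset mentioned above, here is a minimal sketch using the textbook relation f_d = (v / c) * f_c * cos(theta). The train speeds and the 1.8 GHz carrier are assumptions chosen for illustration, not values taken from the cited papers.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def doppler_shift_hz(speed_kmh, carrier_hz, angle_deg=0.0):
    """Doppler shift f_d = (v / c) * f_c * cos(theta).

    angle_deg is the angle between the direction of travel and the
    line to the cell tower; 0 degrees gives the maximum shift.
    """
    speed_m_s = speed_kmh / 3.6
    return speed_m_s / SPEED_OF_LIGHT_M_S * carrier_hz * math.cos(math.radians(angle_deg))


if __name__ == "__main__":
    carrier_hz = 1.8e9  # assumed LTE carrier frequency
    for speed_kmh in (100, 200, 300):  # assumed train speeds
        shift = doppler_shift_hz(speed_kmh, carrier_hz)
        print(f"{speed_kmh} km/h -> {shift:.0f} Hz")
```

Under these assumptions, the shift at 300 km/h is about 500 Hz, a few percent of LTE's 15 kHz subcarrier spacing. That is in line with the observation that speed alone is a minor factor.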
With that in mind, to show the degree of network degradation in different situations, most works present throughput traces labeled with handover moments. The traces imply that handover is the main factor: throughput fluctuates at a handover but remains relatively stable without one. HSR leads to frequent handovers and makes them more likely to fail. Since the connection is cut off temporarily during a handover, packets are either lost or delayed. Given that, Wang et al. conclude that handovers have both an instantaneous effect and a near effect from the perspective of congestion control (packet loss and retransmission), and that handovers can be divided into several types depending on their end states. Likewise, Xiao et al. 7 define different handover situations by network type, a classification that is no longer applicable now that LTE is widely deployed.
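The kind of handover-labeled trace analysis described above can be sketched as follows. This is only an illustration: the function name, the 2-second window and the synthetic trace are hypothetical and do not come from any of the cited measurement studies.

```python
from statistics import mean


def split_by_handover(samples, handover_times, window_s=2.0):
    """Split (timestamp_s, throughput_mbps) samples into two groups:
    those within window_s of any handover, and all the others."""
    near, far = [], []
    for t, mbps in samples:
        if any(abs(t - h) <= window_s for h in handover_times):
            near.append(mbps)
        else:
            far.append(mbps)
    return near, far


if __name__ == "__main__":
    # Synthetic 20-second trace (made up): 10 Mbps normally,
    # 2 Mbps within one second of each handover.
    handovers = [5, 14]
    trace = [
        (t, 2.0 if any(abs(t - h) <= 1 for h in handovers) else 10.0)
        for t in range(20)
    ]
    near, far = split_by_handover(trace, handovers)
    print("mean near handover:", mean(near))
    print("mean elsewhere:", mean(far))
```

On this made-up trace, the mean near handovers is 5.2 Mbps against 10.0 Mbps elsewhere. A real study would run the same comparison on measured traces with handover events logged on the device.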
Now that handover is well defined, aside from the general impact of handovers (e.g., throughput variation), Li et al. 8 measure flows of diverse sizes to investigate how handovers influence content delivery differently. According to the results, mice flows are troubled by sub-flow establishment during handovers, while elephant flows are constrained by how congestion control and scheduling react to handovers. From a lower-layer perspective, the researchers in 1 suggest that radio resource control is stateful and thus predictable. However, predicting such fine-grained actions under extreme mobility is challenging because handover messages are often lost and cannot be tracked.
Within the scope of 4G LTE, some solutions have been proposed to enhance TCP performance. Wang et al. 7 conduct a dual-method measurement to evaluate two popular TCP congestion control algorithms (i.e., CUBIC and BBR) and propose a supplementary congestion control rule to boost throughput. There are also studies that discuss MPTCP. Liu et al. suggest that MPTCP could bring benefits by providing a more reliable retransmission mechanism, while Li et al. 8 argue that MPTCP is advantageous only on relatively worse paths.
Summary
- Network performance on HSR suffers from throughput variation, yet a reasonable throughput is achievable.
- Channel quality is only a minor factor.
- Handover influences throughput in two ways, a sudden drop and a long recovery, because neither TCP’s bandwidth probing strategy nor its RTT estimation adapts to the dynamics in an agile way.
- Handover latency is somewhat predictable.
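To illustrate why RTT estimation is slow to follow a handover, the sketch below runs the standard retransmission timeout estimator from RFC 6298 over a made-up RTT series. The RTT values and the 200 ms minimum RTO are assumptions (RFC 6298 itself recommends a 1-second minimum), and the clock-granularity term of the RFC is ignored for simplicity.

```python
def rto_trace(rtt_samples_s, alpha=1 / 8, beta=1 / 4, k=4, min_rto_s=0.2):
    """Return the RTO computed after each RTT sample, following RFC 6298.

    SRTT and RTTVAR are exponentially weighted averages, so a short
    RTT spike keeps influencing the RTO for many samples afterwards.
    """
    srtt = rttvar = None
    rtos = []
    for r in rtt_samples_s:
        if srtt is None:
            # First measurement: initialize both estimators.
            srtt, rttvar = r, r / 2
        else:
            # RTTVAR must be updated before SRTT (RFC 6298, section 2.3).
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
            srtt = (1 - alpha) * srtt + alpha * r
        rtos.append(max(min_rto_s, srtt + k * rttvar))
    return rtos


if __name__ == "__main__":
    # Made-up series: 50 ms RTT, a two-sample 500 ms spike, then 50 ms again.
    rtts = [0.05] * 10 + [0.5] * 2 + [0.05] * 10
    for i, rto in enumerate(rto_trace(rtts)):
        print(i, f"rtt={rtts[i]:.2f}s rto={rto:.3f}s")
```

In this synthetic run the RTO sits at the 200 ms floor just before the spike, which is shorter than the 500 ms spike itself, so a timeout would fire even without loss. Several samples after the spike the RTO is still well above its pre-spike value, which matches the "sudden drop and long recovery" pattern summarized above.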
Discussion
Prior works expose TCP behaviors in dynamic settings and further investigate the impact of handover, which had long been neglected. Though there are works like 9 that present a way to lower handover latency, there is still a missing piece in the area of HSR network performance analysis.
Throughput variation is even more harmful to data-hungry applications like video playback, because the near effect of congestion control pushes ABR to select a bitrate lower than the available capacity, which hurts user-perceived quality. Wang et al. show that users are discouraged from watching videos on HSR because of QoE concerns, so that only 5% of traffic is video segments. But that was 4 years ago, before the boom of social media like YouTube, Instagram and TikTok. Even in an era where everyone is watching short videos, the QoE of HSR passengers attracts little attention from academia.
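The way a short throughput dip drags ABR below the available capacity can be sketched with a generic rate-based selection rule. This is not the algorithm of any cited paper or any specific player: the harmonic-mean estimator, the 0.9 safety factor and the bitrate ladder are all assumptions for illustration.

```python
def harmonic_mean(values):
    """Harmonic mean, which is dominated by the smallest samples."""
    return len(values) / sum(1.0 / v for v in values)


def pick_bitrate(throughput_history_mbps, ladder_mbps, window=5, safety=0.9):
    """Pick the highest bitrate not exceeding a conservative throughput estimate."""
    recent = throughput_history_mbps[-window:]
    estimate = safety * harmonic_mean(recent)
    candidates = [b for b in sorted(ladder_mbps) if b <= estimate]
    # Fall back to the lowest rung if even that exceeds the estimate.
    return candidates[-1] if candidates else min(ladder_mbps)


if __name__ == "__main__":
    ladder = [0.5, 1.0, 2.5, 5.0, 8.0]  # hypothetical bitrate ladder in Mbps
    steady = [10.0, 10.0, 10.0, 10.0, 10.0]
    one_dip = [10.0, 10.0, 0.5, 10.0, 10.0]  # one sample hit by a handover
    print("steady:", pick_bitrate(steady, ladder))
    print("after one dip:", pick_bitrate(one_dip, ladder))
```

With these made-up numbers the rule picks 8.0 Mbps on the steady history but only 1.0 Mbps after a single dipped sample, even though four of the five samples still show 10 Mbps.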
While TCP fails to adjust to dynamic settings on its own, recent video delivery frameworks like DASH are known for their adaptability to different network conditions. Is reconciling the two viable (for the sake of video subscribers)?
Thoughts
Contrary to the majority, I think we are taking it for granted that the bottleneck of content delivery always lies in LTE, just because it “seems” weak compared with the intimidating capacity of data centers.
The works mentioned above plausibly hypothesize that the LTE network chokes end-to-end throughput in all situations, based on observations in papers like 10 and 11. Those papers only indicate, from flow-size results, that it is more costly to achieve a higher rate in wireless networks, and their methodology is borrowed from wireline networks. Therefore, 11 only supports this hypothesis in theory, through a conjecture that neglects differences in applications, protocols and network types. In other words, things could be quite different in reality, where the other parts of the end-to-end connection should be considered. Furthermore, the papers were published in 2014, indicating that the research was conducted before LTE phones such as the Samsung Galaxy S5 became widespread. To my knowledge, LTE is much faster than the old-school HSPA network. At the least, it is better to revisit LTE in the wild before blaming the cellular network for everything.
This one-sided belief made us ignore what happens in between: is it possible that we are under-utilizing the bandwidth along the path? To answer this question, BurstTracker 12 uses MobileInsight to analyze the buffer queue inside cell towers and find out whether a UE is busy with video playback the whole time or is sometimes idle. The results suggest that LTE is not always the bottleneck and that TCP middleboxes are to blame. Middleboxes are typically deployed as transparent split-TCP proxies that terminate the TCP connection, process or shape the traffic, and then deliver the data to the client (I omit much of the detail here). In short, slow-start restart (SSR) could force the application into a negative feedback loop, and video streaming trapped in it can be degraded significantly.
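Assuming SSR here refers to TCP slow-start restart (the behavior described in RFC 5681, where a connection that has been idle for longer than the RTO falls back to a small restart window), the cost for chunked video can be sketched with a toy model. The chunk size, window cap and initial window below are made-up numbers, and the model ignores loss and pacing.

```python
def rtts_to_send(chunk_segments, start_cwnd, max_cwnd):
    """Count round trips needed to deliver one chunk when the congestion
    window doubles every RTT (slow start) until it reaches max_cwnd."""
    cwnd, sent, rtts = start_cwnd, 0, 0
    while sent < chunk_segments:
        sent += cwnd
        cwnd = min(2 * cwnd, max_cwnd)
        rtts += 1
    return rtts


if __name__ == "__main__":
    chunk = 1000        # segments per video chunk (assumed)
    path_window = 400   # window the path can sustain (assumed)
    initial_window = 10 # restart window after an idle gap (assumed)

    # With restart: every chunk after an idle gap begins from the small window.
    print("with restart:", rtts_to_send(chunk, initial_window, path_window))
    # Without restart: the window from the previous chunk is kept.
    print("without restart:", rtts_to_send(chunk, path_window, path_window))
```

In this toy setting each chunk takes 7 round trips with restart and 3 without. A slower chunk download lowers the throughput the player measures, which can push it to request smaller chunks with idle gaps in between, which in turn triggers another restart.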
On the other hand, this gives us a hint that video delivery on HSR might be saved, because we have excess radio resources to utilize, although whether this holds on HSR networks remains unclear given the worsened channel quality. Besides, since we have more control over application-layer protocols, is cross-layer design a way to remedy video delivery on HSR?
1. Yuan, Zengwen, Yuanjie Li, Chunyi Peng, Songwu Lu, Haotian Deng, Zhaowei Tan, and Taqi Raza. “A machine learning based approach to mobile network analysis.” In 2018 27th International Conference on Computer Communication and Networks (ICCCN), pp. 1-9. IEEE, 2018.
2. MobiPerf, M-Lab (measurementlab.net). Its website is down (Jul. 2021), but the source code is available, with no more upgrades since 2014.
3. Xiao, Q., K. Xu, D. Wang, L. Li, and Y. Zhong. “TCP performance over mobile networks in high-speed mobility scenarios.” In Proc. IEEE ICNP, pp. 281–286. Oct. 2014.
4. Li, Yuanjie, Chunyi Peng, Zengwen Yuan, Jiayao Li, Haotian Deng, and Tao Wang. “Mobileinsight: Extracting and analyzing cellular network information on smartphones.” In ACM MobiCom, 2016.
5. Li, Yuanjie, Qianru Li, Zhehui Zhang, Ghufran Baig, Lili Qiu, and Songwu Lu. “Beyond 5g: Reliable extreme mobility management.” In Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, pp. 344-358. 2020.
6. Gong, Zijun, Cheng Li, Fan Jiang, and Moe Z. Win. “Data-Aided Doppler Compensation for High-Speed Railway Communications Over mmWave Bands.” IEEE Transactions on Wireless Communications 20, no. 1 (2020): 520-534.
7. Li, Li, Ke Xu, Dan Wang, Chunyi Peng, Kai Zheng, Rashid Mijumbi, and Qingyang Xiao. “A longitudinal measurement study of tcp performance and behavior in 3g/4g networks over high speed rails.” IEEE/ACM transactions on networking 25, no. 4 (2017): 2195-2208.
8. Li, Li, Ke Xu, Tong Li, Kai Zheng, Chunyi Peng, Dan Wang, Xiangxiang Wang, Meng Shen, and Rashid Mijumbi. “A measurement study on multi-path TCP with multiple cellular carriers on high speed rails.” In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, pp. 161-175. 2018.
9. Xu, Yifei, and Songwu Lu. “Device-Based LTE Latency Reduction at the Application Layer.” (2021).
10. Xu, Y., Z. Wang, W. K. Leong, et al. “An end-to-end measurement study of modern cellular data networks.” In International Conference on Passive and Active Network Measurement, pp. 34-45. Springer, Cham, 2014.
11. Zhang, Y., Å. Arvidsson, M. Siekkinen, et al. “Understanding HTTP flow rates in cellular networks.” In 2014 IFIP Networking Conference, pp. 1-8. IEEE, 2014.
12. Balasingam, Arjun, Manu Bansal, Rakesh Misra, Kanthi Nagaraj, Rahul Tandra, Sachin Katti, and Aaron Schulman. “Detecting if LTE is the Bottleneck with BurstTracker.” In The 25th Annual International Conference on Mobile Computing and Networking, pp. 1-15. 2019.