Reliable Layer 4 Stateful TCP Testing Improves QoE

Oct. 26, 2020

Quality of experience (QoE) is a concept that has been around for a long time, but it has gained a lot of attention in recent years, particularly with the exponential growth of internet applications and services that require greater bandwidth. There may be many definitions for QoE, but in short, it is a way of measuring how happy customers are with a product or service contracted from a vendor or service provider.

Why has QoE gained so much momentum in recent years? One answer may be that by measuring QoE, companies collect real and honest data directly from the revenue source, the customer or end-user. Thus, QoE has become a priority with companies looking for ways to improve it. Significant research efforts have been spent on establishing metrics to improve QoE. Some factors include cost, reliability, user-friendliness, and security.

Nowadays, network operators and service providers validate their network performance using the ITU-T Y.1564 standard, but even when these performance tests produce good results, some customers may still experience poor throughput. The most common reason is that the end-to-end connection of user applications is not properly provisioned at the stateful TCP level.

TCP is a connection-oriented protocol that verifies data arrives correctly at the receiving end, establishing a reliable network connection. Nonetheless, the actual throughput at the transport layer can be degraded compared to the throughput at the Ethernet/IP layers simply because the window size or buffer capacity of the network elements is not set properly. RFC 6349 defines the methodology for testing throughput at the transport layer (Layer 4).
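To put rough numbers on the window-size effect, the sketch below (with illustrative values, not figures from any particular network) shows how the TCP window caps the throughput a single connection can achieve over a given round-trip time.

```python
# Illustrative sketch: the TCP window size caps single-connection throughput.
# Upper bound: throughput <= window_size / round_trip_time

def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput, in bits per second."""
    return (window_bytes * 8) / rtt_seconds

# Example: a legacy 64 KB window over a path with a 20 ms round-trip time
window = 64 * 1024  # bytes
rtt = 0.020         # seconds
print(f"{max_tcp_throughput_bps(window, rtt) / 1e6:.1f} Mbit/s")  # ~26.2 Mbit/s
```

In other words, no matter how much Ethernet/IP bandwidth has been provisioned, such a connection cannot exceed roughly 26 Mbit/s until the window (or buffer) is increased.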

In this article, we will explain how test and measurement equipment helps users run RFC 6349 tests to provide indicators of QoE. Alongside these Layer 4 performance tests, users can continue to run the already well-known ITU-T Y.1564 tests at the lower layers. Using these two sets of tests as testing methodologies helps service providers improve QoE.

ITU-T Y.1564 & RFC 6349 – SAMpling network PERFormance

The difference between quality of service (QoS) and QoE is clearly seen in that important videoconference meeting when the speaker’s voice starts cutting out or the video pixelates at the most pivotal or interesting part of the discussion. Although the video stream most likely meets the QoS service-level agreement (SLA) minimum values, the user would qualify the experience as an “epic fail.”

An example of good QoS and poor QoE can be seen in best effort internet access services over shared media (e.g., DOCSIS, PON), due to oversubscription. If the network is provisioned for “normal” conditions and suddenly the nation is forced to stay home and live remotely, the network may see more traffic than it can handle.

Service providers have strict SLAs with their customers. An SLA is an agreement stating that the delivered service must meet the specifications set when the service was acquired. Therefore, service providers are adopting combined test methodologies, given that testing QoS alone is not enough to meet their SLAs.

Using both ITU-T Y.1564 and RFC 6349 as test methodologies is a layered approach, much like the TCP/IP model itself, ensuring quick, easy, and efficient Ethernet service turn-up and faster isolation of issues when troubleshooting.

With the belief that time is money, companies and service providers agree that services must be turned up quickly and efficiently, and the ITU-T Y.1564 test methodology aligns with the requirements of modern Ethernet services. This standard enables complete validation of all SLA parameters within a single test to ensure optimized QoS.

The ITU-T Y.1564 standard, also known as the Service Activation Test Methodology (SAM), enables service providers to assess the proper configuration and performance of an Ethernet service prior to customer delivery. This is done by inputting the Service Acceptance Criteria (SAC), which are normally based on a subset of the user’s SLA. This sets simple Pass/Fail criteria and simplifies the results. The test methodology has two main components: the Service Configuration Test and the Service Performance Test.
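As a rough illustration of how the Pass/Fail evaluation against SAC values works, the sketch below compares hypothetical measured KPIs against hypothetical thresholds; the parameter names and numbers are placeholders for illustration, not values taken from the standard or from any SLA.

```python
# Minimal sketch of a Y.1564-style Pass/Fail check against Service Acceptance
# Criteria (SAC). All threshold and measured values are hypothetical examples.

SAC = {
    "information_rate_mbps": 500.0,   # minimum committed information rate
    "frame_loss_ratio": 1e-4,         # maximum acceptable frame loss ratio
    "frame_transfer_delay_ms": 10.0,  # maximum latency
    "frame_delay_variation_ms": 2.0,  # maximum jitter
}

measured = {
    "information_rate_mbps": 501.2,
    "frame_loss_ratio": 2e-5,
    "frame_transfer_delay_ms": 6.3,
    "frame_delay_variation_ms": 0.8,
}

def evaluate(sac: dict, results: dict) -> dict:
    """Return a per-parameter PASS/FAIL verdict."""
    verdicts = {}
    for key, threshold in sac.items():
        value = results[key]
        # The information rate must meet or exceed the SAC value;
        # loss, delay, and jitter must not exceed theirs.
        ok = value >= threshold if key == "information_rate_mbps" else value <= threshold
        verdicts[key] = "PASS" if ok else "FAIL"
    return verdicts

print(evaluate(SAC, measured))  # all four parameters pass in this example
```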

RFC 6349, on the other hand, specifies a practical methodology for measuring end-to-end TCP throughput in a managed IP network. Its main goal is to provide a better indication of the user experience.

Issued by the Internet Engineering Task Force (IETF), RFC 6349 provides a repeatable test method for TCP throughput analysis with systematic processes, metrics, and guidelines to optimize the network and server performance.

The RFC 6349 framework also specifies the TCP and IP parameters, such as the TCP window size and the path MTU, that should be tuned to optimize TCP throughput.

This Layer 4 test methodology specifies the following three tests:

  1. Path MTU detection (per RFC 4821) – Verifies the network maximum transmission unit (MTU) with active TCP segment size testing to ensure that the TCP payload remains unfragmented.
  2. Baseline round-trip delay and bandwidth – Helps predict the optimal TCP window size by automatically calculating the TCP bandwidth-delay product (BDP). The BDP is the product of a data link’s capacity (in bps) and its round-trip time; in other words, it indicates the maximum number of simultaneous bits in transit between the TX and the RX (see Figure 1 and the calculation sketch after this list).
  3. Single and multiple TCP connection throughput tests – Used to verify TCP window size predictions that enable automated “full pipe” TCP testing.
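Following step 2 of the list above, a minimal sketch of the BDP calculation, using illustrative link and RTT values, might look like this; it also shows why step 3 can require multiple parallel connections when the per-connection window is limited.

```python
# Sketch of the bandwidth-delay product (BDP) from the baseline measurements.
# Link rate and RTT values are illustrative only.
import math

def bdp_bytes(bottleneck_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the path full (BDP expressed in bytes)."""
    return bottleneck_bps * rtt_seconds / 8

link_bps = 1e9        # 1-Gbit/s path
baseline_rtt = 0.025  # 25 ms baseline round-trip time

bdp = bdp_bytes(link_bps, baseline_rtt)
print(f"BDP ≈ {bdp / 1024:.0f} KB")  # ~3052 KB must be in flight to fill the pipe

# If each connection's window is capped, "full pipe" testing needs several
# parallel connections (this is what the multiple-connection test verifies).
window_limit = 256 * 1024  # hypothetical 256 KB per-connection window
print(f"Connections needed: {math.ceil(bdp / window_limit)}")  # 12
```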
RFC 6349 provides the following metrics used to diagnose the causes of suboptimal TCP performance (a short computation sketch follows the list):
  • TCP Transfer time – Measures the time it takes to transfer a block of data across simultaneous TCP connections.
  • TCP Efficiency – Calculates the percentage of bytes that did not have to be retransmitted. TCP retransmissions are normal in any TCP/IP network, but the raw retransmission count by itself does not reveal whether performance is affected; this is where TCP Efficiency comes in handy.
  • Buffer Delay Percentage – Represents the increase in round-trip time (RTT) during a TCP throughput test from the baseline RTT, which is the RTT inherent to the network path without congestion.
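Following the formulas defined in RFC 6349, these three metrics can be derived as in the sketch below; the sample measurements are illustrative only.

```python
# Sketch of the three RFC 6349 metrics. Sample measurements are illustrative.

def tcp_efficiency_pct(tx_bytes: int, retx_bytes: int) -> float:
    """Percentage of transmitted bytes that did not have to be retransmitted."""
    return (tx_bytes - retx_bytes) / tx_bytes * 100

def buffer_delay_pct(avg_rtt_ms: float, baseline_rtt_ms: float) -> float:
    """Increase in RTT during the transfer relative to the uncongested baseline RTT."""
    return (avg_rtt_ms - baseline_rtt_ms) / baseline_rtt_ms * 100

def tcp_transfer_time_ratio(actual_s: float, ideal_s: float) -> float:
    """Actual time to transfer the data block vs. the ideal (loss-free) transfer time."""
    return actual_s / ideal_s

print(f"TCP Efficiency: {tcp_efficiency_pct(1_000_000_000, 2_500_000):.2f} %")  # 99.75 %
print(f"Buffer Delay: {buffer_delay_pct(31.0, 25.0):.1f} %")                    # 24.0 %
print(f"Transfer Time Ratio: {tcp_transfer_time_ratio(11.3, 10.0):.2f}")        # 1.13
```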

Today, as most of the population works, learns, entertains, and communicates remotely, service providers have rushed to retrofit their networks to cope with usage surge demands. This perhaps was once thought of as the worst-case scenario but is now the new reality.

New broadband internet services, particularly those surpassing 500 Mbps and emerging ones touting gigabit speeds, have brought new challenges to ISPs, such as bottlenecks in the access network that slow down and sometimes disrupt subscribers’ services. These bottlenecks can also be caused by legacy home or enterprise equipment, like 802.11n wireless routers that cannot keep up with the contracted internet service speed.

Tech-savvy customers may turn to off-the-shelf equipment like smart devices or high-end laptops to check their internet service speeds. The problem with these CPU-based devices is that they lack the processing power and capabilities to test gigabit speeds at full line rate. For instance, speed tests for line rates above 500 Mbps become unreliable even when performed by high-end laptops and are virtually useless at the 1-Gbps mark or above, as shown in Figure 2.

In their frustration, customers will undoubtedly blame the service provider since they do not realize their own hardware could be the root cause of the issue.

Service turn-up testing is an important first step to avoid these issues and the resulting frustration once the service is delivered. More and more service providers have started implementing combined network testing procedures that include ITU-T Y.1564, RFC 6349, and internet speed testing.

Leading test and measurement equipment vendors have integrated both simple download/upload testing to FTP/HTTP servers and RFC 6349 test methods into their products, responding to the increasing need for Layer 4 testing that provides insight into what the QoE will be. This stateful TCP/IP (Layer 4) testing builds customer trust in the provider.
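As a simple illustration of what a basic download test against an HTTP server involves, a software sketch might look like the following. The URL is a hypothetical placeholder, and, per the caveat above, a CPU-based measurement like this loses accuracy at gigabit rates, which is exactly where hardware-based testers come in.

```python
# Minimal sketch of a basic HTTP download throughput check (goodput measured
# at the application layer). The URL is a hypothetical placeholder; software
# measurements like this become unreliable at gigabit line rates.
import time
import urllib.request

TEST_URL = "http://speedtest.example.net/100MB.bin"  # hypothetical test file

def http_download_mbps(url: str, chunk_size: int = 1 << 20) -> float:
    """Download the file and return the measured goodput in Mbit/s."""
    total_bytes = 0
    start = time.monotonic()
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.monotonic() - start
    return total_bytes * 8 / elapsed / 1e6

if __name__ == "__main__":
    print(f"Download goodput: {http_download_mbps(TEST_URL):.1f} Mbit/s")
```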

The solutions include, but are not limited to, ITU-T Y.1564, RFC 6349 for Layer 4 testing applications, and internet speed testing, using centralized TCP servers and/or test heads for 1G, 10G, 40G, and 100G applications distributed in the service provider network (see Figure 3). The test head equipment enables on-demand testing and acts as a server, making field testing easier. Results are uploaded in-band from portable units to the test heads and, most importantly, FPGA-based hardware ensures reliable and repeatable tests.

Conclusion

It is fair to say that the QoE concept lives in the Application Layer (Layer 7) of the OSI model, but it is strongly related to Layers 1-4: to improve QoE, service providers need to make sure all the dots are properly connected in the first four layers. By running ITU-T Y.1564 tests, network operators and service providers can verify that everything is performing as needed in the first three layers.

Repeatable and reliable test methodologies, combined with specialized hardware, are needed to perform Layer 4 stateful TCP throughput performance tests. The resulting metrics provide a good indication of QoE because they give service providers a quick snapshot of the service as it is being delivered and experienced by the customer.

Every day, more and more network operators and service providers are implementing combined network testing methodologies that include ITU-T Y.1564 for Layers 1-3, RFC 6349 for Layer 4, and internet speed testing, with the intention of improving the QoE perceived by their customers.

References

  1. RFC 6349, Framework for TCP Throughput Testing, IETF, https://tools.ietf.org/html/rfc6349

Erick Davila joined VeEX in March 2019 as a product manager for the Ethernet & Transport lineup. He is responsible for products like the NET-BOX, TX300s Platform, and RXT-1200 Platform, among others. Before joining VeEX, he worked at Amazon Web Services, Coriant and Sunrise Telecom in several roles including technical and pre-sales support, training, network design and engineering. He has more than 15 years of telecommunications experience, focused on both legacy and next generation transport and Ethernet technologies.
