Validating Equal-Cost Multi-Path (ECMP) at Scale
As a best-path network packet routing strategy, equal-cost multi-path (ECMP) has become very popular. Starting primarily as a tool for traffic engineering (TE), it has attained a prominent place in our data center networks. Network engineers are specifically constructing network topologies to optimize ECMP. It is definitely an important tool in the toolbox.
Now the question is: how effective is this tool? Where is it reliable, and what cautions should be taken when choosing an ECMP device or vendor? Despite its immense popularity, some crucial operational aspects of ECMP need engineers’ attention, and need to be measured, before an ECMP network can be designed effectively.
ECMP typically enables network load sharing, not load balancing. In general, load balancing means distributing the traffic load (throughput) evenly across multiple equal-cost paths, whereas load sharing means distributing traffic across the active paths without guaranteeing that each path carries an equal share. Load sharing in ECMP is generally done by hashing a few selected fields of a packet. So, in normal 5-tuple ECMP (Figure-1), each path may not get an equal share of high-bandwidth and low-bandwidth flows.
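The effect of hashing on load sharing can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `pick_path` and `load_per_path` helpers are hypothetical, SHA-256 stands in for whatever hash a real device uses, and the flow sizes are made-up values. It shows that hashing spreads flows across paths without any regard to how many bytes each flow carries.

```python
import hashlib

def pick_path(flow, num_paths):
    """Map a 5-tuple to a path index with a stable hash.

    SHA-256 is only a stand-in; real devices use their own hash functions.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_paths

# (src_ip, dst_ip, protocol, src_port, dst_port) -> bytes carried.
# All addresses, ports, and sizes are invented for this example.
FLOWS = {
    ("10.0.0.1", "10.0.1.1", 6, 40001, 445): 5_000_000_000,  # large file copy
    ("10.0.0.2", "10.0.1.1", 6, 40002, 445): 4_000_000_000,  # large file copy
    ("10.0.0.3", "10.0.1.2", 6, 40003, 179): 20_000,         # keep-alive traffic
    ("10.0.0.4", "10.0.1.2", 6, 40004, 179): 20_000,         # keep-alive traffic
    ("10.0.0.5", "10.0.1.3", 17, 40005, 3784): 10_000,       # heartbeat traffic
    ("10.0.0.6", "10.0.1.3", 17, 40006, 3784): 10_000,       # heartbeat traffic
}

def load_per_path(flows, num_paths):
    """Return (flows per path, bytes per path) for a hash-based selection."""
    flow_count = [0] * num_paths
    byte_count = [0] * num_paths
    for flow, size in flows.items():
        path = pick_path(flow, num_paths)
        flow_count[path] += 1
        byte_count[path] += size
    return flow_count, byte_count

flow_count, byte_count = load_per_path(FLOWS, 4)
for path in range(4):
    print(f"path {path}: {flow_count[path]} flows, {byte_count[path]:,} bytes")
```

Because the hash never looks at flow size, the byte totals per path can differ widely even when the flow counts look similar.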
To improve on plain load sharing, the industry has come up with a few options. One of them is to perform load balancing based on the size of the flow. Some flows are larger than others; for example, a file transfer over the Common Internet File System (CIFS) will be larger than any keep-alive or heartbeat flow. Taking flow size into the ECMP calculation has its own challenges, though. A flow is assigned to a link at the start of the flow, and at that point its eventual size is usually unknown, so this strategy may not work equally well all the time. Consider the Transmission Control Protocol (TCP). TCP starts with a handshake to establish the session. At this stage only small packets are exchanged, and we cannot determine the size of the future flow. Later, the packet sizes might grow depending on the application: it may be a large file download, or it may be a small background flow. So, balancing based on size may not be optimal.
Another option is to move existing flows as they grow. This may work where some disruption, such as a small amount of packet loss or reordering, is acceptable. A further challenge with this approach is that very frequent switching (as the flow size changes over time) adds to the disruption.
So, before deploying ECMP in the network, it’s critical to clarify your ECMP requirements and expectations. For enterprises, a thorough measurement of how the network performs through an ECMP device is key to success. This is where the Ixia test solution is the top choice for network designers across the world. I will talk about it towards the end of this document.
The Resiliency Problem
The next set of challenges with ECMP is related to resiliency. Since resiliency is not built into ECMP, highly resilient networks need extra caution. For a flow, a path is generally chosen from a bucket of ECMP paths. This bucket can grow (i.e., more paths become available) or shrink (i.e., active paths are deactivated due to a link shutdown or a change of routes). A network engineer needs to be very clear on how the system responds to these changes. Let us understand the issue with a few pictures.
Whenever the number of available paths changes, the path selection hashing algorithm may give a different result for the same flow. This may cause rebalancing of all the flows, even for the flows whose current path is still active.
In Figure-2, we show a steady-state flow distribution where there are three flows and four available paths. The Red, Green, and Blue flows are load balanced through Spine-1, Spine-2, and Spine-3. Now the path to Spine-1 is lost, and the ECMP bucket size is reduced from four to three (Figure-3).
This may cause the hash engine to give different results for the same flows and, in turn, can force all the flows to switch. For example (Figure-4) the green flow may need to switch from Spine-2 (in Figure-2) to Spine-3 (Figure-4). This will happen even if Spine-2 is healthy and alive.
Further rebalancing will be triggered when Spine-1 comes back or when additional new paths are added. Interestingly, when Spine-1 comes back (Figure-5), the red flow might not be restored to its position in Figure-2 (i.e., the red flow may not fall back to Spine-1). Every time such a switch happens, it may affect traffic convergence time, introduce slight packet reordering, and cause retransmissions or drops for the application.
So, the resiliency problem is this: any change in the set of available paths, whether favorable (i.e., more active paths become available) or unfavorable (i.e., some active paths go away), will trigger rebalancing. This rebalancing can affect all the flows, including the healthy ones.
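The rebalancing effect described above can be reproduced with a small Python sketch. It assumes the simplest possible selection scheme, a flow hash taken modulo the number of active paths, which is not necessarily what any given device does. The `flow_hash` and `assign` helpers, the fourth spine name, and the 10,000 synthetic flow IDs are all hypothetical.

```python
import hashlib

def flow_hash(flow_id):
    """Stable hash of a flow identifier (SHA-256 is a stand-in for a device hash)."""
    digest = hashlib.sha256(str(flow_id).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def assign(flow_ids, paths):
    """Modulo-N selection: index into the list of currently active paths."""
    return {f: paths[flow_hash(f) % len(paths)] for f in flow_ids}

flow_ids = range(10_000)                                   # synthetic flows
all_paths = ["Spine-1", "Spine-2", "Spine-3", "Spine-4"]   # bucket of four
surviving = ["Spine-2", "Spine-3", "Spine-4"]              # Spine-1 is lost

before = assign(flow_ids, all_paths)
after = assign(flow_ids, surviving)

# Flows that were never on the failed path, yet still changed path.
healthy = [f for f in flow_ids if before[f] != "Spine-1"]
moved_healthy = [f for f in healthy if before[f] != after[f]]

print(f"healthy flows: {len(healthy)}")
print(f"healthy flows that moved anyway: {len(moved_healthy)} "
      f"({len(moved_healthy) / len(healthy):.0%})")
```

Under these assumptions, a large share of the flows whose original path is still up get remapped simply because the modulus changed from four to three. This is the behavior a resiliency test should look for.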
To address these resiliency problems, vendors are coming up with proprietary solutions. The performance and scalability of these solutions differ from vendor to vendor. For example, the number of ECMP paths that can be handled in a resilient manner may vary between vendors. Depending on the implementation, some vendors will be able to handle the removal of existing paths in a resilient manner. Adding new paths to an existing set of active paths, however, may trigger rebalancing of all the flows, and some ECMP solutions may not be able to handle this in a resilient manner. I’ll save the discussion of the mechanisms behind these for another blog.
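To make "resilient" concrete without describing any vendor's proprietary design, here is a generic sketch of one commonly described idea: hash each flow into a fixed-size bucket table, and on a path failure rewrite only the buckets that pointed at the failed path. The `BucketTable` class, its bucket count of 64, and the spine names are assumptions chosen for illustration.

```python
import hashlib

def flow_hash(flow_id):
    """Stable hash of a flow identifier (SHA-256 is a stand-in for a device hash)."""
    digest = hashlib.sha256(str(flow_id).encode()).digest()
    return int.from_bytes(digest[:8], "big")

class BucketTable:
    """Fixed-size indirection table: flow hash -> bucket -> path."""

    def __init__(self, paths, num_buckets=64):
        # Fill the buckets round-robin so each path owns an equal share.
        self.buckets = [paths[i % len(paths)] for i in range(num_buckets)]

    def lookup(self, flow_id):
        # The table size never changes, so a flow always lands in the same bucket.
        return self.buckets[flow_hash(flow_id) % len(self.buckets)]

    def remove_path(self, dead):
        """Repoint only the buckets that used the failed path."""
        survivors = sorted(set(self.buckets) - {dead})
        if not survivors:
            raise ValueError("no surviving paths")
        replaced = 0
        for i, path in enumerate(self.buckets):
            if path == dead:
                self.buckets[i] = survivors[replaced % len(survivors)]
                replaced += 1

table = BucketTable(["Spine-1", "Spine-2", "Spine-3", "Spine-4"])
flows = range(10_000)  # synthetic flow IDs
before = {f: table.lookup(f) for f in flows}
table.remove_path("Spine-1")
after = {f: table.lookup(f) for f in flows}

moved = [f for f in flows if before[f] != after[f]]
print(f"flows moved: {len(moved)}; all were on Spine-1: "
      f"{all(before[f] == 'Spine-1' for f in moved)}")
```

In this sketch only flows that were on the failed path move. Adding a path back is the harder case: some buckets must be taken from healthy paths to give the new path a share, so some healthy flows move. That fits the observation above that path addition is not always handled resiliently.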
IxNetwork Simplifies ECMP Validation at Scale
Given these unknowns, it’s extremely important to measure the behavior and resiliency of ECMP devices before deploying them in live networks. Ixia’s IxNetwork provides a comprehensive test solution for measuring the resiliency and efficiency of ECMP. IxNetwork can create flows of various sizes and patterns, generate link failures in the topology, and measure how quickly ECMP converges.
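As a rough illustration of what a convergence measurement involves, the sketch below derives a per-flow convergence time from frame loss. It is a generic calculation, not IxNetwork's API. It assumes each flow is sent at a constant frame rate and that all loss during the test was caused by the path change. The function name and the counter values are hypothetical.

```python
def loss_derived_convergence_ms(tx_frames, rx_frames, tx_rate_fps):
    """Estimate convergence time as lost frames divided by the transmit rate.

    Assumes a constant transmit rate and that every lost frame was lost
    while traffic was re-converging after the path change.
    """
    if tx_rate_fps <= 0:
        raise ValueError("transmit rate must be positive")
    lost = tx_frames - rx_frames
    if lost < 0:
        raise ValueError("received more frames than were sent")
    return lost / tx_rate_fps * 1000.0

# Hypothetical per-flow counters: (frames sent, frames received, frames/second).
counters = {
    "red":   (1_000_000, 998_500, 100_000),
    "green": (1_000_000, 999_800, 100_000),
    "blue":  (1_000_000, 1_000_000, 100_000),  # never disturbed
}

for name, (tx, rx, rate) in counters.items():
    ms = loss_derived_convergence_ms(tx, rx, rate)
    print(f"{name}: {tx - rx} frames lost, about {ms:.1f} ms of disruption")
```

A flow that reports zero loss across a path change was either untouched or switched without drops. Separating those two cases requires per-path flow tracking in addition to loss counters.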
Using its flow tracking feature at line rate, IxNetwork can measure how ECMP performs in load balancing versus load sharing. IxNetwork’s BGP control plane can be used to dynamically advertise more available paths to a destination or withdraw some of the currently active paths. With graphical topology configuration, dynamic routing protocol emulation (iBGP, eBGP), and line-rate traffic generation and measurement (latency, throughput, jitter, convergence, etc.), IxNetwork is the best tool you’ll find for measuring ECMP performance and effectiveness. You can do this testing at scale and with ease using IxNetwork.