As organizations continue to transition to high-speed network infrastructure, engineers responsible for network monitoring find themselves in need of an answer to the question, "How does one go about monitoring 100G network links?" The unique characteristics of high-speed networks make this a serious technical challenge. 

How to Monitor 100G Network Links

For starters, most analysis tools do not support native 100G ingest rates. The majority of monitoring tools available today are designed to ingest 1G or 10G traffic at most. If network engineers need to monitor 100% of the traffic on a 100G segment with those tools, there is only one option to get the job done:

Load Balance the 100G traffic across multiple 10G tools.  

In general, one can take two approaches when looking to load balance a network link: round robin load balancing or flow-aware load balancing. Let's take a look at each option...

Round Robin Load Balancing

To picture round robin load balancing, imagine that you are a card dealer at a crowded blackjack table. You, the dealer, have all the cards and you need to distribute them to the players. Each card represents a single packet on a 100G link, and you, the dealer, are the load balancer, distributing those cards/packets to the players, who represent 10 different 10G ports.

[Figure: flow-aware load balancing vs. 100G load balancing]

With round robin load balancing, you sequentially distribute the packets to each 10G port, in order. Much like dealing cards, this approach only focuses on spreading information across a group of recipients in an orderly fashion. You present the first packet to the first port, the second packet to the second port, the third to the third, and then start all over again after you send the 10th packet to the 10th port.  
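The dealing pattern described above can be sketched in a few lines of Python. This is only an illustration: the packet names and the `round_robin` helper are made up for the example, and a real load balancer does this in hardware.

```python
from itertools import cycle

NUM_PORTS = 10  # ten 10G monitoring ports, as in the card-dealer analogy


def round_robin(packets, num_ports=NUM_PORTS):
    """Deal packets to ports 1..num_ports in strict rotation."""
    ports = cycle(range(1, num_ports + 1))
    return [(packet, next(ports)) for packet in packets]


# Twelve hypothetical packets: the 11th wraps back around to port 1.
assignments = round_robin([f"pkt{i}" for i in range(1, 13)])
for packet, port in assignments:
    print(packet, "-> port", port)
```

Note that the port depends only on a packet's position in the stream, never on its contents.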

This approach may be adequate for certain load balancing situations, but it is an ill-advised strategy if, say, your downstream analytics need to reassemble sessions. With round robin, packets that belong to the same connection end up spread across multiple monitoring ports. Each monitoring tool would have incomplete information, and the engineers would be left trying to correlate individual packets that are now spread across numerous monitoring tools. The good news is there is an intelligent way to load balance your 100G network link.
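To see the scattering problem concretely, here is a small sketch that assumes three made-up flows (A, B and C) whose packets arrive interleaved, and records which ports each flow touches under round robin.

```python
from collections import defaultdict

NUM_PORTS = 10

# Hypothetical interleaved traffic: each entry names the flow a packet belongs to.
packet_flows = ["A", "B", "C"] * 4  # 12 packets from three flows

ports_seen = defaultdict(set)
for index, flow in enumerate(packet_flows):
    port = index % NUM_PORTS + 1  # round robin: arrival position decides the port
    ports_seen[flow].add(port)

for flow, ports in sorted(ports_seen.items()):
    print(f"flow {flow} landed on ports {sorted(ports)}")
```

Each flow has only four packets here, yet every one of them ends up on four different ports, so no single tool sees a whole session.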

Flow-Aware Load Balancing

(Not your momma's round robin...)

What is flow-aware load balancing? Flow-aware load balancing lets engineers determine precisely which port each packet is sent to, and makes sure all related packets are sent to that same port. Instead of sequentially distributing packets across multiple monitoring ports (e.g., round robin), you create a system that ensures all related packets end up at the same port. A device that preserves flow affinity will often extract the 5-tuple (source IP address, destination IP address, source port, destination port, and protocol) of every incoming 100G packet and use it as the parameter that decides which port that packet should be sent to. Let's look at the details of how this really works...
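As a rough illustration of what "generating a 5-tuple" involves, the sketch below pulls the five fields out of a raw IPv4 packet that carries TCP or UDP. The `FiveTuple` type, the function name and the sample packet are all invented for the example. It also assumes the IPv4 header sits at byte 0, with no Ethernet, VLAN or MPLS headers in front of it.

```python
import socket
import struct
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int  # IP protocol number, e.g. 6 = TCP, 17 = UDP
    src_port: int
    dst_port: int


def five_tuple_from_ipv4(packet: bytes) -> FiveTuple:
    """Extract the 5-tuple from a raw IPv4 packet carrying TCP or UDP."""
    ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    protocol = packet[9]
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    # TCP and UDP both begin with source port, then destination port.
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    return FiveTuple(src_ip, dst_ip, protocol, src_port, dst_port)


# Hand-built sample: 20-byte IPv4 header (protocol 17 = UDP) plus 8-byte UDP header.
ip_header = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
    socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"),
)
udp_header = struct.pack("!HHHH", 5000, 53, 8, 0)
sample = five_tuple_from_ipv4(ip_header + udp_header)
print(sample)
```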

In the diagram below, you will see a single packet's 5-tuple information. Every time a packet enters the device, the load balancer creates a hash of the packet's 5-tuple. (To be clear, hashes can be created from any information in any of the headers, though a 5-tuple is commonly used.) Once the hash is created, a modulo operation is applied to it: typically the hash is divided by the number of monitoring ports, and the remainder (called the "modulo of the hash") tells the load balancer which port the packet should be sent to. This value is not unique to one flow, since many flows share each port, but it is always the same for a given hash. Think of this process as creating a key-value pair. The key is the modulo of the hash, and the value is the 10G monitoring port that the packet will be sent to.
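The hash-then-modulo step might look like the following minimal sketch. CRC32 is used purely as a convenient, repeatable stand-in; the post does not say which hash function a given load balancer uses, and the helper names and the sample flow are assumptions.

```python
import zlib

NUM_PORTS = 10  # ten 10G monitoring ports


def flow_hash(five_tuple):
    """Hash the 5-tuple fields into a 32-bit integer (CRC32 as a stand-in)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    return zlib.crc32(key)


def egress_port(five_tuple, num_ports=NUM_PORTS):
    """Map a flow to a monitoring port, numbered 1..num_ports."""
    return flow_hash(five_tuple) % num_ports + 1


packet = ("10.0.0.1", "10.0.0.2", 6, 40000, 443)  # made-up TCP flow
print("hash:", flow_hash(packet), "-> egress port", egress_port(packet))
```

Python's built-in `hash()` is avoided here because it is salted per process for strings, so the same flow could map to a different port on each run.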

[Figure: 100G load balancing, modulo of the hash]

Based on the modulo of the hash, the above packet will be sent to monitoring port 1 (egress port 1). As subsequent packets enter the load balancer, the same mathematics are applied to the hash generated from every incoming packet. Packets that have the same hash (that is, any packets with the same 5-tuple) will generate the exact same numeric value (the same "modulo of the hash") and will be sent to the same 10G monitoring port. This routes all related packets to the same monitoring port, preserving flow affinity. The diagram below helps illustrate this approach: each individual color represents a distinct value derived from the modulo of the hash function...

[Figure: 100G load balancing by modulo of the hash]
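Flow affinity can be checked with a short sketch: interleave packets from a few flows, send each through the same hash-and-modulo function, and confirm that every flow only ever touches one port. The flows and the CRC32-based `egress_port` helper are illustrative assumptions, not a description of any particular product.

```python
import zlib
from collections import defaultdict

NUM_PORTS = 10


def egress_port(five_tuple, num_ports=NUM_PORTS):
    """Same 5-tuple in, same port out (CRC32 used as a stand-in hash)."""
    key = "|".join(map(str, five_tuple)).encode()
    return zlib.crc32(key) % num_ports + 1


# Three made-up flows: (src IP, dst IP, protocol, src port, dst port)
flows = {
    "A": ("10.0.0.1", "10.0.0.2", 6, 40000, 443),
    "B": ("10.0.0.3", "10.0.0.4", 6, 40001, 80),
    "C": ("10.0.0.5", "10.0.0.6", 17, 5000, 53),
}

packet_flows = ["A", "B", "C"] * 4  # 12 interleaved packets

ports_seen = defaultdict(set)
for name in packet_flows:
    ports_seen[name].add(egress_port(flows[name]))

for name, ports in sorted(ports_seen.items()):
    print(f"flow {name} -> ports {sorted(ports)}")
```

Two different flows may still land on the same port, which is expected with only ten ports. What matters is that no flow is split across ports.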

Flow-aware load balancing is the right approach for monitoring a 100G network segment. You can effectively send all related traffic to the same 10G port for analysis, plain and simple. You are now set to use 10G tools on 100G traffic! 

But, now for the catch...

Flow-aware load balancing is the way to go; however, this is more easily said than done. As mentioned earlier in this post, 100G networks have unique characteristics that make them difficult to monitor. We have addressed the first dilemma of 'using 10G tools on 100G traffic' through flow-aware load balancing. However, there is one basic assumption that is made when using flow-aware load balancing: that your load balancer can access the 5-tuple information for every packet.

Using standard ASICs to identify headers in modern networks is problematic. The reason for this? Modern network traffic comes with lots of extra baggage: multiple MPLS headers, multiple VLAN tags, packets with overlays/underlays and encapsulations of every sort. In general, large volumes of PITA (Pain In The Ass) packets. Standard ASICs have a fixed set of headers they can identify, and associated parse graphs they can accommodate. Any packet that falls outside the definition of those headers and those graphs will fall through the cracks...
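The failure mode can be illustrated with a deliberately tiny toy parser that understands only Ethernet followed directly by IPv4. Real ASICs recognize far more headers than this, so treat it as a sketch of the principle: once a packet steps outside the built-in parse graph, the parser cannot find the IP header, and without it there is no 5-tuple to hash. The function name and sample frames are made up.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100  # 802.1Q VLAN tag


def fixed_parser_find_ipv4(frame: bytes):
    """Toy fixed-function parser: understands only Ethernet -> IPv4."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_IPV4:
        return 14  # byte offset where the IPv4 header starts
    return None  # unknown header: no 5-tuple, so no flow hash


macs = bytes(12)     # placeholder destination + source MAC addresses
payload = bytes(20)  # placeholder standing in for the IPv4 header

plain = macs + struct.pack("!H", ETHERTYPE_IPV4) + payload
# Same packet with one VLAN tag (VLAN ID 100) inserted before the EtherType.
tagged = macs + struct.pack("!HHH", ETHERTYPE_VLAN, 100, ETHERTYPE_IPV4) + payload

print("plain :", fixed_parser_find_ipv4(plain))
print("tagged:", fixed_parser_find_ipv4(tagged))
```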

The good news is that this is not an insurmountable obstacle - it can be addressed by using a load balancer built NOT on standard ASICs, but on a fully programmable chipset. Major strides have been made in the development of advanced chipsets in recent years, and we can now develop flow-aware load balancers that are built on programmable chipsets, such as the MantisNet RFP-NG. Once again, let's dive into the details to understand why this is important.

Programmable Chipsets vs Static ASICs

As mentioned above, standard ASICs come with a fixed set of packet headers and parse graphs they can identify, and their forwarding logic is cemented into the chip itself. This does not lend itself well to load balancing the complex packet traffic found within 100G networks. Bottom line: packets will be missed. Once again, this comes down to being able to create a hash for every packet.

Programmable chipsets allow the forwarding logic to reside in a program (code) that the network operator loads onto the chip; it is not hard-wired into the silicon. This turns the chip inside your 100G load balancer into a fully programmable interface. These devices can now parse any type of packet found within a network, and load balance it based on any type of hashing. If the set of headers and the parse graph need updating, the operator has all the tools he or she needs to load new logic onto the chip and immediately start parsing and load balancing those complex packets.
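As a loose software analogy, a programmable parser behaves like a lookup table of header handlers that the operator can extend at runtime. The Python below is only a sketch of that idea. On real hardware the logic would be written in a data-plane language (P4 is one well-known example), and the class, method and handler names here are all hypothetical.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100  # 802.1Q VLAN tag


class ProgrammableParser:
    """Toy parse graph: EtherType -> handler, extendable after deployment."""

    def __init__(self):
        self.handlers = {}

    def load(self, ethertype, handler):
        """Operator loads new parsing logic for a header type."""
        self.handlers[ethertype] = handler

    def find_ipv4(self, frame: bytes):
        """Return the IPv4 header offset, or None if a header is unknown."""
        (ethertype,) = struct.unpack("!H", frame[12:14])
        offset = 14
        while ethertype != ETHERTYPE_IPV4:
            handler = self.handlers.get(ethertype)
            if handler is None:
                return None
            ethertype, offset = handler(frame, offset)
        return offset


def skip_vlan_tag(frame, offset):
    """Step over a 4-byte VLAN tag: 2 bytes TCI, then the inner EtherType."""
    (inner,) = struct.unpack("!H", frame[offset + 2:offset + 4])
    return inner, offset + 4


macs, payload = bytes(12), bytes(20)  # placeholder MACs and IPv4 header
tagged = macs + struct.pack("!HHH", ETHERTYPE_VLAN, 100, ETHERTYPE_IPV4) + payload
double = macs + struct.pack(
    "!HHHHH", ETHERTYPE_VLAN, 100, ETHERTYPE_VLAN, 200, ETHERTYPE_IPV4
) + payload

parser = ProgrammableParser()
before = parser.find_ipv4(tagged)           # VLAN not yet understood
parser.load(ETHERTYPE_VLAN, skip_vlan_tag)  # operator loads new logic
after = parser.find_ipv4(tagged)
stacked = parser.find_ipv4(double)
print(before, after, stacked)
```

Once the VLAN handler is loaded, the same parser finds the IPv4 header behind one tag or two stacked tags, and the 5-tuple hash can be computed again.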

And that's it, folks! With flow-aware load balancing and programmable chipsets, 100G monitoring is no longer scary...

Embrace it. Love it. Program it. 

Written by MantisNet