What is eBPF, and why is it so important?

The Extended Berkeley Packet Filter (eBPF) is a relatively new and powerful set of capabilities embedded in the Linux kernel. It was first released in 2014 (with Linux 3.18), and its adoption is accelerating for very good reason.

[Figure: eBPF kernel-space and user-space use-cases]

The access that eBPF provides enables a variety of important use-cases in modern cloud-native environments. These span application and network performance monitoring, service mesh, load balancing, continuous discovery, dynamic topology, and anomaly detection, and they serve development, systems engineering, operations, cloud infrastructure, 5G / IoT, and cybersecurity applications. We discuss these in more detail below.

The eBPF capabilities enable a (privileged) user to run general-purpose code that introspects activities and processes down to the kernel level. Kernel-level events are the ground truth of all activity within a system. eBPF code is event driven: programs are attached to specific events of interest (sometimes referred to as hook points) and executed whenever those events occur. These events, or hooks, typically include system calls, network connections, specific network packets, function calls, tracepoints, and probes (kprobes and uprobes), among others. In addition to capturing all manner of information about the system when an event occurs, eBPF programs can store results in data structures (maps), perform Boolean operations, search, insert and delete key-value pairs in tables, generate pseudo-random numbers, and flag events, which makes the resulting metadata / telemetry extremely powerful for follow-on analytics. What makes eBPF even more powerful is that, when these hooks are triggered, the programs can call helper functions to interact with the host system and perform a wide variety of tasks.
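The event-driven model described above can be illustrated with a small conceptual sketch. This is not real eBPF code and does not use any eBPF library: `HookRegistry`, `count_by_pid`, and the hook names are hypothetical stand-ins, and a plain Python dict plays the role of a map. It only shows the shape of the idea: a program is attached to a hook, runs when that hook fires, and records what it saw in a key-value table.

```python
from collections import defaultdict


class HookRegistry:
    """Toy stand-in for kernel hook points (hypothetical, not an eBPF API)."""

    def __init__(self):
        self._programs = defaultdict(list)

    def attach(self, hook, program):
        # Register a program to run whenever the named hook fires.
        self._programs[hook].append(program)

    def fire(self, hook, event):
        # Run every program attached to this hook; hooks with no programs do nothing.
        for program in self._programs[hook]:
            program(event)


# A plain dict plays the role of an eBPF map (key-value table).
syscall_counts = {}


def count_by_pid(event):
    # Insert or update a key-value pair: process id -> number of events seen.
    pid = event["pid"]
    syscall_counts[pid] = syscall_counts.get(pid, 0) + 1


registry = HookRegistry()
registry.attach("sys_enter_openat", count_by_pid)

for pid in (101, 101, 202):
    registry.fire("sys_enter_openat", {"pid": pid})
registry.fire("sys_enter_read", {"pid": 101})  # nothing attached here, so nothing is counted

print(syscall_counts)  # {101: 2, 202: 1}
```

In a real deployment the attached program runs inside the kernel and a user-space tool reads the map; here both sides live in one process purely for illustration.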

Because eBPF is fairly new, it requires a compatible Linux kernel version. Newer versions of the Linux kernel (4.4 or above) support eBPF, so we expect its adoption to keep accelerating. With eBPF we can observe and measure nearly any kind of system event, and because eBPF programs run in the kernel, they do not require a separate kernel module to be loaded. Consequently, eBPF has enabled a new generation of software to introspect and extend the behavior of complex systems and to support a variety of new functions, from service orchestration to improved performance monitoring and continuous real-time security capabilities.
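As a rough illustration of the compatibility point, the following sketch compares a kernel release string against the 4.4 threshold mentioned above. The helper names `parse_kernel_release` and `meets_minimum` are hypothetical, and a version comparison is only a first approximation: individual eBPF features arrived in different kernel releases, and distributions may backport or disable them, so treat this as a sanity check rather than a feature probe.

```python
import platform
import re

MIN_KERNEL = (4, 4)  # threshold taken from the text above


def parse_kernel_release(release):
    """Return (major, minor) from a string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    # Tuples compare element by element, so (4, 19) >= (4, 4) is True.
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    if platform.system() == "Linux":
        release = platform.release()
        print(release, "meets minimum:", meets_minimum(release))
    else:
        print("not running on Linux; nothing to check")
```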

 

eBPF Use-Cases for cloud-native observability and cybersecurity

Cloud-native micro-services environments can seem chaotic. Resources are virtual, ephemeral and dynamic: functions, IP addresses, applications, hosts, processes, containers, network connections, and data flows are continually provisioned, deprovisioned, moved, and (re)instantiated. Consequently, monitoring and managing those systems with legacy tools is practically impossible, since you may not be able to track resources that are dynamic, or you may only have access to aggregate data that is a layer or two away from the source event activity. The deep, continuous, real-time observability into containers, services and the operating system that eBPF supports gives you unprecedented visibility into network events as they happen.

[Figure: eBPF bee image]

Source: https://ebpf.io/what-is-ebpf/

As networks operate with increasing speed and scale, the challenges of monitoring, maintaining and securing micro-services environments are best addressed using eBPF functions. Combining eBPF functions with additional layers of software (in-node processing, helper / worker programs, and event-driven analytic tools) enables new levels of deep introspection, anomaly detection and remediation. While eBPF technology is (merely) an extension of kernel functionality, it is foundational to providing the level of deep visibility and control needed by development, operations and security teams.

Here are some representative use-cases:

Operations / Service Assurance: encompasses a broad set of reliability, availability and service-level monitoring tasks. The goal is not only to monitor the performance, health and state of (sub)systems and applications, but also to optimize efficiency (resource utilization, bandwidth, capacity, …). Doing so requires the maximum possible degree of operational awareness. Specifically, eBPF is critical to answering questions such as: Are the orchestration, service mesh, applications and network infrastructure providing the expected bandwidth and latency? Are the underlying services all in place and fully operational?

 

Cybersecurity: situational awareness and a deep understanding of the operation, status, and integrity of all elements within a system are fundamental to security. Knowing what types of data are transiting the networks, what services are being used, what traffic is being encrypted, what forms of encryption are in use, and having visibility into both encrypted and unencrypted data flows are just some examples of the improved situational awareness that eBPF capabilities enable. This kind of deep systems visibility is critical to understanding the current status and vulnerability of systems. Access to that security data and telemetry supports use-cases and applications that range from identifying rogue or malicious functions, malicious access, intrusions, infiltration and exfiltration, to anomaly and malware detection, whitelisting/blacklisting, SIEM, and firewall systems. Read our 5G & Cloud Native Security white paper for further details.

 

Performance Monitoring / Troubleshooting: access to all requests, responses, data flows, data flow metrics, and application activity, down to the lowest (kernel) level and across logical and physical resources (networks, processes, containers, nodes, clusters…) to the infrastructure layer, is critical for accurate performance monitoring and management. This requires deep visibility, time-series telemetry (timestamps and metadata), and access to the controls used by system, network and application monitoring and measurement tools, as well as by load balancers and service mesh systems. The aim is to gain highly detailed insight into cloud-native systems and to enable control over how the different parts of these complex systems interact, communicate and share data.

 

Security and robustness of eBPF:

Significant care and attention to architectural detail have been taken to ensure that eBPF capabilities are as secure, reliable and robust as possible.

First and foremost, eBPF programs run in privileged mode. Because eBPF programs have extensive access to the kernel resources of the systems they run on, they require the highest level of security authorization. Consequently, system administration and security best practices dictate that eBPF programs granted this level of (privileged mode) access be subject to special scrutiny, with the maximum authentication, access, and security controls applied.

Additionally, before loading and execution, eBPF programs run through extensive validation and optimization processes. These include the following steps:

  • A verification process ensures that the eBPF program is safe to run. During verification, the verifier simulates the program's execution and evaluates all possible execution paths to validate such things as:
    • does it run to completion without any looping (no unbounded loops)?
    • are all the pointer references valid?
    • does the program fit within the size constraints of the system?
    • does the program terminate properly?
    • does the program contain an exit condition which is guaranteed to become true?
  • Next, eBPF programs are run through a just-in-time (JIT) compilation process, which translates the program into machine-specific instructions and optimizes its execution speed. This makes eBPF programs run as efficiently as natively compiled kernel code.

Only after these checks and optimizations are performed is the eBPF program loaded, waiting for the hook event to occur.
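To make the verification checks above more concrete, here is a deliberately simplified sketch of the idea. It is not the kernel verifier and does not reproduce its algorithm: the instruction format, the `MAX_INSNS` limit, and the `verify` function are all assumptions made for illustration. The toy model rejects a program if it is too large, if any path leaves the program without reaching an exit, if the control flow contains any back-edge (a stricter rule than the "no unbounded loops" check listed above), or if some instruction can never be reached.

```python
MAX_INSNS = 4096  # illustrative size limit for this toy model


def successors(program, i):
    """Indices that instruction i can transfer control to."""
    op, target = program[i]
    if op == "exit":
        return []
    if op == "jmp":  # unconditional jump
        return [target]
    if op == "jcond":  # conditional jump: fall through or take the branch
        return [i + 1, target]
    return [i + 1]  # ordinary instruction falls through


def verify(program):
    """Return (ok, reason) for a toy program given as a list of (op, target)."""
    n = len(program)
    if n == 0 or n > MAX_INSNS:
        return False, "program size out of bounds"

    # Every control-flow edge must land inside the program. Together with the
    # loop check below, this guarantees every path ends at an "exit".
    for i in range(n):
        for s in successors(program, i):
            if not 0 <= s < n:
                return False, f"instruction {i} jumps or falls outside the program"

    # Iterative depth-first search; reaching a node that is still on the
    # current path (GREY) means the control flow contains a loop.
    WHITE, GREY, BLACK = 0, 1, 2
    state = [WHITE] * n
    state[0] = GREY
    stack = [(0, iter(successors(program, 0)))]
    while stack:
        node, pending = stack[-1]
        for s in pending:
            if state[s] == GREY:
                return False, f"back-edge {node} -> {s} (loop)"
            if state[s] == WHITE:
                state[s] = GREY
                stack.append((s, iter(successors(program, s))))
                break
        else:
            state[node] = BLACK
            stack.pop()

    if WHITE in state:
        return False, "unreachable instruction"
    return True, "ok"


good = [("op", None), ("jcond", 3), ("op", None), ("exit", None)]
looping = [("op", None), ("jcond", 0), ("exit", None)]
no_exit = [("op", None), ("op", None)]

print(verify(good))     # (True, 'ok')
print(verify(looping))  # (False, 'back-edge 1 -> 0 (loop)')
print(verify(no_exit))  # (False, 'instruction 1 jumps or falls outside the program')
```

The real verifier also tracks register and pointer state along each path, which is how it answers the pointer-validity question in the list; that part is omitted here.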

 

The vendor agnostic nature of eBPF:

One of the most important considerations is that eBPF technology makes observability and monitoring vendor agnostic. While eBPF programs require the appropriate compatibility, authentication and permissions to be loaded and run, they do not require the permission of the applications they are monitoring. This is especially important for complex environments such as 5G systems, where the architecture, interfaces and communications (protocols, behaviors, responses) between the underlying software components (container-based network functions or applications) are standardized by the 3GPP standards.

What this means is that operators are free to choose the best available components (applications, containerized network functions…), and eBPF-based observability tools can then be used to monitor those systems in a vendor-agnostic way. Consequently, operators are no longer locked into one vendor for the entire technology stack simply because multi-vendor implementations could not be monitored and managed due to vendor-unique implementations. Read our Vendor Agnostic white paper for further details.

 

The MantisNet CVF

eBPF is just one (very important) technology component of the MantisNet CVF platform. The platform combines eBPF with (GoLang driven) in-node processing capabilities, integration with container orchestration systems (Kubernetes and others), and an open, extensible architecture to provide a unique, cutting-edge, high-definition observability solution.

Additionally, the MantisNet CVF is easy to deploy and produces easy-to-ingest (serialized) metadata with unique flow identifiers that can be used to correlate events across the most complex infrastructure. The resulting metadata can be ingested or used as an enriched data source in analytic workflows and data stores (MongoDB, Redis, Elastic…), as well as with any number of open-source tools for visualization (Lens…) and dashboards (Grafana…) to implement comprehensive monitoring. With it we can query and aggregate events, actions and connection metrics to gain deep, real-time and continuous insight into operations, behaviors, and inter-dependencies.
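The idea of correlating events by flow identifier can be sketched as follows. The article does not describe how the MantisNet CVF actually derives its identifiers, so everything here is hypothetical: the `flow_id` function, the choice of hashing a direction-normalized 5-tuple, and the sample events are assumptions used only to show how a shared identifier lets events observed on different nodes be grouped together.

```python
import hashlib
from collections import defaultdict


def flow_id(src_ip, src_port, dst_ip, dst_port, proto):
    """Direction-independent identifier for a 5-tuple (hypothetical scheme)."""
    # Sort the two endpoints so both directions of a connection share one id.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{proto}|{a[0]}:{a[1]}|{b[0]}:{b[1]}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]


# Made-up events, as two nodes might report them for the same connection.
events = [
    {"node": "node-a", "kind": "connect", "flow": ("10.0.0.5", 40312, "10.0.1.9", 443, "tcp")},
    {"node": "node-b", "kind": "accept", "flow": ("10.0.1.9", 443, "10.0.0.5", 40312, "tcp")},
    {"node": "node-a", "kind": "connect", "flow": ("10.0.0.5", 40400, "10.0.2.7", 53, "udp")},
]

# Group events by flow identifier so related activity lines up across nodes.
by_flow = defaultdict(list)
for event in events:
    by_flow[flow_id(*event["flow"])].append((event["node"], event["kind"]))

for fid, seen in by_flow.items():
    print(fid, seen)
```

The first two events land in the same group even though they were seen from opposite ends of the connection, which is the property that makes cross-node correlation possible.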

Currently, the MantisNet CVF platform can provide:

  • Packet Capture
  • Dynamic Topology
  • Advanced Flow Statistics
  • Protocol Specific Processing
  • Monitoring of Encryption Systems and Plain-Text Extraction
  • Live Tracing
    • Note: Because the MantisNet CVF is composable, new features and functions can be developed and added as needed.

Continuous, real-time observability is a key foundational component for operating next-generation cloud-native, micro-services-based infrastructure. Suffice it to say, eBPF is a powerful new observability tool that enables deep visibility and control. While not a complete solution on its own, eBPF provides a very powerful set of capabilities that enables our customers to focus on monitoring and managing their business: the services, applications, and resources.

 


Topics: network engineering, network performance, Real-Time Monitoring, mantis, containers, 5G

Written by Peter Dougherty

Peter Dougherty, CISSP, is a technology entrepreneur, strategist & operating executive with over 25 years of experience developing and delivering cyber security, networking, compute, and storage technologies.