Extended Berkeley Packet Filter (eBPF) represents a significant evolution in the way we interact with and extend the capabilities of modern operating systems. As a powerful technology that enables the Linux kernel to run sandboxed programs in response to events, eBPF has become a cornerstone for system observability, networking, and security features.
However, as with any system that interfaces closely with the kernel, the security of eBPF itself is paramount. In this blog, we delve into the often-overlooked aspect of eBPF security, exploring how the mechanisms intended to safeguard eBPF can themselves be fortified. We’ll dissect the role of the eBPF verifier, scrutinize the current access control model, and investigate potential improvements from ongoing research. Moreover, we’ll navigate through the complexities of securing eBPF, addressing open questions and the challenges they pose to system architects and developers alike.
The security framework of eBPF is largely predicated on the robustness of its verifier. This component acts as the gatekeeper, ensuring that only safe and compliant programs are allowed to run within the kernel space.
What the eBPF Verifier Is and What It Does
At its core, the eBPF verifier is a static code analyzer. Its primary function is to vet the BPF program instructions before they are executed. It scrutinizes a copy of the program within the kernel, operating with the following objectives:
Ensuring Program Termination
The verifier uses a depth-first search (DFS) to traverse the program’s control flow graph. Classically it required the graph to be a Directed Acyclic Graph (DAG); loops are now accepted only when the verifier can prove that they are bounded. This is crucial for guaranteeing that the program cannot enter an infinite loop, thereby ensuring its termination. The verifier also rejects malformed or out-of-bounds jumps that could disrupt the normal operation of the kernel or lead to a system hang.
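To make the termination check more concrete, here is a minimal sketch of back-edge detection with a DFS. It is an illustration only, not the kernel's code: the control flow graph is assumed to be a plain dictionary from instruction index to successor indices, and `find_back_edges` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

def find_back_edges(cfg: Dict[int, List[int]], entry: int = 0) -> List[Tuple[int, int]]:
    """Return every edge (src, dst) whose target is still on the DFS stack.

    Such an edge closes a cycle, i.e. it marks a loop in the program.
    `cfg` is a hypothetical representation: node -> list of successors.
    """
    WHITE, GREY, BLACK = 0, 1, 2           # unvisited, on the stack, finished
    color = {node: WHITE for node in cfg}
    back_edges: List[Tuple[int, int]] = []

    color[entry] = GREY
    stack = [(entry, iter(cfg[entry]))]    # iterative DFS, no recursion limit
    while stack:
        node, successors = stack[-1]
        for nxt in successors:
            if color[nxt] == GREY:         # target is an ancestor: back-edge
                back_edges.append((node, nxt))
            elif color[nxt] == WHITE:      # descend into an unvisited node
                color[nxt] = GREY
                stack.append((nxt, iter(cfg[nxt])))
                break
        else:                              # all successors handled
            color[node] = BLACK
            stack.pop()
    return back_edges

# A branch that rejoins is acyclic; a jump back to instruction 1 is a loop.
straight = {0: [1, 2], 1: [3], 2: [3], 3: []}
looping = {0: [1], 1: [2], 2: [1, 3], 3: []}
```

A graph with no back-edges is a DAG and trivially terminates. A graph with back-edges needs the extra bounded-loop reasoning described later in this post.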
Ensuring Memory Safety
Memory safety is paramount in kernel operations. The verifier checks for potential out-of-bounds memory accesses that could lead to data corruption or security breaches. It also safeguards against use-after-free bugs and object leaks, which are common vulnerabilities that can be exploited. In addition to these, it takes into account hardware vulnerabilities like Spectre, enforcing mitigations to prevent such side-channel attacks.
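The out-of-bounds check can be pictured as interval arithmetic over scalar values. The sketch below is a simplified model under our own assumptions (the names `ScalarRange`, `refine_upper`, `scaled`, and `access_is_safe` are hypothetical and the real verifier tracks much more state): an index is only usable for a memory access once a branch has narrowed its range enough.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScalarRange:
    """Inclusive bounds the analysis knows for an unknown scalar."""
    lo: int
    hi: int

def refine_upper(r: ScalarRange, bound: int) -> ScalarRange:
    """Range of the value on the taken branch of `if (value < bound)`."""
    return ScalarRange(r.lo, min(r.hi, bound - 1))

def scaled(r: ScalarRange, factor: int) -> ScalarRange:
    """Range of `value * factor` for a non-negative constant factor."""
    return ScalarRange(r.lo * factor, r.hi * factor)

def access_is_safe(offset: ScalarRange, access_size: int, region_size: int) -> bool:
    """Every possible offset must keep the whole access inside the region."""
    return offset.lo >= 0 and offset.hi + access_size <= region_size

# An index read from untrusted input: anything from 0 to 255.
idx = ScalarRange(0, 255)
unchecked = access_is_safe(scaled(idx, 8), access_size=8, region_size=64)
checked = access_is_safe(scaled(refine_upper(idx, 8), 8), access_size=8, region_size=64)
```

Here `unchecked` is rejected because the offset could reach 2040, while `checked` is accepted because the `idx < 8` branch limits the offset to at most 56.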
Ensuring Type Safety
Type safety is another critical aspect that the verifier ensures. By preventing type confusion bugs, it helps maintain the integrity of data within the kernel. The eBPF verifier utilizes BPF Type Format (BTF), which allows it to accurately understand and check the kernel’s complex data structures, ensuring that the program’s operations on these structures are valid and safe.
Preventing Hardware Exceptions
Hardware exceptions, such as division by zero, can cause abrupt program terminations and kernel panics. To prevent this, the verifier checks for divisions by unknown scalars and ensures that such instructions are rewritten so that a zero divisor produces a defined result instead of a trap, consistent with the behavior specified for aarch64.
Through these mechanisms, the eBPF verifier maintains the security and stability of the kernel and upholds the integrity of the operations that eBPF programs perform, making it an indispensable component of the eBPF infrastructure.
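As a rough model of the resulting behavior, the sketch below assumes the semantics described in the BPF instruction set documentation for unsigned 64-bit operations: division by zero yields zero, and modulo by zero leaves the destination unchanged. The function names are ours, and 32-bit and signed variants are deliberately left out.

```python
U64_MASK = (1 << 64) - 1

def bpf_div64(dst: int, src: int) -> int:
    """Model of unsigned 64-bit BPF division: a zero divisor gives 0."""
    dst &= U64_MASK
    src &= U64_MASK
    return 0 if src == 0 else dst // src

def bpf_mod64(dst: int, src: int) -> int:
    """Model of unsigned 64-bit BPF modulo: a zero divisor keeps dst."""
    dst &= U64_MASK
    src &= U64_MASK
    return dst if src == 0 else dst % src
```

The point of the rewrite is that no input can raise a hardware exception: every divisor, including zero, maps to a well-defined result.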
How the eBPF Verifier Works
The eBPF verifier is essentially a sophisticated simulation engine that exhaustively tests every possible execution path of a given eBPF program. This simulation is not a mere theoretical exercise but a stringent enforcement of security and safety policies in kernel operations.
Follows control flow graph: The verifier begins its analysis by constructing and following the control flow graph (CFG) of the eBPF program. It carefully computes the set of possible states for each instruction, considering the BPF register set and stack. Safety checks are then performed depending on the current instruction context.
One of the critical aspects of this process is register spill/fill tracking for the program’s private BPF stack. This ensures that operations involving the stack do not lead to overflows or underflows, which could corrupt data or provide an attack vector.
Back-edges in control flow graph: To effectively manage loops within the eBPF program, the verifier identifies back-edges in the CFG. Bounded loops are handled by simulating all iterations up to a predefined limit, thus guaranteeing that loops will not lead to indefinite execution.
Dealing with potentially large number of states: The verifier must manage the complexity that comes with the large number of potential states in a program’s execution paths. It employs path pruning logic to compare the current state with prior states, assessing whether the current path is “equivalent” to prior paths and has a safe exit. This reduces the overall number of states that need to be considered.
Function-by-function verification for state reduction: To streamline the verification process, the verifier conducts a function-by-function analysis. This modular approach allows for a reduction in the number of states that need to be analyzed at any given time, thereby improving the efficiency of the verification.
On-demand scalar precision (back-)tracking for state reduction: The verifier uses on-demand scalar precision tracking to reduce the state space further. By back-tracking scalar values only when necessary, the verifier can treat more states as equivalent without losing the precision it needs for safety checks.
Terminates with rejection upon surpassing “complexity” threshold: To maintain practical performance, the verifier has a “complexity” threshold. If a program’s analysis surpasses this threshold, the verifier will terminate the process and reject the program. This ensures that only programs of manageable complexity are allowed to execute, balancing security with system performance.
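The interaction between path pruning and the complexity threshold described above can be sketched with a toy explorer. This is a deliberately simplified model under our own assumptions: states are just register ranges that do not change along edges, a state is recorded as seen as soon as it is visited, and `subsumes`, `explore`, `diamond_chain`, and the limit of 1000 are all hypothetical choices for illustration.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]       # inclusive (lo, hi) bounds of one register
State = Dict[str, Range]      # register name -> known range

def subsumes(old: State, new: State) -> bool:
    """True if every value allowed by `new` was already covered by `old`."""
    return all(old[r][0] <= new[r][0] and new[r][1] <= old[r][1] for r in old)

def explore(cfg: Dict[int, List[int]], entry_state: State,
            limit: int, prune: bool = True) -> Tuple[bool, int]:
    """Walk all paths from instruction 0; return (accepted, visits)."""
    seen: Dict[int, List[State]] = {}
    work = [(0, entry_state)]
    processed = 0
    while work:
        pc, state = work.pop()
        if prune and any(subsumes(old, state) for old in seen.get(pc, [])):
            continue                      # an equivalent path was already checked
        processed += 1
        if processed > limit:
            return False, processed       # too complex: reject the program
        seen.setdefault(pc, []).append(state)
        for nxt in cfg[pc]:
            work.append((nxt, state))     # toy model: state is carried unchanged
    return True, processed

def diamond_chain(n: int) -> Dict[int, List[int]]:
    """n consecutive if/else diamonds, which contain 2**n distinct paths."""
    cfg: Dict[int, List[int]] = {}
    for i in range(n):
        top = 3 * i
        cfg[top] = [top + 1, top + 2]
        cfg[top + 1] = [top + 3]
        cfg[top + 2] = [top + 3]
    cfg[3 * n] = []
    return cfg

chain = diamond_chain(10)
start = {"r1": (0, 100)}
with_pruning = explore(chain, start, limit=1000)
without_pruning = explore(chain, start, limit=1000, prune=False)
```

With pruning, each of the 31 instructions is visited once and the program is accepted. Without it, the explorer runs into the limit long before it has covered all 1024 paths, which is exactly the kind of rejection the threshold produces.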
Challenges
Despite its thoroughness, the eBPF verifier faces significant challenges:
Attractive target for exploitation when exposed to non-root users: As the verifier becomes more complex, it becomes an increasingly attractive target for exploitation. The programmability of eBPF, while powerful, also means that if an attacker were to bypass the verifier and gain execution within the OS kernel, the consequences could be severe.
Reasoning about verifier correctness is non-trivial: Ensuring the verifier’s correctness, especially concerning Spectre mitigations, is not a straightforward task. While there is some formal verification in place, it is only partial. Areas such as the Just-In-Time (JIT) compilers and abstract interpretation models are particularly challenging.
Occasions where valid programs get rejected: There is sometimes a disconnect between the optimizations performed by LLVM (the compiler infrastructure used to prepare eBPF programs) and the verifier’s ability to understand these optimizations, leading to valid programs being erroneously rejected.
“Stable ABI” for BPF program types: A “stable ABI” is vital so that BPF programs running in production do not break upon an OS kernel upgrade. However, maintaining this stability while also evolving the verifier and the BPF ecosystem presents its own set of challenges.
Performance vs. security considerations: Finally, the eternal trade-off between performance and security is pronounced in the verification of complex eBPF programs. While the verifier must be efficient to be practical, it also must not compromise on security, as the performance of the programs it is verifying is crucial for modern computing systems.
The eBPF verifier stands as a testament to the ingenuity in modern computing security, navigating the treacherous waters between maximum programmability and maintaining a fortress-like defense at the kernel level.
Together, these works signify a robust and multi-faceted research initiative aimed at bolstering the foundations of eBPF verification, ensuring that it remains a secure and performant tool for extending the capabilities of the Linux kernel.
After leading Linux distributions, such as Ubuntu and SUSE, have disallowed unprivileged usage of eBPF Socket Filter and CGroup programs, the current eBPF access control model only supports a single permission level. This level necessitates the CAP_SYS_ADMIN capability for all features. However, CAP_SYS_ADMIN carries inherent risks, particularly to containers, due to its extensive privileges.
Addressing this, Linux 5.8 introduced a more granular permission system by breaking down eBPF capabilities. Instead of relying solely on CAP_SYS_ADMIN, a new capability, CAP_BPF, was introduced for invoking the bpf syscall. Additionally, installing specific types of eBPF programs demands further capabilities, such as CAP_PERFMON for performance monitoring or CAP_NET_ADMIN for network administration tasks. This structure aims to mitigate certain types of attacks: operations such as altering process memory or eBPF maps still require CAP_SYS_ADMIN.
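To see what this split looks like from a process's point of view, the sketch below decodes a `CapEff` bitmask of the kind found in `/proc/<pid>/status`. The capability numbers are the ones defined in the Linux capability header. The `can_load` policy is a simplified model of our own, not the kernel's exact checks, and the program kinds "tracing" and "networking" are illustrative labels.

```python
# Capability bit numbers from the Linux capability header.
CAP_NET_ADMIN = 12
CAP_SYS_ADMIN = 21
CAP_PERFMON = 38
CAP_BPF = 39

# Simplified, hypothetical policy: which capabilities each kind of program needs.
REQUIRED = {
    "tracing": (CAP_BPF, CAP_PERFMON),
    "networking": (CAP_BPF, CAP_NET_ADMIN),
}

def has_cap(cap_eff_hex: str, cap: int) -> bool:
    """Check one bit in a hexadecimal CapEff mask."""
    return bool((int(cap_eff_hex, 16) >> cap) & 1)

def can_load(cap_eff_hex: str, kind: str) -> bool:
    """CAP_SYS_ADMIN still covers everything; otherwise all listed bits are needed."""
    if has_cap(cap_eff_hex, CAP_SYS_ADMIN):
        return True
    return all(has_cap(cap_eff_hex, cap) for cap in REQUIRED[kind])

def read_cap_eff(path: str = "/proc/self/status") -> str:
    """Read the effective capability mask of the current process (Linux only)."""
    with open(path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise ValueError("CapEff line not found")

tracer_mask = format((1 << CAP_BPF) | (1 << CAP_PERFMON), "016x")
admin_mask = format(1 << CAP_SYS_ADMIN, "016x")
```

On a Linux machine, `can_load(read_cap_eff(), "tracing")` gives a quick first impression of what a container has been granted, although the kernel's real decision also depends on sysctls and LSM policy.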
Nevertheless, these segregated capabilities are not bulletproof against all eBPF-based attacks, such as Denial of Service (DoS) and information theft. Attackers may exploit these to craft eBPF-based malware specifically targeting containers. The emergence of eBPF in cloud-native applications exacerbates this threat, as users could inadvertently deploy containers that contain untrusted eBPF programs.
Compounding the issue, the risks associated with eBPF in containerized environments are not entirely understood. Some container services might unintentionally grant eBPF permissions, for reasons such as enabling filesystem mounting functionality. The existing permission model is inadequate in preventing misuse of these potentially harmful eBPF features within containers.
CAP_BPF
Traditionally, almost all BPF actions required CAP_SYS_ADMIN privileges, which also grant broad system access. Over time, there has been a push to separate BPF permissions from these root privileges. As a result, capabilities like CAP_PERFMON and CAP_BPF were introduced to allow more granular control over BPF operations, such as reading kernel memory and loading tracing or networking programs, without needing full system admin rights.
However, CAP_BPF’s scope is also ambiguous, leading to a perception problem. Unlike CAP_SYS_MODULE, which is well-defined and used for loading kernel modules, CAP_BPF lacks namespace constraints, meaning it can access all kernel memory rather than being container-specific. This broad access is problematic because a verifier bug can let a BPF program crash the kernel, which is considered a security vulnerability. The result is an excessive number of CVEs (Common Vulnerabilities and Exposures) being filed, even for bugs that are already fixed, which creates undue alarm and urgency to patch older kernel versions that may not have been updated.
Additionally, some security startups have been criticized for exploiting the fears around BPF’s capabilities to market their products, paradoxically using BPF itself to safeguard against the issues they highlight. This has led to a contradictory narrative where BPF is both demonized and promoted as a solution.
bpf namespace
The current security model requires the CAP_SYS_ADMIN capability for iterating BPF object IDs and converting these IDs to file descriptors (FDs). This is to prevent non-privileged users from accessing BPF programs owned by others, but it also restricts them from inspecting their own BPF objects, posing a challenge in container environments.
Users can run BPF programs with CAP_BPF and other specific capabilities, yet they lack a generic method to inspect these programs, as tools like bpftool need CAP_SYS_ADMIN. The existing workaround without CAP_SYS_ADMIN is deemed inconvenient, involving SCM_RIGHTS and Unix domain sockets for sharing BPF object FDs between processes.
To address these limitations, Yafang Shao proposes introducing a BPF namespace. This would allow users to create BPF maps, programs, and links within a specific namespace, isolating these objects from users in different namespaces. However, objects within a BPF namespace would still be visible to the parent namespace, enabling system administrators to maintain oversight.
The BPF namespace is conceptually similar to the PID namespace and is intended to be intuitive. The initial implementation focuses on BPF maps, programs, and links, with plans to extend this to other BPF objects like BTF and bpffs in the future. This could potentially enable container users to trace only the processes within their container without accessing data from other containers, enhancing security and usability in containerized environments.
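The visibility rule of the proposal (objects are hidden from sibling namespaces but visible to the parent) can be modeled in a few lines. This is a toy model of the idea only. The class `BpfNamespace` and its methods are hypothetical and do not correspond to any kernel interface.

```python
from typing import List, Optional

class BpfNamespace:
    """Toy model: a namespace sees its own objects and those of its descendants."""

    def __init__(self, name: str, parent: Optional["BpfNamespace"] = None):
        self.name = name
        self.parent = parent
        self.objects: List[str] = []
        self.children: List["BpfNamespace"] = []
        if parent is not None:
            parent.children.append(self)

    def create(self, obj: str) -> None:
        """Register a map, program, or link in this namespace."""
        self.objects.append(obj)

    def visible_objects(self) -> List[str]:
        """Own objects first, then everything visible in child namespaces."""
        visible = list(self.objects)
        for child in self.children:
            visible.extend(child.visible_objects())
        return visible

host = BpfNamespace("host")
container_a = BpfNamespace("a", parent=host)
container_b = BpfNamespace("b", parent=host)
container_a.create("map_a")
container_b.create("prog_b")
```

In this model `container_a` can inspect only `map_a`, `container_b` only `prog_b`, and the host sees both, which mirrors how PID namespaces behave.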
The concept of unprivileged eBPF refers to the ability for non-root users to load eBPF programs into the kernel. This feature is controversial due to its security implications and, as such, is currently turned off by default across all major Linux distributions. The concerns range from hardware vulnerabilities like Spectre to kernel bugs and exploits, which malicious eBPF programs can use to leak sensitive data or attack the system.
To combat this, mitigations have been put in place for several Spectre variants, such as v1, v2, and v4. However, these mitigations come at a cost, often significantly reducing the flexibility and performance of eBPF programs. This trade-off makes the feature unattractive and impractical for many users and use cases.
Trusted Unprivileged BPF
In light of these challenges, a middle ground known as “trusted unprivileged BPF” is being explored. This approach would involve an allowlist system, where specific eBPF programs that have been thoroughly vetted and deemed trustworthy could be loaded by unprivileged users. This vetting process would ensure that only secure, production-ready programs bypass the privilege requirement, maintaining a balance between security and functionality. It’s a step toward enabling more widespread use of eBPF without compromising the system’s integrity.
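One way to picture such an allowlist is a set of digests of vetted program images. The sketch below is a minimal illustration under that assumption. The `Allowlist` class is hypothetical, and a real design would also have to decide who maintains the list and how program identity is established.

```python
import hashlib

def digest(program: bytes) -> str:
    """Identify a program image by its SHA-256 digest."""
    return hashlib.sha256(program).hexdigest()

class Allowlist:
    """Hypothetical allowlist of vetted program digests."""

    def __init__(self) -> None:
        self._trusted = set()

    def vet(self, program: bytes) -> None:
        """Record a program after it has been reviewed and deemed trustworthy."""
        self._trusted.add(digest(program))

    def may_load(self, program: bytes, privileged: bool) -> bool:
        """Privileged users load anything; others only vetted programs."""
        return privileged or digest(program) in self._trusted

allowlist = Allowlist()
vetted = b"\x95\x00\x00\x00\x00\x00\x00\x00"      # placeholder bytes, not a real program
tampered = vetted + b"\x00"
allowlist.vet(vetted)
```

Because the check is on the exact bytes, any modification to a vetted program, however small, puts it back outside the allowlist.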
Permissive LSM hooks: Rejected upstream given LSMs enforce further restrictions
This proposal adds new Linux Security Module (LSM) hooks specifically for the BPF subsystem, with the intent of offering more granular control over BPF maps and BTF data objects, which are fundamental to the operation of modern BPF applications.
The primary addition includes two LSM hooks: bpf_map_create_security and bpf_btf_load_security, which provide the ability to override the default permission checks that rely on capabilities like CAP_BPF and CAP_NET_ADMIN. This new mechanism allows for finer control, enabling policies to enforce restrictions or bypass checks for trusted applications, shifting the decision-making to custom LSM policy implementations.
This approach allows for a safer default by not requiring applications to have BPF-related capabilities, which are typically required to interact with the kernel’s BPF subsystem. Instead, applications can run without such privileges, with only vetted and trusted cases being granted permission to operate as if they had elevated capabilities.
BPF token concept to delegate subset of BPF via token fd from trusted privileged daemon
The BPF token is a new mechanism that allows privileged daemons to delegate a subset of BPF functionality to trusted unprivileged applications. This concept enables containerized BPF applications to operate safely within user namespaces, which was previously unattainable due to the security restrictions around CAP_BPF. The BPF token is created and managed via kernel APIs, and it can be pinned within the BPF filesystem for controlled access. The latest version of the patch ensures that a BPF token is confined to its creation instance in the BPF filesystem to prevent misuse. This addition to the BPF subsystem facilitates more secure and flexible unprivileged BPF operations.
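The delegation idea can be modeled as a small immutable object that records what the privileged daemon chose to hand out. This is a conceptual sketch only: `BpfToken`, `token_allows`, and the command and program-type strings are hypothetical names for illustration and are not the kernel's API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class BpfToken:
    """Hypothetical model of a delegated subset of BPF functionality."""
    bpffs_instance: str                 # the BPF filesystem instance it was created in
    allowed_cmds: FrozenSet[str]        # e.g. "map_create", "prog_load"
    allowed_prog_types: FrozenSet[str]  # e.g. "xdp", "kprobe"

def token_allows(token: BpfToken, bpffs_instance: str, cmd: str,
                 prog_type: Optional[str] = None) -> bool:
    """A request passes only inside the token's own instance and delegated subset."""
    if token.bpffs_instance != bpffs_instance:
        return False                    # token is confined to its creation instance
    if cmd not in token.allowed_cmds:
        return False
    if prog_type is not None and prog_type not in token.allowed_prog_types:
        return False
    return True

# The privileged daemon delegates XDP loading and map creation to one container.
token = BpfToken(
    bpffs_instance="container-1",
    allowed_cmds=frozenset({"map_create", "prog_load"}),
    allowed_prog_types=frozenset({"xdp"}),
)
```

The application never gains a capability of its own. It can only do what the token it holds allows, and only in the place where that token was issued.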
BPF signing as gatekeeper: application vs BPF program (no one-size-fits-all)
Song Liu has proposed a patch for unprivileged access to BPF functionality through a new device, /dev/bpf. This device controls access via two new ioctl commands that allow users with write permissions to the device to invoke sys_bpf(). These commands toggle the ability of the current task to call sys_bpf(), with the permission state being stored in the task_struct. This permission is also inheritable by new threads created by the task. A new helper function, bpf_capable(), is introduced to check if a task has obtained permission through /dev/bpf. The patch includes updates to documentation and header files.
RPC to privileged BPF daemon: Limitations depending on use cases/environment
The RPC approach (e.g., bpfd) is similar to the BPF token concept, but it uses a privileged daemon to manage the BPF programs. This daemon is responsible for loading and unloading BPF programs, as well as managing the BPF maps. The daemon is also responsible for verifying the BPF programs before loading them. This approach is more flexible than the BPF token concept, as it allows for more fine-grained control over the BPF programs. However, it is also more complex, bringing more maintenance challenges and the possibility of a single point of failure.
There is also research and discussion on how to improve the security of eBPF. Existing work can be roughly divided into three categories: virtualization, Software Fault Isolation (SFI), and formal methods. Deploying eBPF programs in a sandbox such as WebAssembly, or running them in userspace, is another possible solution.
MOAT: Towards Safe BPF Kernel Extension (Isolation)
The Linux kernel makes considerable use of Berkeley Packet Filter (BPF) to allow user-written BPF applications to execute in the kernel space. BPF employs a verifier to statically check the security of user-supplied BPF code. Recent attacks show that BPF programs can evade security checks and gain unauthorized access to kernel memory, indicating that the verification process is not flawless. In this paper, we present MOAT, a system that isolates potentially malicious BPF programs using Intel Memory Protection Keys (MPK). Enforcing BPF program isolation with MPK is not straightforward; MOAT is carefully designed to alleviate technical obstacles, such as limited hardware keys and supporting a wide variety of kernel BPF helper functions. We have implemented MOAT in a prototype kernel module, and our evaluation shows that MOAT delivers low-cost isolation of BPF programs under various real-world usage scenarios, such as the isolation of a packet-forwarding BPF program for the memcached database with an average throughput loss of 6%.
If we must resort to hardware protection mechanisms, is language safety or verification still necessary to protect the kernel and extensions from one another?
Unleashing Unprivileged eBPF Potential with Dynamic Sandboxing
For safety reasons, unprivileged users today have only limited ways to customize the kernel through the extended Berkeley Packet Filter (eBPF). This is unfortunate, especially since the eBPF framework itself has seen an increase in scope over the years. We propose SandBPF, a software-based kernel isolation technique that dynamically sandboxes eBPF programs to allow unprivileged users to safely extend the kernel, unleashing eBPF’s full potential. Our early proof-of-concept shows that SandBPF can effectively prevent exploits missed by eBPF’s native safety mechanism (i.e., static verification) while incurring 0%-10% overhead on web server benchmarks.
This may conflict with the original design of eBPF, which was not designed to rely on a sandbox for safety. If SFI is the goal, why not use WebAssembly in the kernel?
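For readers unfamiliar with SFI, the classic address-masking idea can be sketched as follows. This is a generic illustration of the technique and not SandBPF's actual instrumentation. The base address and size are arbitrary values we picked, with the size a power of two and the base aligned to it.

```python
SANDBOX_BASE = 0x1000_0000        # hypothetical start of the sandbox region
SANDBOX_SIZE = 0x1_0000           # 64 KiB, must be a power of two
OFFSET_MASK = SANDBOX_SIZE - 1

def confine(addr: int) -> int:
    """Force any address into the sandbox by keeping only its low offset bits."""
    return SANDBOX_BASE | (addr & OFFSET_MASK)

def in_sandbox(addr: int) -> bool:
    """True if the address lies inside the sandbox region."""
    return SANDBOX_BASE <= addr < SANDBOX_BASE + SANDBOX_SIZE
```

An in-bounds address passes through unchanged, while a wild pointer such as `0xdeadbeef` is redirected to `0x1000beef` inside the sandbox. The cost is one extra mask per memory access at run time, which is the kind of overhead the paper's benchmarks measure.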
Kernel extension verification is untenable
The emergence of verified eBPF bytecode is ushering in a new era of safe kernel extensions. In this paper, we argue that eBPF’s verifier—the source of its safety guarantees—has become a liability. In addition to the well-known bugs and vulnerabilities stemming from the complexity and ad hoc nature of the in-kernel verifier, we highlight a concerning trend in which escape hatches to unsafe kernel functions (in the form of helper functions) are being introduced to bypass verifier-imposed limitations on expressiveness, unfortunately also bypassing its safety guarantees. We propose safe kernel extension frameworks using a balance of not just static but also lightweight runtime techniques. We describe a design centered around kernel extensions in safe Rust that will eliminate the need of the in-kernel verifier, improve expressiveness, allow for reduced escape hatches, and ultimately improve the safety of kernel extensions
This may limit the kernel to loading only eBPF programs that are signed by trusted third parties, as the kernel itself can no longer independently verify them. The Rust toolchain also has vulnerabilities of its own.
Wasm-bpf: WebAssembly eBPF library, toolchain and runtime
Wasm-bpf is a WebAssembly eBPF library, toolchain, and runtime that allows eBPF programs to be built into Wasm with little to no code changes and run across platforms inside a Wasm sandbox.
It provides a configurable environment with limited eBPF WASI behavior, enhancing security and control. This allows for fine-grained permissions that restrict access to kernel resources. For instance, eBPF programs can be restricted to specific types of usage, such as network monitoring. The environment can also configure which kinds of eBPF programs may be loaded into the kernel and which attach events they may access, without modifying the kernel’s eBPF permission model.
Porting an application to WebAssembly requires additional effort. In addition, the Wasm interface to kernel eBPF needs ongoing maintenance, as a BPF daemon does.
A userspace eBPF runtime allows existing eBPF applications to operate in unprivileged userspace using the same libraries and toolchains. It offers Uprobe and Syscall tracepoints for eBPF, with significant performance improvements over kernel uprobe and without requiring manual code instrumentation or process restarts. The runtime facilitates interprocess eBPF maps in userspace shared memory, and is also compatible with kernel eBPF maps, allowing for seamless operation with the kernel’s eBPF infrastructure. It includes a high-performance LLVM JIT for various architectures, alongside a lightweight JIT for x86 and an interpreter.
It may be limited to certain eBPF program types and use cases, and is not a general approach for kernel eBPF.
Conclusion
As we have traversed the multifaceted domain of eBPF security, it’s clear that while eBPF’s verifier provides a robust first line of defense, there are inherent limitations within the current access control model that require attention. We have considered potential solutions ranging from virtualization, software fault isolation, and formal methods to WebAssembly and userspace eBPF runtimes, each offering a unique approach to fortifying eBPF against vulnerabilities.
However, as with any complex system, new questions and challenges continue to surface. The gaps identified between the theoretical security models and their practical implementation invite continued research and experimentation. The future of eBPF security is not only promising but also demands a collective effort to ensure the technology can be adopted with confidence in its capacity to safeguard systems.
This is a list of eBPF-related papers I have read in recent years. It might be helpful for people who are interested in eBPF-related research.
eBPF (extended Berkeley Packet Filter) is an emerging technology that allows safe execution of user-provided programs in the Linux kernel. It has gained widespread adoption in recent years for accelerating network processing, enhancing observability, and enabling programmable packet processing.
This document lists some key research papers on eBPF from the past few years. The papers cover several aspects of eBPF, including accelerating distributed systems, storage, and networking, formally verifying the eBPF JIT compiler and verifier, applying eBPF for intrusion detection, and automatically generating hardware designs from eBPF programs.
Some key highlights:
eBPF enables executing custom functions in the kernel to accelerate distributed protocols, storage engines, and networking applications with improved throughput and lower latency compared to traditional userspace implementations.
Formal verification of eBPF components like JIT and verifier ensures correctness and reveals bugs in real-world implementations.
eBPF’s programmability and efficiency make it suitable for building intrusion detection and network monitoring applications entirely in the kernel.
Automated synthesis of hardware designs from eBPF programs allows software developers to quickly generate optimized packet processing pipelines in network cards.
The papers demonstrate eBPF’s versatility in accelerating systems, enhancing security, and simplifying network programming. As eBPF adoption grows, it is an important area of systems research with many open problems related to performance, safety, hardware integration, and ease of use.
If you have any suggestions or papers to add, please feel free to open an issue or PR. The list was created in 2023.10; new papers will be added in the future.
Check out our open-source projects at eunomia-bpf and eBPF tutorials at bpf-developer-tutorial. I’m also looking for a PhD position in the area of systems and networking in 2024/2025.
XRP: In-Kernel Storage Functions with eBPF
With the emergence of microsecond-scale NVMe storage devices, the Linux kernel storage stack overhead has become significant, almost doubling access times. We present XRP, a framework that allows applications to execute user-defined storage functions, such as index lookups or aggregations, from an eBPF hook in the NVMe driver, safely bypassing most of the kernel’s storage stack. To preserve file system semantics, XRP propagates a small amount of kernel state to its NVMe driver hook where the user-registered eBPF functions are called. We show how two key-value stores, BPF-KV, a simple B+-tree key-value store, and WiredTiger, a popular log-structured merge tree storage engine, can leverage XRP to significantly improve throughput and latency.
Specification and verification in the field: Applying formal methods to BPF just-in-time compilers in the Linux kernel
This paper describes our experience applying formal methods to a critical component in the Linux kernel, the just-in-time compilers (“JITs”) for the Berkeley Packet Filter (BPF) virtual machine. We verify these JITs using Jitterbug, the first framework to provide a precise specification of JIT correctness that is capable of ruling out real-world bugs, and an automated proof strategy that scales to practical implementations. Using Jitterbug, we have designed, implemented, and verified a new BPF JIT for 32-bit RISC-V, found and fixed 16 previously unknown bugs in five other deployed JITs, and developed new JIT optimizations; all of these changes have been upstreamed to the Linux kernel. The results show that it is possible to build a verified component within a large, unverified system with careful design of specification and proof strategy.
λ-IO: A Unified IO Stack for Computational Storage
The emerging computational storage device offers an opportunity for in-storage computing. It alleviates the overhead of data movement between the host and the device, and thus accelerates data-intensive applications. In this paper, we present λ-IO, a unified IO stack managing both computation and storage resources across the host and the device. We propose a set of designs – interface, runtime, and scheduling – to tackle three critical issues. We implement λ-IO in full-stack software and hardware environment, and evaluate it with synthetic and real applications against Linux IO, showing up to 5.12× performance improvement.
Extension Framework for File Systems in User space
User file systems offer numerous advantages over their in-kernel implementations, such as ease of development and better system reliability. However, they incur heavy performance penalty. We observe that existing user file system frameworks are highly general; they consist of a minimal interposition layer in the kernel that simply forwards all low-level requests to user space. While this design offers flexibility, it also severely degrades performance due to frequent kernel-user context switching.
This work introduces ExtFUSE, a framework for developing extensible user file systems that also allows applications to register “thin” specialized request handlers in the kernel to meet their specific operative needs, while retaining the complex functionality in user space. Our evaluation with two FUSE file systems shows that ExtFUSE can improve the performance of user file systems with less than a few hundred lines on average. ExtFUSE is available on GitHub.
Electrode: Accelerating Distributed Protocols with eBPF
Implementing distributed protocols under a standard Linux kernel networking stack enjoys the benefits of load-aware CPU scaling, high compatibility, and robust security and isolation. However, it suffers from low performance because of excessive user-kernel crossings and kernel networking stack traversing. We present Electrode with a set of eBPF-based performance optimizations designed for distributed protocols. These optimizations get executed in the kernel before the networking stack but achieve similar functionalities as were implemented in user space (e.g., message broadcasting, collecting quorum of acknowledgments), thus avoiding the overheads incurred by user-kernel crossings and kernel networking stack traversing. We show that when applied to a classic Multi-Paxos state machine replication protocol, Electrode improves its throughput by up to 128.4% and latency by up to 41.7%.
BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing
In-memory key-value stores are critical components that help scale large internet services by providing low-latency access to popular data. Memcached, one of the most popular key-value stores, suffers from performance limitations inherent to the Linux networking stack and fails to achieve high performance when using high-speed network interfaces. While the Linux network stack can be bypassed using DPDK based solutions, such approaches require a complete redesign of the software stack and induce high CPU utilization even when client load is low.
To overcome these limitations, we present BMC, an in-kernel cache for Memcached that serves requests before the execution of the standard network stack. Requests to the BMC cache are treated as part of the NIC interrupts, which allows performance to scale with the number of cores serving the NIC queues. To ensure safety, BMC is implemented using eBPF. Despite the safety constraints of eBPF, we show that it is possible to implement a complex cache service. Because BMC runs on commodity hardware and requires modification of neither the Linux kernel nor the Memcached application, it can be widely deployed on existing systems. BMC optimizes the processing time of Facebook-like small-size requests. On this target workload, our evaluations show that BMC improves throughput by up to 18x compared to the vanilla Memcached application and up to 6x compared to an optimized version of Memcached that uses the SO_REUSEPORT socket flag. In addition, our results also show that BMC has negligible overhead and does not deteriorate throughput when treating non-target workloads.
hXDP: Efficient Software Packet Processing on FPGA NICs
FPGA accelerators on the NIC enable the offloading of expensive packet processing tasks from the CPU. However, FPGAs have limited resources that may need to be shared among diverse applications, and programming them is difficult.
We present a solution to run Linux’s eXpress Data Path programs written in eBPF on FPGAs, using only a fraction of the available hardware resources while matching the performance of high-end CPUs. The iterative execution model of eBPF is not a good fit for FPGA accelerators. Nonetheless, we show that many of the instructions of an eBPF program can be compressed, parallelized or completely removed, when targeting a purpose-built FPGA executor, thereby significantly improving performance. We leverage that to design hXDP, which includes (i) an optimizing-compiler that parallelizes and translates eBPF bytecode to an extended eBPF Instruction-set Architecture defined by us; a (ii) soft-processor to execute such instructions on FPGA; and (iii) an FPGA-based infrastructure to provide XDP’s maps and helper functions as defined within the Linux kernel.
We implement hXDP on an FPGA NIC and evaluate it running real-world unmodified eBPF programs. Our implementation is clocked at 156.25MHz, uses about 15% of the FPGA resources, and can run dynamically loaded programs. Despite these modest requirements, it achieves the packet processing throughput of a high-end CPU core and provides a 10x lower packet forwarding latency.
Network-Centric Distributed Tracing with DeepFlow: Troubleshooting Your Microservices in Zero Code
Microservices are becoming more complicated, posing new challenges for traditional performance monitoring solutions. On the one hand, the rapid evolution of microservices places a significant burden on the utilization and maintenance of existing distributed tracing frameworks. On the other hand, complex infrastructure increases the probability of network performance problems and creates more blind spots on the network side. In this paper, we present DeepFlow, a network-centric distributed tracing framework for troubleshooting microservices. DeepFlow provides out-of-the-box tracing via a network-centric tracing plane and implicit context propagation. In addition, it eliminates blind spots in network infrastructure, captures network metrics in a low-cost way, and enhances correlation between different components and layers. We demonstrate analytically and empirically that DeepFlow is capable of locating microservice performance anomalies with negligible overhead. DeepFlow has already identified over 71 critical performance anomalies for more than 26 companies and has been utilized by hundreds of individual developers. Our production evaluations demonstrate that DeepFlow is able to save users hours of instrumentation efforts and reduce troubleshooting time from several hours to just a few minutes.
The extended Berkeley Packet Filter (eBPF) is an infrastructure that allows to dynamically load and run micro-programs directly in the Linux kernel without recompiling it.
In this work, we study how to develop high-performance network measurements in eBPF. We take sketches as case-study, given their ability to support a wide-range of tasks while providing low-memory footprint and accuracy guarantees. We implemented NitroSketch, the state-of-the-art sketch for user-space networking and show that best practices in user-space networking cannot be directly applied to eBPF, because of its different performance characteristics. By applying our lesson learned we improve its performance by 40% compared to a naive implementation.
SPRIGHT: extracting the server from serverless computing! high-performance eBPF-based event-driven, shared-memory processing
Serverless computing promises an efficient, low-cost compute capability in cloud environments. However, existing solutions, epitomized by open-source platforms such as Knative, include heavyweight components that undermine this goal of serverless computing. Additionally, such serverless platforms lack dataplane optimizations to achieve efficient, high-performance function chains that facilitate the popular microservices development paradigm. Their use of unnecessarily complex and duplicate capabilities for building function chains severely degrades performance. ‘Cold-start’ latency is another deterrent.
We describe SPRIGHT, a lightweight, high-performance, responsive serverless framework. SPRIGHT exploits shared memory processing and dramatically improves the scalability of the dataplane by avoiding unnecessary protocol processing and serialization-deserialization overheads. SPRIGHT extensively leverages event-driven processing with the extended Berkeley Packet Filter (eBPF). We creatively use eBPF’s socket message mechanism to support shared memory processing, with overheads being strictly load-proportional. Compared to constantly-running, polling-based DPDK, SPRIGHT achieves the same dataplane performance with 10× less CPU usage under realistic workloads. Additionally, eBPF benefits SPRIGHT, by replacing heavyweight serverless components, allowing us to keep functions ‘warm’ with negligible penalty.
Our preliminary experimental results show that SPRIGHT achieves an order of magnitude improvement in throughput and latency compared to Knative, while substantially reducing CPU usage, and obviates the need for ‘cold-start’.
System call filtering is a widely used security mechanism for protecting a shared OS kernel against untrusted user applications. However, existing system call filtering techniques either are too expensive due to the context switch overhead imposed by userspace agents, or lack sufficient programmability to express advanced policies. Seccomp, Linux’s system call filtering module, is widely used by modern container technologies, mobile apps, and system management services. Despite the adoption of the classic BPF language (cBPF), security policies in Seccomp are mostly limited to static allow lists, primarily because cBPF does not support stateful policies. Consequently, many essential security features cannot be expressed precisely and/or require kernel modifications. In this paper, we present a programmable system call filtering mechanism, which enables more advanced security policies to be expressed by leveraging the extended BPF language (eBPF). More specifically, we create a new Seccomp eBPF program type, exposing, modifying or creating new eBPF helper functions to safely manage filter state, access kernel and user state, and utilize synchronization primitives. Importantly, our system integrates with existing kernel privilege and capability mechanisms, enabling unprivileged users to install advanced filters safely. Our evaluation shows that our eBPF-based filtering can enhance existing policies (e.g., reducing the attack surface of early execution phase by up to 55.4% for temporal specialization), mitigate real-world vulnerabilities, and accelerate filters.
Cross Container Attacks: The Bewildered eBPF on Clouds
The extended Berkeley Packet Filter (eBPF) provides powerful and flexible kernel interfaces to extend the kernel functions for user space programs via running bytecode directly in the kernel space. It has been widely used by cloud services to enhance container security, network management, and system observability. However, we discover that the offensive eBPF that have been extensively discussed in Linux hosts can bring new attack surfaces to containers. With eBPF tracing features, attackers can break the container’s isolation and attack the host, e.g., steal sensitive data, DoS, and even escape the container. In this paper, we study the eBPF-based cross container attacks and reveal their security impacts in real world services. With eBPF attacks, we successfully compromise five online Jupyter/Interactive Shell services and the Cloud Shell of Google Cloud Platform. Furthermore, we find that the Kubernetes services offered by three leading cloud vendors can be exploited to launch cross-node attacks after the attackers escape the container via eBPF. Specifically, in Alibaba’s Kubernetes services, attackers can compromise the whole cluster by abusing their over-privileged cloud metrics or management Pods. Unfortunately, the eBPF attacks on containers are seldom known and can hardly be discovered by existing intrusion detection systems. Also, the existing eBPF permission model cannot confine the eBPF and ensure secure usage in shared-kernel container environments. To this end, we propose a new eBPF permission model to counter the eBPF attacks in containers.
This paper examines the security of eBPF and WebAssembly (Wasm), two technologies that have gained widespread adoption in recent years, despite being designed for very different use cases and environments. While eBPF is a technology primarily used within operating system kernels such as Linux, Wasm is a binary instruction format designed for a stack-based virtual machine with use cases extending beyond the web. Recognizing the growth and expanding ambitions of eBPF, Wasm may provide instructive insights, given its design around securely executing arbitrary untrusted programs in complex and hostile environments such as web browsers and clouds. We analyze the security goals, community evolution, memory models, and execution models of both technologies, and conduct a comparative security assessment, exploring memory safety, control flow integrity, API access, and side-channels. Our results show that eBPF has a history of focusing on performance first and security second, while Wasm puts more emphasis on security at the cost of some runtime overheads. Considering language-based restrictions for eBPF and a security model for API access are fruitful directions for future work.
eBPF is a new technology which allows dynamically loading pieces of code into the Linux kernel. It can greatly speed up networking since it enables the kernel to process certain packets without the involvement of a userspace program. So far eBPF has been used for simple packet filtering applications such as firewalls or Denial of Service protection. We show that it is possible to develop a flow based network intrusion detection system based on machine learning entirely in eBPF. Our solution uses a decision tree and decides for each packet whether it is malicious or not, considering the entire previous context of the network flow. We achieve a performance increase of over 20% compared to the same solution implemented as a userspace program.
Femto-containers: lightweight virtualization and fault isolation for small software functions on low-power IoT microcontrollers
Low-power operating system runtimes used on IoT microcontrollers typically provide rudimentary APIs, basic connectivity and, sometimes, a (secure) firmware update mechanism. In contrast, on less constrained hardware, networked software has entered the age of serverless, microservices and agility. With a view to bridge this gap, in the paper we design Femto-Containers, a new middleware runtime which can be embedded on heterogeneous low-power IoT devices. Femto-Containers enable the secure deployment, execution and isolation of small virtual software functions on low-power IoT devices, over the network. We implement Femto-Containers, and provide integration in RIOT, a popular open source IoT operating system. We then evaluate the performance of our implementation, which was formally verified for fault-isolation, guaranteeing that RIOT is shielded from logic loaded and executed in a Femto-Container. Our experiments on various popular micro-controller architectures (Arm Cortex-M, ESP32 and RISC-V) show that Femto-Containers offer an attractive trade-off in terms of memory footprint overhead, energy consumption, and security.
In today’s technology landscape, with the rise of microservices, cloud-native applications, and complex distributed systems, observability has become a crucial factor in ensuring the health, performance, and security of systems. In a microservices architecture in particular, application components may be distributed across multiple containers and servers, so traditional monitoring methods often lack the depth and breadth needed to fully understand the behavior of the system. This is where observing seven-layer protocols, that is, Layer 7 (application-layer) protocols such as HTTP, gRPC, and MQTT, becomes particularly important.
Seven-layer protocols provide detailed insights into how applications interact with other services and components. In a microservices environment, understanding these interactions is vital, as they often serve as the root causes of performance bottlenecks, failures, and security issues. However, monitoring these protocols is not a straightforward task. Traditional network monitoring tools like tcpdump, while effective at capturing network traffic, often fall short when dealing with the complexity and dynamism of seven-layer protocols.
This is where eBPF (extended Berkeley Packet Filter) technology comes into play. eBPF allows developers and operators to delve deep into the kernel layer, observing and analyzing system behavior in real-time without the need to modify or insert instrumentation into application code. This presents a unique opportunity to handle application layer traffic more simply and efficiently, particularly in microservices environments.
In this tutorial, we will delve into the following:
Tracking seven-layer protocols such as HTTP and the challenges associated with them.
eBPF’s socket filter and syscall tracing: How these two technologies assist in tracing HTTP network request data at different kernel layers, and the advantages and limitations of each.
eBPF practical tutorial: How to develop an eBPF program and utilize eBPF socket filter or syscall tracing to capture and analyze HTTP traffic.
As network traffic increases and applications grow in complexity, gaining a deeper understanding of seven-layer protocols becomes increasingly important. Through this tutorial, you will acquire the necessary knowledge and tools to more effectively monitor and analyze your network traffic, ultimately enhancing the performance of your applications and servers.
This article is part of the eBPF Developer Tutorial, and for more detailed content, you can visit here. The source code is available on the GitHub repository.
Challenges in Tracking HTTP, HTTP/2, and Other Seven-Layer Protocols
In the modern networking environment, seven-layer protocols extend beyond just HTTP. In fact, there are many seven-layer protocols such as HTTP/2, gRPC, MQTT, WebSocket, AMQP, and SMTP, each serving critical roles in various application scenarios. These protocols provide detailed insights into how applications interact with other services and components. However, tracking these protocols is not a simple task, especially within complex distributed systems.
Diversity and Complexity: Each seven-layer protocol has its specific design and workings. For example, gRPC utilizes HTTP/2 as its transport protocol and supports multiple languages, while MQTT is a lightweight publish/subscribe messaging transport protocol designed for low-bandwidth and unreliable networks.
Dynamism: Many seven-layer protocols are dynamic, meaning their behavior can change based on network conditions, application requirements, or other factors.
Encryption and Security: With increased security awareness, many seven-layer protocols employ encryption technologies such as TLS/SSL. This introduces additional challenges for tracking and analysis, as decrypting traffic is required for in-depth examination.
High-Performance Requirements: In high-traffic production environments, capturing and analyzing traffic for seven-layer protocols can impact system performance. Traditional network monitoring tools may struggle to handle a large number of concurrent sessions.
Data Completeness and Continuity: Unlike tools like tcpdump, which capture individual packets, tracking seven-layer protocols requires capturing complete sessions, which may involve multiple packets. This necessitates tools capable of correctly reassembling and parsing these packets to provide a continuous session view.
Code Intrusiveness: To gain deeper insights into the behavior of seven-layer protocols, developers may need to modify application code to add monitoring functionalities. This not only increases development and maintenance complexity but can also impact application performance.
As mentioned earlier, eBPF provides a powerful solution, allowing us to capture and analyze seven-layer protocol traffic in the kernel layer without modifying application code. This approach not only offers insights into system behavior but also ensures optimal performance and efficiency. This is why eBPF has become the preferred technology for modern observability tools, especially in production environments that demand high performance and low latency.
eBPF Socket Filter vs. Syscall Tracing: In-Depth Analysis and Comparison
eBPF Socket Filter
What Is It? eBPF socket filter is an extension of the classic Berkeley Packet Filter (BPF) that allows for more advanced packet filtering directly within the kernel. It operates at the socket layer, enabling fine-grained control over which packets are processed by user-space applications.
Key Features:
Performance: By handling packets directly within the kernel, eBPF socket filters reduce the overhead of context switches between user and kernel spaces.
Flexibility: eBPF socket filters can be attached to any socket, providing a universal packet filtering mechanism for various protocols and socket types.
Programmability: Developers can write custom eBPF programs to define complex filtering logic beyond simple packet matching.
Use Cases:
Traffic Control: Restrict or prioritize traffic based on custom conditions.
Security: Discard malicious packets before they reach user-space applications.
Monitoring: Capture specific packets for analysis without affecting other traffic.
eBPF Syscall Tracing
What Is It? System call tracing using eBPF allows monitoring and manipulation of system calls made by applications. System calls are the primary mechanism through which user-space applications interact with the kernel, making tracing them a valuable way to understand application behavior.
Key Features:
Granularity: eBPF allows tracing specific system calls, even specific parameters within those system calls.
Low Overhead: Compared to other tracing methods, eBPF syscall tracing is designed to have minimal performance impact.
Security: The kernel verifier validates eBPF programs before they run to ensure they do not compromise system stability.
How It Works: eBPF syscall tracing typically involves attaching eBPF programs to tracepoints or kprobes related to the system calls being traced. When the traced system call is invoked, the eBPF program is executed, allowing data collection or even modification of system call parameters.
Comparison of eBPF Socket Filter and Syscall Tracing
| Aspect | eBPF Socket Filter | eBPF Syscall Tracing |
| --- | --- | --- |
| Operational Layer | Socket layer, primarily dealing with network packets received from or sent to sockets. | System call layer, monitoring and potentially altering the behavior of system calls made by applications. |
| Primary Use Cases | Mainly used for filtering, monitoring, and manipulation of network packets. | Used for performance analysis, security monitoring, and debugging of interactions with the network. |
| Granularity | Focuses on individual network packets. | Can monitor a wide range of system activities, including those unrelated to networking. |
| Tracking HTTP Traffic | Can be used to filter and capture HTTP packets passed through sockets. | Can trace system calls associated with networking operations, which may include HTTP traffic. |
In summary, both eBPF socket filters and syscall tracing can be used to trace HTTP traffic, but socket filters are more direct and suitable for this purpose. However, if you are interested in the broader context of how an application interacts with the system (e.g., which system calls lead to HTTP traffic), syscall tracing can be highly valuable. In many advanced observability setups, both tools may be used simultaneously to provide a comprehensive view of system and network behavior.
Capturing HTTP Traffic with eBPF Socket Filter
eBPF code consists of user-space and kernel-space components, and here we primarily focus on the kernel-space code. Below is the main logic for capturing HTTP traffic in the kernel using eBPF socket filter technology, and the complete code is provided:
This is the entry point of the eBPF program, defining a function named socket_handler that the kernel uses to handle incoming network packets. This function is located in an eBPF section named socket, indicating that it is intended for socket handling.
In this code block, several variables are defined to store information needed during packet processing. These variables include struct so_event *e for storing event information, verlen, proto, nhoff, ip_proto, tcp_hdr_len, tlen, payload_offset, payload_length, and hdr_len for storing packet information.
struct so_event *e;: This is a pointer to the so_event structure for storing captured event information. The specific definition of this structure is located elsewhere in the program.
__u8 verlen;, __u16 proto;, __u32 nhoff = ETH_HLEN;: These variables are used to store various pieces of information, such as protocol types, packet offsets, etc. nhoff is initialized to the length of the Ethernet frame header, typically 14 bytes, as Ethernet frame headers include destination MAC address, source MAC address, and frame type fields.
__u32 ip_proto = 0;: This variable is used to store the type of the IP protocol and is initialized to 0.
__u32 tcp_hdr_len = 0;: This variable is used to store the length of the TCP header and is initialized to 0.
__u16 tlen;: This variable is used to store the total length of the IP packet.
__u32 payload_offset = 0;, __u32 payload_length = 0;: These two variables are used to store the offset and length of the HTTP request payload.
__u8 hdr_len;: This variable is used to store the length of the IP header.
bpf_skb_load_bytes(skb, 12, &proto, 2);
proto = __bpf_ntohs(proto);
if (proto != ETH_P_IP)
    return 0;
Here, the code loads the Ethernet frame type field from the packet, which tells us the network layer protocol being used in the packet. It then uses the __bpf_ntohs function to convert the network byte order type field into host byte order. Next, the code checks if the type field is not equal to the Ethernet frame type for IPv4 (0x0800). If it’s not equal, it means the packet is not an IPv4 packet, and the function returns 0, indicating that the packet should not be processed.
Key concepts to understand here:
Ethernet Frame: The Ethernet frame is a data link layer (Layer 2) protocol used for transmitting data frames within a local area network (LAN). Ethernet frames typically include destination MAC address, source MAC address, and frame type fields.
Network Byte Order: Network protocols often use big-endian byte order to represent data. Therefore, data received from the network needs to be converted into host byte order for proper interpretation on the host. Here, the type field from the network is converted to host byte order for further processing.
IPv4 Frame Type (ETH_P_IP): This represents the frame type field in the Ethernet frame, where 0x0800 indicates IPv4.
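To make the byte-order step concrete, here is a minimal user-space sketch of the same check. It is not the eBPF program itself: it works on a plain byte buffer instead of an skb, and the names read_be16, frame_is_ipv4, and the three macros are hypothetical helpers introduced only for illustration. The value 0x0800 and the EtherType position at byte 12 come from the explanation above.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_LEN   14      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETHERTYPE_OFFSET 12      /* EtherType follows the two MAC addresses */
#define ETHERTYPE_IPV4   0x0800  /* same value as ETH_P_IP */

/* Read a 16-bit big-endian (network order) field from a byte buffer.
 * Assembling the value byte by byte yields the host-order number on any
 * CPU, which is the result __bpf_ntohs produces after a raw 16-bit load. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Return 1 if the frame carries IPv4, 0 otherwise. Frames too short to
 * hold a full Ethernet header are rejected as well. */
int frame_is_ipv4(const uint8_t *frame, size_t len)
{
    if (len < ETH_HEADER_LEN)
        return 0;
    return read_be16(frame + ETHERTYPE_OFFSET) == ETHERTYPE_IPV4;
}
```

A frame whose bytes 12 and 13 are 0x08 0x00 passes the check, while an IPv6 frame (0x86 0xdd) does not.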
if (ip_is_fragment(skb, nhoff))
    return 0;
This part of the code checks whether the packet is an IP fragment. IP fragmentation is a mechanism for splitting larger IP packets into multiple smaller fragments for transmission. If the packet is a fragment, the function returns 0, so only unfragmented packets are processed.
The above code is a helper function used to check if the incoming IPv4 packet is an IP fragment. IP fragmentation is a mechanism where, if the size of an IP packet exceeds the Maximum Transmission Unit (MTU) of the network, routers split it into smaller fragments for transmission across the network. The purpose of this function is to examine the fragment flags and fragment offset fields within the packet to determine if it is a fragment.
Here’s an explanation of the code line by line:
__u16 frag_off;: Defines a 16-bit unsigned integer variable frag_off to store the fragment offset field.
bpf_skb_load_bytes(skb, nhoff + offsetof(struct iphdr, frag_off), &frag_off, 2);: This line of code uses the bpf_skb_load_bytes function to load the fragment offset field from the packet. nhoff is the offset of the IP header within the packet, and offsetof(struct iphdr, frag_off) calculates the offset of the fragment offset field within the IPv4 header.
frag_off = __bpf_ntohs(frag_off);: Converts the loaded fragment offset field from network byte order (big-endian) to host byte order. Network protocols typically use big-endian to represent data, and the conversion to host byte order is done for further processing.
return frag_off & (IP_MF | IP_OFFSET);: This line of code checks the value of the fragment offset field using a bitwise AND operation with two flag values:
IP_MF: Represents the “More Fragments” flag. If this flag is set to 1, it indicates that the packet is part of a fragmented sequence and more fragments are expected.
IP_OFFSET: A mask covering the 13-bit fragment offset field. If the fragment offset is non-zero, the packet is a later fragment of a fragmented sequence.
If the “More Fragments” flag is set or the fragment offset is non-zero, the result of the AND operation is non-zero, indicating that the packet is an IP fragment. If both are 0, the packet is not fragmented.
It’s important to note that the fragment offset field in the IP header is specified in units of 8 bytes, so the actual byte offset is obtained by left-shifting the value by 3 bits. Additionally, the “More Fragments” flag (IP_MF) in the IP header indicates whether there are more fragments in the sequence and is typically used in conjunction with the fragment offset field to indicate the status of fragmented packets.
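The fragment test and the 8-byte scaling of the offset can be sketched in ordinary C as follows. This is a user-space model under stated assumptions, not the helper from the program: it takes a pointer to the start of the IPv4 header rather than an skb, and the function and macro names are hypothetical. The constants 0x2000 and 0x1FFF are the values Linux uses for IP_MF and IP_OFFSET.

```c
#include <stdint.h>

#define IP_MF_FLAG      0x2000  /* "More Fragments" bit, same value as IP_MF */
#define IP_OFFSET_MASK  0x1FFF  /* low 13 bits, same value as IP_OFFSET */
#define IP_FRAG_OFF_POS 6       /* byte offset of frag_off in the IPv4 header */

/* Load the 16-bit flags/fragment-offset word in host order.
 * ip_hdr points at the first byte of the IPv4 header. */
uint16_t load_frag_off(const uint8_t *ip_hdr)
{
    return (uint16_t)((ip_hdr[IP_FRAG_OFF_POS] << 8) |
                      ip_hdr[IP_FRAG_OFF_POS + 1]);
}

/* Non-zero when the packet is any part of a fragmented datagram:
 * either more fragments follow, or this one starts at a non-zero offset. */
int ipv4_is_fragment(const uint8_t *ip_hdr)
{
    return (load_frag_off(ip_hdr) & (IP_MF_FLAG | IP_OFFSET_MASK)) != 0;
}

/* Byte position of this fragment's data in the original datagram.
 * The field counts 8-byte units, hence the left shift by 3. */
unsigned int ipv4_fragment_byte_offset(const uint8_t *ip_hdr)
{
    return (unsigned int)(load_frag_off(ip_hdr) & IP_OFFSET_MASK) << 3;
}
```

Note that a packet with only the “Don't Fragment” bit set (0x4000) is not reported as a fragment, because that bit lies outside both masks.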
In this part of the code, the length of the IP header is loaded from the packet. The IP header length field contains information about the length of the IP header in units of 4 bytes, and it needs to be converted to bytes. Here, it is converted by performing a bitwise AND operation with 0x0f and then multiplying it by 4.
Key concept:
IP Header: The IP header contains fundamental information about a packet, such as the source IP address, destination IP address, protocol type, total length, identification, flags, fragment offset, time to live (TTL), and header checksum. Source and destination ports are not part of the IP header; they belong to the transport-layer (TCP or UDP) header.
if (hdr_len < sizeof(struct iphdr)) {
    return 0;
}
This code segment checks if the length of the IP header meets the minimum length requirement, typically 20 bytes. If the length of the IP header is less than 20 bytes, it indicates an incomplete or corrupted packet, and the function returns 0, indicating that the packet should not be processed.
Key concept:
struct iphdr: This is a structure defined in the Linux kernel, representing the format of an IPv4 header. It includes fields such as version, header length, service type, total length, identification, flags, fragment offset, time to live, protocol, header checksum, source IP address, and destination IP address, among others.
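The header-length decoding and the minimum-length check described above can be modelled together in a few lines of plain C. This is an illustrative sketch rather than the program's code: ipv4_header_len and ipv4_header_len_ok are hypothetical names, and the literal 20 stands in for sizeof(struct iphdr), the size of an IPv4 header without options.

```c
#include <stdint.h>

#define IPV4_MIN_HEADER_LEN 20  /* assumed equal to sizeof(struct iphdr) */

/* version_ihl is the first byte of the IPv4 header: the version sits in
 * the high four bits and the header length (IHL) in the low four bits.
 * The IHL counts 32-bit words, so multiply by 4 to get bytes. */
unsigned int ipv4_header_len(uint8_t version_ihl)
{
    return (unsigned int)(version_ihl & 0x0f) * 4;
}

/* Return 1 when the decoded length is at least a full basic header. */
int ipv4_header_len_ok(unsigned int hdr_len)
{
    return hdr_len >= IPV4_MIN_HEADER_LEN;
}
```

For the common first byte 0x45 (version 4, IHL 5) the decoded length is 20 bytes, which passes the check; a corrupted value such as 0x43 decodes to 12 bytes and is rejected.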
Here, the code loads the protocol field from the IP header to determine the transport layer protocol used in the packet. Then, it checks if the protocol field is not equal to the value for TCP (IPPROTO_TCP). If it’s not TCP, it means the packet is not an HTTP request or response, and the function returns 0.
Key concept:
Transport Layer Protocol: The protocol field in the IP header indicates the transport layer protocol used in the packet, such as TCP, UDP, or ICMP.
tcp_hdr_len = nhoff + hdr_len;
This line of code calculates the offset of the TCP header. It adds the length of the Ethernet frame header (nhoff) to the length of the IP header (hdr_len) to obtain the starting position of the TCP header.
bpf_skb_load_bytes(skb, nhoff + 0, &verlen, 1);
This line of code loads the first byte of the IP header from the packet (offset nhoff + 0). That byte packs the IP version into its high four bits and the IP header length into its low four bits. The length field is specified in units of 4 bytes and requires further conversion.
This line of code loads the total length field of the IP header from the packet. The IP header’s total length field represents the overall length of the IP packet, including both the IP header and the data portion.
This piece of code calculates the length of the TCP header. It loads the byte containing the Data Offset field (also known as the Header Length field) from the TCP header. The field occupies the high four bits of that byte and gives the length of the TCP header in units of 4 bytes. The code masks off the low four bits, shifts the result right by 4 bits, and finally multiplies it by 4 to obtain the actual length of the TCP header in bytes.
Key points to understand:
TCP Header: The TCP header contains information related to the TCP protocol, such as source port, destination port, sequence number, acknowledgment number, flags (e.g., SYN, ACK, FIN), window size, and checksum.
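The Data Offset arithmetic can be checked with a small user-space sketch. It assumes a pointer to the first byte of the TCP header in a plain buffer; tcp_header_len and TCP_DATA_OFFSET_POS are hypothetical names used only here. Byte 12 is where the TCP specification places the Data Offset nibble, directly after the 4-byte acknowledgment number.

```c
#include <stdint.h>

#define TCP_DATA_OFFSET_POS 12  /* byte holding Data Offset (high nibble) */

/* Return the TCP header length in bytes.
 * tcp_hdr points at the first byte of the TCP header. */
unsigned int tcp_header_len(const uint8_t *tcp_hdr)
{
    uint8_t doff = tcp_hdr[TCP_DATA_OFFSET_POS];

    doff &= 0xf0;  /* keep the Data Offset, drop the low (reserved) bits */
    doff >>= 4;    /* header length in 32-bit words */
    return (unsigned int)doff * 4;  /* header length in bytes */
}
```

A header without options has the value 0x50 in that byte and decodes to 20 bytes; the largest possible value, 0xF0, decodes to 60 bytes.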
These two lines of code calculate the offset and length of the HTTP request payload. They add the lengths of the Ethernet frame header, IP header, and TCP header together to obtain the offset to the data portion of the HTTP request. Then, by subtracting the IP header length and the TCP header length from the IP total length field, they calculate the length of the HTTP request data.
Key point:
HTTP Request Payload: The actual data portion included in an HTTP request, typically consisting of the HTTP request headers and request body.
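As a worked example of this arithmetic, the sketch below computes the payload offset and length from the three header lengths. It is a user-space model with hypothetical names (struct payload_span, http_payload_span), and it assumes the IP total length has already been converted to host byte order. The guard against header lengths larger than the total length is an addition for safety in this sketch and is not described in the text above.

```c
#include <stdint.h>

#define ETH_HEADER_LEN 14  /* Ethernet header size, same value as ETH_HLEN */

struct payload_span {
    uint32_t offset;  /* from the start of the Ethernet frame */
    uint32_t length;  /* bytes of TCP payload */
};

/* Fill *out and return 0 on success. Return -1 when the header lengths
 * exceed the IP total length, which would make the subtraction wrap. */
int http_payload_span(uint16_t ip_total_len, uint32_t ip_hdr_len,
                      uint32_t tcp_hdr_len, struct payload_span *out)
{
    if (ip_hdr_len + tcp_hdr_len > ip_total_len)
        return -1;

    out->offset = ETH_HEADER_LEN + ip_hdr_len + tcp_hdr_len;
    out->length = ip_total_len - ip_hdr_len - tcp_hdr_len;
    return 0;
}
```

With a 20-byte IP header, a 20-byte TCP header, and an IP total length of 60, the payload starts at byte 54 of the frame and is 20 bytes long.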
This portion of the code loads the first 7 bytes of the HTTP request line and stores them in a character array named line_buffer. It then checks if the length of the HTTP request data is less than 7 bytes or if the offset is negative. If these conditions are met, it indicates an incomplete HTTP request, and the function returns 0. Finally, it uses the bpf_printk function to print the content of the HTTP request line to the kernel log for debugging and analysis.
This piece of code uses the bpf_strncmp function to compare the data in line_buffer with HTTP request methods (GET, POST, PUT, DELETE, HTTP). If there is no match, indicating that it is not an HTTP request, it returns 0, indicating that it should not be processed.
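The prefix test can be illustrated outside the kernel with standard C. This sketch uses memcmp as a stand-in for bpf_strncmp, so it is not the program's code; looks_like_http is a hypothetical name, and the 7-byte minimum mirrors the line_buffer size mentioned above.

```c
#include <stddef.h>
#include <string.h>

#define HTTP_PROBE_LEN 7  /* number of payload bytes inspected */

/* Return 1 if the first bytes of the payload start with an HTTP request
 * method or with "HTTP" (the start of a response status line). */
int looks_like_http(const char *line, size_t len)
{
    static const char *const prefixes[] = {
        "GET", "POST", "PUT", "DELETE", "HTTP"
    };
    size_t i;

    if (len < HTTP_PROBE_LEN)
        return 0;  /* too short to inspect safely */

    for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        /* The longest prefix is 6 bytes, so this never reads past len. */
        if (memcmp(line, prefixes[i], strlen(prefixes[i])) == 0)
            return 1;
    }
    return 0;
}
```

Matching on a short prefix is cheap but approximate: any TCP payload that happens to begin with one of these strings is treated as HTTP.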
e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0);
if (!e)
    return 0;
This section of the code attempts to reserve a block of memory from the BPF ring buffer to store event information. If it cannot reserve the memory block, it returns 0. The BPF ring buffer is used to pass event data between the eBPF program and user space.
Key point:
BPF Ring Buffer: The BPF ring buffer is a mechanism for passing data between eBPF programs and user space. It can be used to store event information for further processing or analysis by user space applications.
Finally, this code segment stores the captured event information in the e structure and submits it to the BPF ring buffer, which passes it to user space for additional handling or logging. The event includes the captured IP protocol, source and destination ports, packet type, interface index, payload length, source IP address, and destination IP address. The function then returns the length of the packet. For a socket filter the return value is the number of bytes of the packet to keep, so returning the full length indicates that the whole packet was accepted.
In summary, this eBPF program’s main task is to capture HTTP requests. It accomplishes this by parsing the Ethernet frame, IP header, and TCP header of incoming packets to determine if they contain HTTP requests. Information about the requests is then stored in the so_event structure and submitted to the BPF ring buffer. This is an efficient method for capturing HTTP traffic at the kernel level and is suitable for applications such as network monitoring and security analysis.
Potential Limitations
The above code has some potential limitations, and one of the main limitations is that it cannot handle URLs that span multiple packets.
Cross-Packet URLs: The code checks the URL in an HTTP request by parsing a single data packet. If the URL of an HTTP request spans multiple packets, it will only examine the URL in the first packet. This can lead to missing or partially capturing long URLs that span multiple data packets.
To address this issue, a solution usually has to reassemble multiple packets to reconstruct the complete HTTP request. This may require implementing packet caching and assembly logic within the eBPF program, buffering the relevant packets until the complete HTTP request has been collected. This adds complexity and may require additional memory to handle cases where URLs span multiple packets.
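As a rough illustration of the reassembly idea, the sketch below accumulates payload fragments in user-space C until the request line is complete. The names reasm_buf, reasm_append, and REASM_CAP are hypothetical. A real eBPF version would more likely keep this state in a map keyed by the connection tuple, and it would also need to handle out-of-order segments, which this sketch ignores.

```c
#include <stddef.h>
#include <string.h>

#define REASM_CAP 256 /* hypothetical per-connection buffer size */

struct reasm_buf {
    char data[REASM_CAP];
    size_t len;
};

/* Append one payload fragment (assumed to arrive in order).
 * Returns 1 once the buffer holds a CRLF-terminated request line,
 * 0 if more data is needed, and -1 if the fragment does not fit. */
int reasm_append(struct reasm_buf *b, const char *frag, size_t n)
{
    if (n > REASM_CAP - b->len)
        return -1;
    memcpy(b->data + b->len, frag, n);
    b->len += n;

    /* Scan for the "\r\n" that ends the HTTP request line. */
    for (size_t i = 0; i + 1 < b->len; i++) {
        if (b->data[i] == '\r' && b->data[i + 1] == '\n')
            return 1;
    }
    return 0;
}
```

The fixed capacity shows the memory cost mentioned above: every tracked connection needs its own buffer until its request line is complete.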
User-Space Code
The user-space code’s main purpose is to create a raw socket and then attach the previously defined eBPF program in the kernel to that socket, allowing the eBPF program to capture and process network packets received on that socket. Here’s an example of the user-space code:
/* Create raw socket for localhost interface */
sock = open_raw_sock(interface);
if (sock < 0) {
    err = -2;
    fprintf(stderr, "Failed to open raw socket\n");
    goto cleanup;
}

/* Attach BPF program to raw socket */
prog_fd = bpf_program__fd(skel->progs.socket_handler);
if (setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd))) {
    err = -3;
    fprintf(stderr, "Failed to attach to raw socket\n");
    goto cleanup;
}
sock = open_raw_sock(interface);: This line of code calls a custom function open_raw_sock, which is used to create a raw socket. Raw sockets allow a user-space application to handle network packets directly without going through the protocol stack. The interface parameter might specify the network interface from which to receive packets, determining where to capture packets from. If creating the socket fails, it returns a negative value, otherwise, it returns the file descriptor of the socket sock.
If the value of sock is less than 0, indicating a failure to open the raw socket, it sets err to -2 and prints an error message on the standard error stream.
prog_fd = bpf_program__fd(skel->progs.socket_handler);: This line of code retrieves the file descriptor of the socket filter program (socket_handler) previously defined in the eBPF program. It is necessary to attach this program to the socket. skel is a pointer to an eBPF program object, and it provides access to the program collection.
setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd)): This line of code uses the setsockopt system call to attach the eBPF program to the raw socket. It sets the SO_ATTACH_BPF option and passes the file descriptor of the eBPF program to the option, letting the kernel know which eBPF program to apply to this socket. If the attachment is successful, the socket starts capturing and processing network packets received on it.
If setsockopt fails, it sets err to -3 and prints an error message on the standard error stream.
Capturing HTTP Traffic Using eBPF Syscall Tracepoints
eBPF provides a powerful mechanism for tracing system calls at the kernel level. In this example, we’ll use eBPF to trace the accept and read system calls to capture HTTP traffic. Due to space limitations, we’ll provide a brief overview of the code framework.
// Define a tracepoint at the entry of the accept system call
SEC("tracepoint/syscalls/sys_enter_accept")
int sys_enter_accept(struct trace_event_raw_sys_enter *ctx)
{
    u64 id = bpf_get_current_pid_tgid();
    // ... Get and store the arguments of the accept call
    bpf_map_update_elem(&active_accept_args_map, &id, &accept_args, BPF_ANY);
    return 0;
}

// Define a tracepoint at the exit of the accept system call
SEC("tracepoint/syscalls/sys_exit_accept")
int sys_exit_accept(struct trace_event_raw_sys_exit *ctx)
{
    // ... Process the result of the accept call
    struct accept_args_t *args = bpf_map_lookup_elem(&active_accept_args_map, &id);
    // ... Get and store the socket file descriptor obtained from the accept call
    __u64 pid_fd = ((__u64)pid << 32) | (u32)ret_fd;
    bpf_map_update_elem(&conn_info_map, &pid_fd, &conn_info, BPF_ANY);
    // ...
}

// Define a tracepoint at the entry of the read system call
SEC("tracepoint/syscalls/sys_enter_read")
int sys_enter_read(struct trace_event_raw_sys_enter *ctx)
{
    // ... Get and store the arguments of the read call
    bpf_map_update_elem(&active_read_args_map, &id, &read_args, BPF_ANY);
    return 0;
}

// Helper function to check if it's an HTTP connection
static inline bool is_http_connection(const char *line_buffer, u64 bytes_count)
{
    // ... Check if the data is an HTTP request or response
}

// Helper function to process the read data
static inline void process_data(struct trace_event_raw_sys_exit *ctx, u64 id,
                                const struct data_args_t *args, u64 bytes_count)
{
    // ... Process the read data, check if it's HTTP traffic, and send events
    if (is_http_connection(line_buffer, bytes_count)) {
        // ...
        bpf_probe_read_kernel(&event.msg, read_size, args->buf);
        // ...
        bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &event,
                              sizeof(struct socket_data_event_t));
    }
}

// Define a tracepoint at the exit of the read system call
SEC("tracepoint/syscalls/sys_exit_read")
int sys_exit_read(struct trace_event_raw_sys_exit *ctx)
{
    // ... Process the result of the read call
    struct data_args_t *read_args = bpf_map_lookup_elem(&active_read_args_map, &id);
    if (read_args != NULL) {
        process_data(ctx, id, read_args, bytes_count);
    }
    // ...
    return 0;
}

char _license[] SEC("license") = "GPL";
This code briefly demonstrates how to use eBPF to trace system calls in the Linux kernel to capture HTTP traffic. Here’s a detailed explanation of the hook locations and the flow, as well as the complete set of system calls that need to be hooked for comprehensive request tracing:
Hook Locations and Flow
The code uses eBPF Tracepoint functionality. Specifically, it defines a series of eBPF programs and binds them to specific system call Tracepoints to capture entry and exit events of these system calls.
First, it defines two eBPF hash maps (active_accept_args_map and active_read_args_map) to store system call parameters. These maps are used to track accept and read system calls.
Next, it defines multiple Tracepoint tracing programs, including:
sys_enter_accept: Defined at the entry of the accept system call, used to capture the arguments of the accept system call and store them in the hash map.
sys_exit_accept: Defined at the exit of the accept system call, used to process the result of the accept system call, including obtaining and storing the new socket file descriptor and related connection information.
sys_enter_read: Defined at the entry of the read system call, used to capture the arguments of the read system call and store them in the hash map.
sys_exit_read: Defined at the exit of the read system call, used to process the result of the read system call, including checking if the read data is HTTP traffic and sending events.
In sys_exit_accept and sys_exit_read, there is also some data processing and event sending logic, such as checking if the data is an HTTP connection, assembling event data, and using bpf_perf_event_output to send events to user space for further processing.
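The conn_info_map key built in sys_exit_accept packs the process ID and the new file descriptor into one 64-bit value. The sketch below isolates that packing in plain C so it can be tried in user space. The helper names make_conn_key, conn_key_pid, and conn_key_fd are hypothetical and do not appear in the tutorial code.

```c
#include <stdint.h>

/* Pack pid into the upper 32 bits and fd into the lower 32 bits,
 * matching ((__u64)pid << 32) | (u32)ret_fd from the tracepoint code. */
uint64_t make_conn_key(uint32_t pid, int32_t fd)
{
    /* Casting fd to uint32_t first avoids sign extension into the pid half. */
    return ((uint64_t)pid << 32) | (uint32_t)fd;
}

/* Recover the pid half of a key. */
uint32_t conn_key_pid(uint64_t key)
{
    return (uint32_t)(key >> 32);
}

/* Recover the fd half of a key. */
int32_t conn_key_fd(uint64_t key)
{
    return (int32_t)(uint32_t)key;
}
```

Because the read hooks can rebuild the same key from the current PID and the fd argument, they can look up the connection recorded at accept time.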
Complete Set of System Calls to Hook
To fully implement HTTP request tracing, the system calls that typically need to be hooked include:
socket: Used to capture socket creation for tracking new connections.
bind: Used to obtain port information where the socket is bound.
listen: Used to start listening for connection requests.
accept: Used to accept connection requests and obtain new socket file descriptors.
read: Used to capture received data and check if it contains HTTP requests.
write: Used to capture sent data and check if it contains HTTP responses.
The provided code already covers the tracing of accept and read system calls. To complete HTTP request tracing, additional system calls need to be hooked, and corresponding logic needs to be implemented to handle the parameters and results of these system calls.
In today’s complex technological landscape, system observability has become crucial, especially in the context of microservices and cloud-native applications. This article explores how to leverage eBPF technology for tracing Layer 7 (application-layer) protocols, along with the challenges and solutions that may arise in this process. Here’s a summary of the content covered in this article:
Introduction:
Modern applications often consist of multiple microservices and distributed components, making it essential to observe the behavior of the entire system.
Layer 7 protocols (such as HTTP, gRPC, and MQTT) provide detailed insights into application interactions, but monitoring these protocols can be challenging.
Role of eBPF Technology:
eBPF allows developers to dive deep into the kernel layer for real-time observation and analysis of system behavior without modifying or inserting application code.
eBPF technology offers a powerful tool for monitoring Layer 7 protocols, especially in a microservices environment.
Tracing Layer 7 Protocols:
The article discusses the challenges of tracing Layer 7 protocols, including their complexity and dynamism.
Traditional network monitoring tools struggle with the complexity of Layer 7 protocols.
Applications of eBPF:
eBPF provides two primary methods for tracing Layer 7 protocols: socket filters and syscall tracing.
Both of these methods help capture and analyze network request data for protocols like HTTP.
eBPF Practical Tutorial:
The article provides a practical eBPF tutorial demonstrating how to capture and analyze HTTP traffic using eBPF socket filters or syscall tracing.
The tutorial covers the development of eBPF programs, the use of the eBPF toolchain, and the implementation of HTTP request tracing.
Through this article, readers can gain a deep understanding of how to use eBPF technology for tracing Layer 7 protocols, particularly HTTP traffic. This knowledge will help enhance the monitoring and analysis of network traffic, thereby improving application performance and security. If you’re interested in learning more about eBPF and its practical applications, you can visit our tutorial code repository at https://github.com/eunomia-bpf/bpf-developer-tutorial or our website at https://eunomia.dev/tutorials/ for more examples and complete tutorials.
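SYSTEM
You will be given a document delimited by triple quotes and a question. Your task is to answer the question using only the provided document and to cite the passage(s) of the document used to answer the question. If the document does not contain the information needed to answer the question, simply write: "Insufficient information." If an answer to the question is provided, it must be annotated with citations. Use the following format to cite relevant passages ({"citation": …}).
USER
"""<insert document>"""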
import guidance

# we use LLaMA here, but any GPT-style model will do
llama = guidance.llms.Transformers("your_path/llama-7b", device=0)

# we can pre-define valid option sets
valid_weapons = ["sword", "axe", "mace", "spear", "bow", "crossbow"]

# define the prompt
character_maker = guidance("""The following is a character profile for an RPG game in JSON format.
```json
{
    "id": "{{id}}",
    "description": "{{description}}",
    "name": "{{gen 'name'}}",
    "age": {{gen 'age' pattern='[0-9]+' stop=','}},
    "armor": "{{#select 'armor'}}leather{{or}}chainmail{{or}}plate{{/select}}",
    "weapon": "{{select 'weapon' options=valid_weapons}}",
    "class": "{{gen 'class'}}",
    "mantra": "{{gen 'mantra' temperature=0.7}}",
    "strength": {{gen 'strength' pattern='[0-9]+' stop=','}},
    "items": [{{#geneach 'items' num_iterations=5 join=', '}}"{{gen 'this' temperature=0.7}}"{{/geneach}}]
}```""")

# generate a character
character_maker(
    id="e1f491f7-7ab8-4dac-8c20-c92b5e7d883d",
    description="A quick and nimble fighter.",
    valid_weapons=valid_weapons,
    llm=llama
)
# set the default language model used to execute guidance programs
guidance.llm = guidance.llms.OpenAI("text-davinci-003")

# define the few shot examples
examples = [
    {'input': 'I wrote about shakespeare',
     'entities': [{'entity': 'I', 'time': 'present'}, {'entity': 'Shakespeare', 'time': '16th century'}],
     'reasoning': 'I can write about Shakespeare because he lived in the past with respect to me.',
     'answer': 'No'},
    {'input': 'Shakespeare wrote about me',
     'entities': [{'entity': 'Shakespeare', 'time': '16th century'}, {'entity': 'I', 'time': 'present'}],
     'reasoning': 'Shakespeare cannot have written about me, because he died before I was born',
     'answer': 'Yes'}
]

# define the guidance program
structure_program = guidance(
'''Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or not based on the time periods associated with the entities).
----

{{~! place the real question at the end }}
Sentence: {{input}}
Entities and dates:
{{gen "entities"}}
Reasoning:{{gen "reasoning"}}
Anachronism:{{#select "answer"}} Yes{{or}} No{{/select}}''')

# execute the program
out = structure_program(
    examples=examples,
    input='The T-rex bit my dog'
)
Below is an example of how you can modify your code to perform an inline hook for my_function. This is a simplistic, purely illustrative approach that works specifically for this case. Real-world scenarios need a more complex method that considers thread-safety, re-entrant code, and more.
// Requires <stdint.h>, <string.h>, <sys/mman.h>, and <unistd.h>.
void inline_hook(void *orig_func, void *hook_func) {
    // Store the original bytes of the function.
    unsigned char orig_bytes[5];
    memcpy(orig_bytes, orig_func, 5);

    // Make the memory page writable.
    mprotect(get_page_addr(orig_func), getpagesize(), PROT_READ | PROT_WRITE | PROT_EXEC);

    // Write a jump instruction at the start of the original function.
    *((unsigned char *)orig_func + 0) = 0xE9; // JMP rel32 opcode
    // The operand is a 32-bit offset relative to the end of the 5-byte JMP.
    // Writing it through a void ** would store 8 bytes on a 64-bit target.
    int32_t rel = (int32_t)((unsigned char *)hook_func - (unsigned char *)orig_func - 5);
    memcpy((unsigned char *)orig_func + 1, &rel, sizeof(rel));

    // Restore the memory page to read and execute only.
    mprotect(get_page_addr(orig_func), getpagesize(), PROT_READ | PROT_EXEC);
}
In this example, my_function is the original function that is hooked. my_hook_function is the function that gets called instead of my_function. The inline_hook function performs the actual hook by overwriting the start of my_function with a jump (JMP) instruction to my_hook_function.
When you now call my_function in your main, my_hook_function is called instead.
Please note that this code is simplified and makes a few assumptions:
The functions my_function and my_hook_function are close enough in memory for a 32-bit relative jump, that is, within roughly ±2 GiB of each other. If they aren’t, the jump offset from my_function to my_hook_function won’t fit in the 4 bytes available in the jump instruction.
The first 5 bytes of my_function can be safely overwritten. If there’s a multi-byte instruction that starts within the first 5 bytes but doesn’t end before the 6th byte, this will crash.
The functions my_function and my_hook_function don’t move in memory. If they do (for example, if they’re in a shared library that gets unloaded and reloaded at a different address), the jump instruction will jump to the wrong place and likely crash.
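The offset arithmetic behind the first assumption can be isolated and checked without patching any live code. The following is a minimal sketch. encode_jmp_rel32 is a hypothetical helper, not part of the example program, and it only builds the five bytes of an x86 JMP rel32 for given addresses.

```c
#include <stdint.h>

/* Build the 5-byte x86 `JMP rel32` that, placed at address `from`,
 * transfers control to address `to`.
 * Returns 0 on success, -1 if the target is out of rel32 range. */
int encode_jmp_rel32(uint64_t from, uint64_t to, unsigned char out[5])
{
    int64_t diff = (int64_t)(to - from);

    /* rel32 is measured from the end of the 5-byte instruction. */
    if (diff < (int64_t)INT32_MIN + 5 || diff > (int64_t)INT32_MAX + 5)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)(diff - 5);
    out[0] = 0xE9;                          /* JMP rel32 opcode */
    out[1] = (unsigned char)(rel & 0xFF);   /* little-endian operand */
    out[2] = (unsigned char)((rel >> 8) & 0xFF);
    out[3] = (unsigned char)((rel >> 16) & 0xFF);
    out[4] = (unsigned char)((rel >> 24) & 0xFF);
    return 0;
}
```

For a function at 0x1000 hooked to a target at 0x2000, this produces E9 FB 0F 00 00. A target more than about 2 GiB away is rejected, which is the case where a longer jump sequence or a trampoline would be needed.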
$ make
$ ./maps
Hello, world!
Hello from hook!
Hello, world!
for arm32
Note that in ARM32, the Program Counter (PC) is usually 2 instructions ahead, which is why we subtract 8 (2 instructions * 4 bytes/instruction) when calculating the offset. This might differ between different ARM versions or modes (Thumb vs ARM, etc.) so please adjust accordingly to your target’s specifics.
Also, you need to increase the SIZE_ORIG_BYTES from 16 to 20 because the minimal branch instruction in ARM is 4 bytes and you’re going to replace 5 instructions. This is needed because the branch instruction uses a relative offset and you cannot be sure how far your hook function will be. If your function and hook are within 32MB of each other, you could only replace the first 4 bytes with a branch and wouldn’t need to touch the rest.
Remember that manipulating code at runtime can be error-prone and architecture-specific. The code can behave differently based on where it’s loaded in memory, how the compiler has optimized it, whether it’s running in Thumb or ARM mode, and so on. Always thoroughly test the code in the exact conditions where it will be used.
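The "subtract 8" rule can be made concrete with a small encoder for the single-instruction case, where the hook is within about 32 MB of the original function. This is a sketch for ARM (A32) mode only. encode_arm_b is a hypothetical helper name, and Thumb mode uses a different encoding that is not covered here.

```c
#include <stdint.h>

/* Encode an unconditional ARM (A32) branch `B` located at `from`
 * that jumps to `to`. Returns 0 on success, -1 if the target is
 * misaligned or outside the signed 24-bit word offset range. */
int encode_arm_b(uint32_t from, uint32_t to, uint32_t *insn)
{
    /* PC reads as the branch address + 8, hence the -8. */
    int64_t delta = (int64_t)to - (int64_t)from - 8;

    if (delta & 3)
        return -1; /* A32 instructions are 4-byte aligned */
    if (delta < -(1LL << 25) || delta > (1LL << 25) - 4)
        return -1; /* beyond the +/-32 MB reach of imm24 */

    /* imm24 holds the offset in words (bytes / 4). */
    uint32_t imm24 = ((uint32_t)delta >> 2) & 0x00FFFFFFu;

    /* 0xEA = condition "always" (1110) followed by the B opcode bits (1010). */
    *insn = 0xEA000000u | imm24;
    return 0;
}
```

A branch to its own address encodes as 0xEAFFFFFE, since the offset is -8 bytes, or -2 words. When encode_arm_b returns -1, the longer sequence with an absolute address described above is required.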
$ make arm
$ ./maps-arm32
Hello, world!
Hello from hook!
Hello, world!
for arm64
As on ARM32, ARM64 instructions are a fixed 4 bytes wide, but ARM64 uses its own instruction set (A64), so there are differences and specifics to consider. For example, the encoding of the branch instruction is different, and because of the larger address space, you have to create a trampoline for larger offsets that can’t be reached by a single branch instruction. The trampoline should be close to the original function so it can be reached by a branch instruction, and from there it loads the full 64-bit address of the hook function.
$ make arm64
$ ./maps-arm64
Hello, world!
Hello from hook!
Hello, world!
With the widespread use of TLS in modern network environments, tracing microservice RPC messages has become increasingly challenging. Traditional traffic sniffing can only see the encrypted data, not the original communication content, which poses significant obstacles to system debugging and analysis.
However, a new solution is now available. eBPF can place probes in user space, which makes it possible to recover the plaintext and view the communication content before it is encrypted. Nevertheless, each application might use different libraries, and each library comes in multiple versions, which complicates tracing.
In this tutorial, we will guide you through an eBPF tracing technique that spans across various user-space SSL/TLS libraries. This technique not only allows simultaneous tracing of user-space libraries like GnuTLS and OpenSSL but also significantly reduces maintenance efforts for new library versions compared to previous methods.
Background Knowledge
Before delving into the main topic of this tutorial, we need to grasp some core concepts that will serve as the foundation for our subsequent discussions.
SSL and TLS
SSL (Secure Sockets Layer): Developed by Netscape in the early 1990s, SSL provides data encryption for communication between two machines on a network. Because of known security vulnerabilities, SSL has been superseded by TLS.
TLS (Transport Layer Security): The successor to SSL, TLS aims to provide stronger and more secure data encryption. TLS operates through a handshake process during which a client and a server select an encryption algorithm and corresponding keys. Once the handshake is complete, data transmission begins, with all data being encrypted using the chosen algorithm and keys.
Operation Principles of TLS
Transport Layer Security (TLS) is a cryptographic protocol designed to provide security for communication over computer networks. Its primary goal is to provide security, including privacy (confidentiality), integrity, and authenticity, for two or more communicating computer applications over a network using cryptography, such as certificates. TLS consists of two sub-layers: the TLS Record Protocol and the TLS Handshake Protocol.
Handshake Process
When a client connects to a TLS-enabled server and requests a secure connection, the handshake process begins. The handshake allows the client and server to establish security parameters for the connection using asymmetric cryptography. The complete process is as follows:
Initial Handshake: The client connects to the TLS-enabled server, requests a secure connection, and provides a list of supported cipher suites (encryption algorithms and hash functions).
Selecting Cipher Suite: From the provided list, the server chooses a cipher suite and hash function it also supports and notifies the client of the decision.
Providing Digital Certificate: Usually, the server then provides identity authentication in the form of a digital certificate. This certificate includes the server’s name, trusted certificate authorities (guaranteeing the certificate’s authenticity), and the server’s public encryption key.
Certificate Verification: The client verifies the certificate’s validity before proceeding.
Generating Session Key: To create a session key for a secure connection, the client has two methods:
Encrypt a random number (PreMasterSecret) with the server’s public key and send the result to the server (only the server can decrypt it with its private key); both parties then use this random number to generate a unique session key for encrypting and decrypting data during the session.
Use Diffie-Hellman key exchange (or its variant, Elliptic Curve DH) to securely generate a random and unique session key for encryption and decryption. This key has the additional property of forward secrecy: even if the server’s private key is exposed in the future, it can’t be used to decrypt the current session, even if a third party intercepts and records the session.
Once these steps are successfully completed, the handshake process concludes, and the encrypted connection begins. This connection uses the session key for encryption and decryption until the connection is closed. If any of the above steps fail, the TLS handshake fails, and the connection won’t be established.
TLS in the OSI Model
TLS and SSL don’t fit neatly into any single layer of the OSI model or the TCP/IP model. TLS “runs over some reliable transport protocol (such as TCP),” which means it sits above the transport layer. It provides encryption to higher layers, which is normally the function of the presentation layer. However, applications generally use TLS as if it were a transport layer, even though they must actively control the initiation of TLS handshakes and the handling of exchanged authentication certificates.
eBPF and uprobes
eBPF (Extended Berkeley Packet Filter): It’s a kernel technology that allows users to run predefined programs in the kernel space without modifying kernel source code or reloading modules. It creates a bridge that enables interaction between user space and kernel space, providing unprecedented capabilities for tasks like system monitoring, performance analysis, and network traffic analysis.
uprobes (user-space probes) are a Linux kernel mechanism that eBPF programs can attach to. They allow probe points to be inserted dynamically into user-space applications, which is particularly useful for tracking function calls in SSL/TLS libraries.
User-Space Libraries
The implementation of the SSL/TLS protocol heavily relies on user-space libraries. Here are some common ones:
OpenSSL: An open-source, feature-rich cryptographic library widely used in many open-source and commercial projects.
BoringSSL: A fork of OpenSSL maintained by Google, focusing on simplification and optimization for Google’s needs.
GnuTLS: Part of the GNU project, offering an implementation of SSL, TLS, and DTLS protocols. GnuTLS differs from OpenSSL and BoringSSL in API design, module structure, and licensing.
OpenSSL API Analysis
OpenSSL is a widely used open-source library providing a complete implementation of the SSL and TLS protocols, ensuring data transmission security in various applications. Among its functions, SSL_read() and SSL_write() are two core API functions for reading from and writing to TLS/SSL connections. In this section, we’ll delve into these functions to help you understand their mechanisms.
1. SSL_read Function
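When we want to read data from an established SSL connection, we can use the SSL_read or SSL_read_ex function. The function prototypes are int SSL_read(SSL *ssl, void *buf, int num) and int SSL_read_ex(SSL *ssl, void *buf, size_t num, size_t *readbytes).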
SSL_read and SSL_read_ex attempt to read up to num bytes of data from the specified ssl into the buffer buf. Upon success, SSL_read_ex stores the actual number of read bytes in *readbytes.
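2. SSL_write Function
When we want to write data into an established SSL connection, we can use the SSL_write or SSL_write_ex functions. The function prototypes are int SSL_write(SSL *ssl, const void *buf, int num) and int SSL_write_ex(SSL *s, const void *buf, size_t num, size_t *written).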
SSL_write and SSL_write_ex will write up to num bytes of data from the buffer buf into the specified ssl connection. Upon success, SSL_write_ex will store the actual number of written bytes in *written.
Writing eBPF Kernel Code
In our example, we use eBPF to hook the SSL_read and SSL_write functions to perform custom actions when data is read from or written to an SSL connection.
Data Structures
Firstly, we define a data structure probe_SSL_data_t to transfer data between kernel and user space:
struct probe_SSL_data_t {
    __u64 timestamp_ns;        // Timestamp (nanoseconds)
    __u64 delta_ns;            // Function execution time
    __u32 pid;                 // Process ID
    __u32 tid;                 // Thread ID
    __u32 uid;                 // User ID
    __u32 len;                 // Length of read/write data
    int buf_filled;            // Whether buffer is filled completely
    int rw;                    // Read or Write (0 for read, 1 for write)
    char comm[TASK_COMM_LEN];  // Process name
    __u8 buf[MAX_BUF_SIZE];    // Data buffer
    int is_handshake;          // Whether it's handshake data
};
Hook Functions
Our goal is to hook into the SSL_read and SSL_write functions. We define a function SSL_exit to handle the return values of these two functions. This function determines whether to trace and collect data based on the current process and thread IDs.
In SSL/TLS, the handshake is a special process used to establish a secure connection between a client and a server. To analyze this process, we hook into the do_handshake function to track the start and end of the handshake.
Entering the Handshake
We use a uprobe to set a probe for the do_handshake function:
Use trace_allowed to recheck if tracing is allowed.
Look up the timestamp in the start_ns map for calculating handshake duration.
Use PT_REGS_RC(ctx) to get the return value of do_handshake and determine if the handshake was successful.
Find or initialize the probe_SSL_data_t data structure associated with the current thread.
Update the data structure’s fields, including timestamp, duration, process information, etc.
Use bpf_perf_event_output to send the data to user space.
Our eBPF code not only tracks data transmission for SSL_read and SSL_write but also focuses on the SSL/TLS handshake process. This information is crucial for a deeper understanding and optimization of the performance of secure connections.
Through these hook functions, we can obtain data regarding the success of the handshake, the time taken for the handshake, and related process information. This provides us with insights into the behavior of the system’s SSL/TLS, enabling us to perform more in-depth analysis and optimization when necessary.
User-Space Assisted Code Analysis and Interpretation
In the eBPF ecosystem, user-space and kernel-space code often work in collaboration. Kernel-space code is responsible for data collection, while user-space code manages, processes, and handles this data. In this section, we will explain how the above user-space code collaborates with eBPF to trace SSL/TLS interactions.
1. Supported Library Attachment
In the provided code snippet, based on the setting of the env environment variable, the program can choose to attach to three common encryption libraries (OpenSSL, GnuTLS, and NSS). This means that we can trace calls to multiple libraries within the same tool.
To achieve this functionality, the find_library_path function is first used to determine the library’s path. Then, depending on the library type, the corresponding attach_ function is called to attach the eBPF program to the library function.
This section primarily covers the attachment logic for the OpenSSL, GnuTLS, and NSS libraries. NSS is a set of security libraries designed for organizations, supporting the creation of secure client and server applications. Originally developed by Netscape, they are now maintained by Mozilla. The other two libraries have been introduced earlier and are not reiterated here.
Looking further at the attach_ functions, we can see that they all use the ATTACH_UPROBE_CHECKED and ATTACH_URETPROBE_CHECKED macros to implement the attachment logic. These two macros set a uprobe (function entry) and a uretprobe (function return), respectively.
Considering that different libraries have different API function names (for example, OpenSSL uses SSL_write, while GnuTLS uses gnutls_record_send), we need to write a separate attach_ function for each library.
For instance, in the attach_openssl function, we set up probes for both SSL_write and SSL_read. If users also want to track handshake latency (env.latency) and the handshake process (env.handshake), we set up a probe for SSL_do_handshake.
In the eBPF ecosystem, perf_buffer is an efficient mechanism used to transfer data from kernel space to user space. This is particularly useful for kernel-space eBPF programs as they can’t directly interact with user space. With perf_buffer, we can collect data in kernel-space eBPF programs and then asynchronously read this data in user space. We use the perf_buffer__poll function to read data reported in kernel space, as shown below:
To display data in hexadecimal format, execute the following command:
$ sudo ./sslsniff --hexdump
WRITE/SEND 0.000000000 curl 16104 24
----- DATA -----
505249202a20485454502f322e300d0a
0d0a534d0d0a0d0a
----- END DATA -----
...
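The 24 bytes in this dump are the HTTP/2 connection preface, "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n", which curl sends as its first TLS payload. The sketch below shows one way a buffer can be rendered in this hexadecimal form. to_hex is a hypothetical helper and is not taken from the sslsniff source.

```c
#include <stddef.h>
#include <string.h>

/* Render len bytes of buf as lowercase hex into out.
 * out must have room for 2 * len + 1 characters (including the NUL).
 * Returns the number of hex characters written, or 0 if out is too small. */
size_t to_hex(const unsigned char *buf, size_t len, char *out, size_t out_size)
{
    static const char digits[] = "0123456789abcdef";

    if (out_size < 2 * len + 1)
        return 0;
    for (size_t i = 0; i < len; i++) {
        out[2 * i] = digits[buf[i] >> 4];       /* high nibble */
        out[2 * i + 1] = digits[buf[i] & 0x0F]; /* low nibble */
    }
    out[2 * len] = '\0';
    return 2 * len;
}
```

Calling to_hex on the first 16 bytes of the preface and then on the remaining 8 gives the two data rows shown in the output above.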
Summary
eBPF is a very powerful technology that can help us gain deeper insights into how a system works. This tutorial is a simple example demonstrating how to use eBPF to monitor SSL/TLS communication. If you’re interested in eBPF technology and want to learn more and practice further, you can visit our tutorial code repository at https://github.com/eunomia-bpf/bpf-developer-tutorial and tutorial website at https://eunomia.dev/zh/tutorials/.
Your task is to devise up to 5 highly effective goals and an appropriate role-based name (_GPT) for an autonomous agent, ensuring that the goals are optimally aligned with the successful completion of its assigned task.
The user will provide the task, you will provide only the output in the exact format specified below with no explanation or conversation.
Example input: Help me with marketing my business
Example output:
Name: CMOGPT
Description: a professional digital marketer AI that assists Solopreneurs in growing their businesses by providing world-class expertise in solving marketing problems for SaaS, content products, agencies, and more.
Goals:
- Engage in effective problem-solving, prioritization, planning, and supporting execution to address your marketing needs as your virtual Chief Marketing Officer.
- Provide specific, actionable, and concise advice to help you make informed decisions without the use of platitudes or overly wordy explanations.
- Identify and prioritize quick wins and cost-effective campaigns that maximize results with minimal time and budget investment.
- Proactively take the lead in guiding you and offering suggestions when faced with unclear information or uncertainty to ensure your marketing strategy remains on track.
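A more common approach is to introduce human supervision and interaction. A human can check on the AI's execution at intervals, or whenever needed, and make sure that AutoGPT's behavior conforms to real-world business practices and legal requirements. If it deviates from the human's intent, the agent can be adjusted through conversation and asked to do something closer to that intent (this is in fact very common when several people collaborate on a task, for example in a company or organization). Relatively speaking, though, this approach is often inefficient and slow: if I have to supervise the AI to make sure it makes no mistakes, why don't I just do the work myself?
Is there a better way?
In the real world, though, we may have a better way to align an AI agent with human intent. Imagine this scenario: you want someone who is unfamiliar with the workflow of a complex task to complete a specific job, such as getting started with development and environment setup for a code project, learning a new programming language, writing a full-length novel, or analyzing the feasibility of a business investment. In such cases we might have a handbook or tutorial. It does not need to be a set of precise, step-by-step instructions, but it contains a rough workflow and task breakdown that lets a person get up to speed quickly. So why can't we, in a similarly lightweight way, give the AI some rough directions and task descriptions and let it carry out the work accordingly?
Compared with AutoGPT, what we actually need is:
Stronger controllability, so that it is fully aligned with human intent;
Going further than CoT (chain of thought), so that the AI can complete more complex tasks, not limited to step-by-step execution but also supporting recursion, loops, conditionals, and so on;
According to Wikipedia, a computer program can be defined as a sequence of instructions that directs each step of an electronic computer or other electronic device with message-processing capability. Perhaps, in a sense, such a natural-language description is also a kind of "program", just not one written in a traditional programming language: natural language suits fuzzy, flexible, efficiently extensible requirements, whereas traditional programming languages are a form of precise abstraction and computation. Both are indispensable, and they can be converted into each other, but the conversion does not necessarily have to go from a natural-language description to precise computer instructions. In the future, everyone can be a programmer, as long as they can describe the requirements and steps in natural language, whether clearly or vaguely.
Natural Language Programming
Natural language programming should not be: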
+++ proc1
-- Return five random emojis
+++

+++ proc2
-- Modify proc1 to return random numbers instead
-- Let $n = [the number of countries in Latin America]
-- Instead of five, use $n
/execute proc1
+++
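Natural language programming is not, and should not be, programming in a programming language in the usual sense. We do not convert natural language into code, and there is no fixed syntax, language, or programming paradigm; the large language model is our interpreter, CPU, and memory. Natural language suits applications with fuzzy requirements and high information density, while code suits the parts that must be precise and reliable. Put differently, natural language programming is really an advanced form of prompt engineering: natural-language instructions are no longer confined to the context of a single interaction with the AI, and we hope to use them to increase and extend the AI's ability to carry out complex reasoning and complex tasks.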