This article was shared from the Huawei Cloud Community post "Kmesh v0.5, the Kernel-Level Traffic Governance Engine, Is Released! Attack of the Sidecarless Service Mesh", by Cloud Containers for a Big Future.
We are very pleased to announce the release of Kmesh v0.5.0. First of all, thanks to our contributors for their hard work over the past two months. In v0.5.0 we made a number of important enhancements, including the kmeshctl command-line tool, more comprehensive end-to-end test coverage, better visibility into the underlying eBPF information, observability enhancements, graceful restart support, improvements to the CNI installer, and RBAC support in XDP programs. During this release cycle we also fixed many critical bugs, refactored some key code, and added more test coverage to make Kmesh more stable and robust.
Kmesh Background Review
Although the service mesh represented by Istio has gained widespread attention and adoption over the past few years, the Sidecar model that the Istio community originally focused on promoting imposes a significant cost on workloads in terms of resource overhead and data-path latency, so users are still relatively cautious about putting it into production. In addition, one of the main drawbacks of the Sidecar model is that it is tightly bound to the lifecycle of the business container and cannot be upgraded independently. To solve these problems, Kmesh proposes an innovative kernel-based Sidecarless traffic governance solution that sinks traffic governance into the kernel. Kmesh currently supports two modes: Kernel-Native and Dual-Engine.
In the Kernel-Native mode, Kmesh relies purely on eBPF and a kernel module for L4-L7 governance: eBPF is well suited to Layer 4 traffic governance, and combined with a programmable kernel module it enables Layer 7 traffic orchestration. Kmesh governs traffic in-band, adding no extra connection hops to service communication and reducing the number of connections on the service-to-service path from three (with the Sidecar model) to one. The architecture of the Kernel-Native mode is shown below:
Meanwhile, to strengthen Layer 7 protocol governance, Kmesh introduced a new governance mode this year, the Dual-Engine mode, which uses eBPF to forward traffic to kmesh-waypoint for advanced Layer 7 protocol governance. This is a more flexible, layered governance model that can be tailored to the diverse needs of different users.
Kmesh 0.5 Key Features Explained
Zero downtime on Kmesh reboot
Kmesh can now gracefully reload its eBPF maps and programs across a restart, and no longer needs to re-enroll namespaces or individual Pods afterwards. This means traffic is not interrupted during a restart, which is a huge benefit to users. After kmesh-daemon restarts, the configuration in the eBPF maps is automatically updated to the latest state.
As shown in the figure above, the eBPF programs and maps are pinned to the BPF file system, so eBPF can continue to govern traffic while kmesh-daemon is down, ensuring that services are not interrupted during a Kmesh restart.
After Kmesh restarts, the configuration stored in the eBPF maps is compared with the latest configuration obtained from the control plane, and only stale entries are updated.
In v0.4.0, a Kmesh restart required restarting all Kmesh-managed Pods so they could be re-managed, because enrollment was triggered by the CNI plugin. This is now handled inside kmesh-daemon, so Pods no longer need to be restarted to be re-managed.
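To make the mechanism concrete, here is a minimal Go sketch (using the cilium/ebpf library, not Kmesh's actual code) that pins a map to the BPF file system, reopens it the way a restarted daemon would, and rewrites only the stale entry. The pin path, map layout, and placeholder config are assumptions for illustration only:

package main

import (
    "bytes"
    "log"

    "github.com/cilium/ebpf"
)

const pinPath = "/sys/fs/bpf/kmesh_config_demo" // hypothetical pin path on bpffs

func main() {
    // Before the "restart": create a config map and pin it to bpffs so it
    // outlives the process, just like Kmesh pins its maps and programs.
    m, err := ebpf.NewMap(&ebpf.MapSpec{
        Name:       "config_demo",
        Type:       ebpf.Array,
        KeySize:    4,
        ValueSize:  16,
        MaxEntries: 1,
    })
    if err != nil {
        log.Fatalf("create map: %v", err)
    }
    if err := m.Pin(pinPath); err != nil {
        log.Fatalf("pin map: %v", err)
    }
    m.Close() // simulate the daemon exiting; the pinned map stays in the kernel

    // After the "restart": reopen the pinned map and refresh stale config.
    reopened, err := ebpf.LoadPinnedMap(pinPath, nil)
    if err != nil {
        log.Fatalf("load pinned map: %v", err)
    }
    defer reopened.Close()
    defer reopened.Unpin() // demo cleanup only; Kmesh keeps its pins

    var key uint32 = 0
    stored := make([]byte, reopened.ValueSize())
    if err := reopened.Lookup(&key, &stored); err != nil {
        log.Fatalf("lookup: %v", err)
    }

    latest := make([]byte, reopened.ValueSize())
    copy(latest, "v2-config") // placeholder for config re-fetched from the control plane
    if !bytes.Equal(stored, latest) {
        // Only stale entries are rewritten, so traffic keeps flowing untouched.
        if err := reopened.Update(&key, latest, ebpf.UpdateAny); err != nil {
            log.Fatalf("update: %v", err)
        }
    }
    log.Println("config map refreshed after restart")
}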
Increased observability
Kmesh now supports L4 access logging, allowing users to clearly see the traffic managed by Kmesh. Note that access logging is not enabled by default; you can turn it on with the --enable-accesslog parameter of the Kmesh daemon. We will also support enabling access logging dynamically via kmeshctl.
An example of an access log is shown below:
accesslog: 2024-09-14 08:19:26.552709932 +0000 UTC
=10.244.0.17:51842, =prometheus-5fb7f6f8d8-h9cts, =istio-system,
=10.244.0.13:9080, =, =productpage-v1-8499c849b9-bz9t9, =echo-1-27855, direction=INBOUND, sent_bytes=5, received_bytes=292, duration=2.733902ms
Each entry records the source and destination addresses, workloads, and namespaces of the connection, along with the traffic direction, the bytes sent and received, and the connection duration.
A Grafana plugin adapted for Kmesh has also been added to better visualize the monitoring metrics across different dimensions. In addition, several key observability issues have been fixed, improving its accuracy and stability.
Sinking Authorization Enforcement into XDP Programs
Kmesh has supported L4 RBAC since v0.3.0, but the previous implementation performed authorization in user space, which had performance and functionality problems. We have now sunk authorization into the XDP eBPF program, making the feature genuinely usable. The authorization rules are stored in eBPF maps, so authorization can be performed entirely within the eBPF program. When authorization results in a denial, the XDP program simply drops the request packet, and the client observes a connection failure.
The key to sinking authorization into the XDP program is eBPF's tail-call mechanism: the different matching rules are chained together via tail calls, following the same logic as the original user-space authorization.
As shown in the figure above, the authorization rules configured in the cluster are written to the eBPF maps through the message subscription mechanism. When inbound traffic to a Pod establishes a connection, the XDP program matches it against the authorization rules: if the verdict is deny, the packet is dropped; if the verdict is allow, the traffic is passed up through the protocol stack to the corresponding application process.
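As a rough illustration of the user-space side, the following Go sketch writes a hypothetical deny rule into an authorization hash map of the kind an XDP program could consult; the key and value layouts are invented for this example and are not Kmesh's actual schema:

package main

import (
    "log"

    "github.com/cilium/ebpf"
)

// authKey is a hypothetical lookup key for the authorization map: the source
// identity and the destination port, both widened to 32 bits to avoid padding.
type authKey struct {
    SrcIdentity uint32
    DstPort     uint32
}

const (
    actionAllow uint32 = 0
    actionDeny  uint32 = 1
)

func main() {
    // In Kmesh the authorization map is created and pinned by the daemon and
    // shared with the XDP program; here we simply create a demo hash map.
    m, err := ebpf.NewMap(&ebpf.MapSpec{
        Name:       "authz_demo",
        Type:       ebpf.Hash,
        KeySize:    8, // sizeof(authKey)
        ValueSize:  4, // sizeof(uint32 verdict)
        MaxEntries: 1024,
    })
    if err != nil {
        log.Fatalf("create map: %v", err)
    }
    defer m.Close()

    // Deny traffic from workload identity 1001 to port 9080. On the kernel
    // side, the XDP program would look up the same key and return XDP_DROP
    // when the verdict is deny, so the client just sees a failed connection.
    key := authKey{SrcIdentity: 1001, DstPort: 9080}
    if err := m.Put(&key, actionDeny); err != nil {
        log.Fatalf("write rule: %v", err)
    }
    log.Println("deny rule installed")
}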
Better debugging capabilities
We have added a new command-line tool, kmeshctl! You no longer need to exec into the corresponding Kmesh daemon Pod to adjust its logging level or dump its configuration; you can simply use kmeshctl:
# Adjust the kmesh-daemon logging level (e.g., debug | error | info)
kmeshctl log kmesh-6ct4h --set default:debug
# Dump configuration
kmeshctl dump kmesh-6ct4h workload
More features will be added to kmeshctl in the future to allow users to better manage and debug Kmesh.
Better visualization of the underlying BPF Map
Previously, we provided the /debug/config_dump/ads and /debug/config_dump/workload interfaces to output the configuration cached in the Kmesh daemon. For various reasons, the configuration in the daemon's cache may not exactly match what is actually stored in eBPF, and human-readable eBPF information makes troubleshooting much easier. We can now retrieve the eBPF configuration through the /debug/bpf/* interfaces. This information will also be integrated into kmeshctl for easy viewing, and it can be further extended to check whether the underlying eBPF state is in sync with the configuration in the Kmesh daemon.
# Get eBPF info in dual-engine mode
kubectl exec -ti -n kmesh-system kmesh-6ct4h -- curl 127.0.0.1:15200/debug/config_dump/bpf/workload
# Get eBPF info in kernel-native mode
kubectl exec -ti -n kmesh-system kmesh-6ct4h -- curl 127.0.0.1:15200/debug/config_dump/bpf/ads
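Building on these endpoints, a small script can already give a rough consistency check between the daemon cache and the eBPF state. The Go sketch below (assumed to run inside the kmesh-daemon Pod, dual-engine mode) simply fetches both dumps and compares them byte for byte, which is only an approximate signal since the two dumps may format their output differently:

package main

import (
    "bytes"
    "fmt"
    "io"
    "log"
    "net/http"
)

// fetch reads the body of a Kmesh daemon debug endpoint on the local port.
func fetch(path string) ([]byte, error) {
    resp, err := http.Get("http://127.0.0.1:15200" + path)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    return io.ReadAll(resp.Body)
}

func main() {
    // Configuration as cached in the Kmesh daemon.
    cached, err := fetch("/debug/config_dump/workload")
    if err != nil {
        log.Fatal(err)
    }
    // Configuration as actually stored in the eBPF maps.
    inBPF, err := fetch("/debug/config_dump/bpf/workload")
    if err != nil {
        log.Fatal(err)
    }

    if bytes.Equal(cached, inBPF) {
        fmt.Println("daemon cache and eBPF state look in sync")
    } else {
        fmt.Println("daemon cache and eBPF state differ; inspect both dumps")
    }
}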
Improvements to the CNI installer
Because the CNI installer is part of kmesh-daemon, the CNI configuration could not be cleaned up if kmesh-daemon crashed unexpectedly or the machine suddenly lost power. And if the kubeconfig token had expired, no Pod could be started successfully after kmesh-daemon exited abnormally. We therefore took the following two approaches to resolve this (a sketch of the token-watching approach follows the list):
- Clean up the CNI configuration at the end of start_kmesh.sh.
- Add a dedicated goroutine to the CNI installer that updates the kubeconfig file whenever the token file is modified, ensuring the kubeconfig does not expire.
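A minimal sketch of such a token watcher, using only the Go standard library and polling the token file's modification time (Kmesh's actual implementation may differ; the paths and the refresh function are placeholders):

package main

import (
    "log"
    "os"
    "time"
)

const tokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token" // conventional in-cluster token path

// watchToken polls the service-account token file and invokes refresh
// whenever its modification time changes.
func watchToken(refresh func() error) {
    var lastMod time.Time
    ticker := time.NewTicker(30 * time.Second)
    defer ticker.Stop()

    for range ticker.C {
        info, err := os.Stat(tokenPath)
        if err != nil {
            log.Printf("stat token: %v", err)
            continue
        }
        if info.ModTime().After(lastMod) {
            lastMod = info.ModTime()
            // Token was rotated: regenerate the kubeconfig used by the CNI plugin.
            if err := refresh(); err != nil {
                log.Printf("refresh kubeconfig: %v", err)
            }
        }
    }
}

func main() {
    // refreshKubeconfig stands in for rewriting the CNI plugin's kubeconfig
    // with the newly rotated service-account token.
    refreshKubeconfig := func() error {
        log.Println("kubeconfig refreshed")
        return nil
    }
    // In the real installer this would run as a background goroutine.
    go watchToken(refreshKubeconfig)
    select {} // keep the daemon-style process alive
}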
Support for HostNetwork workloads
The Kmesh Dual-Engine mode now supports accessing services from HostNetwork Pods.
Performance enhancements
In the Dual-Engine mode, we optimized eBPF map updates during workload and service response processing by using a local cache, avoiding repeated traversal of the entire eBPF map.
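Conceptually, the optimization keeps an in-memory index of what has already been written, so the daemon only touches the eBPF map entries that actually changed instead of iterating over all of them. The sketch below is a simplified illustration with invented types, not Kmesh's actual cache:

package main

import "fmt"

// endpoint is a simplified stand-in for the value stored in the eBPF map.
type endpoint struct {
    IP   string
    Port uint32
}

// workloadCache keeps a user-space copy of what has already been written to
// the eBPF map, so updates can be computed without traversing the map itself.
type workloadCache struct {
    entries map[string]endpoint // keyed by workload UID
}

func newWorkloadCache() *workloadCache {
    return &workloadCache{entries: make(map[string]endpoint)}
}

// update returns true only when the eBPF map actually needs to be rewritten.
func (c *workloadCache) update(uid string, ep endpoint) bool {
    old, ok := c.entries[uid]
    if ok && old == ep {
        return false // nothing changed; skip the syscall into the eBPF map
    }
    c.entries[uid] = ep
    return true
}

func main() {
    cache := newWorkloadCache()
    fmt.Println(cache.update("wl-1", endpoint{IP: "10.244.0.13", Port: 9080})) // true: write to the eBPF map
    fmt.Println(cache.update("wl-1", endpoint{IP: "10.244.0.13", Port: 9080})) // false: no-op
}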
Critical Bug Fixes
We also fixed some major bugs:
- The frontend map is no longer removed during workload resource updates, preventing loss of traffic governance while workloads are being updated.
- Traffic sent from a namespace-scoped waypoint used to be redirected back to the waypoint, creating an infinite loop; traffic originating from a waypoint now skips Kmesh's traffic management.
- Fixed an issue where the waypoint unexpectedly returned HTTP/1.1 400 Bad Request when processing non-HTTP TCP traffic. #681
Acknowledgements to contributors
Kmesh v0.5.0 contains 567 code commits from 14 contributors, and we would like to thank them all:
We have always taken an open and neutral attitude toward the development of Kmesh, and we will continue to build a benchmark Sidecarless service mesh solution that serves a wide range of industries and promotes the healthy and orderly development of the service mesh ecosystem. Kmesh is evolving rapidly, and we warmly invite anyone interested to join us!
Reference Links
Kmesh Release v0.5.0: /kmesh-net/kmesh/releases/tag/v0.5.0
Kmesh GitHub: /kmesh-net/kmesh
Kmesh Website: /