This series of study notes was written to record my learning process with cloud native infrastructure. If you want to learn Envoy in detail, go to the Tetrate official website and the official Envoy documentation.
If you don't have any experience with cloud native yet, it helps to learn about the following concepts first:
- kubernetes (this is the basis of cloud native)
- What is the microservice architecture (microservices, RPC, service discovery, etc.)
- What is a Service Mesh?
0. What is Envoy?
If you are a cloud-native DevOps engineer, you probably already know a bit about Envoy.
Today, with the industry adopting microservice architectures and cloud-native solutions, everything is moving closer to the "cloud", so traffic governance and network debugging in the cloud have become complicated but necessary work.
Each microservice may use inconsistent traffic tracing and logging mechanisms, which greatly increases debugging complexity for developers. In this kind of many-to-many mapping situation, it is hard to determine where a problem occurs and how to solve it. This is even more true if you are a business developer and debugging network problems is not what you are good at.
Pull the network concerns out of the application stack and let another component handle the network part, making network problems easier to debug. That's what Envoy does.
In other words, the microservice you write no longer needs to handle a whole series of network- and traffic-related concerns (such as interface rate limiting, authentication, timeouts and retries, etc.); they are extracted into Envoy and handled there through configuration. At the same time, Envoy does a good job of tracing and recording traffic. When multiple Envoys work together, a unified control plane can collect all of this traffic information, and the governance effect is significant.
1. Two modes of Envoy
Envoy is a proxy server for the cloud-native era, similar to nginx, but it goes beyond traditional proxy servers like nginx.
It has two deployment modes. The first suits its most natural job: edge proxy. In this mode Envoy is deployed to build an API gateway, with functions similar to an ordinary gateway.
The other deployment mode is the sidecar. This is the case described above: the network-level work is "extracted" from the application and runs on its own. In a sidecar deployment, Envoy is deployed together with the application service; Envoy and the application form one atomic unit but remain separate processes. The application handles business logic, while Envoy handles network concerns. (The most common example is in Kubernetes: put two containers in one Pod, one running the business code and the other running Envoy. Traffic from the business container first passes through the Envoy container inside the Pod, and is then transmitted from Envoy to the outside of the Pod; see the Pod sketch below.)
The sidecar deployment mode is what powers the current service-governance killer feature in cloud native: the service mesh. We won't cover the mesh for now; we first study the edge proxy model and come back to it later.
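To make the sidecar picture concrete, here is a minimal sketch of such a Pod (the image names, ConfigMap, and port are hypothetical). In a real mesh the Envoy container and the iptables traffic redirection are usually injected automatically rather than written by hand:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app                      # business logic only
      image: registry.example.com/my-app:1.0
      ports:
        - containerPort: 8000
    - name: envoy                    # sidecar proxy handling the network concerns
      image: envoyproxy/envoy:v1.27.0
      args: ["-c", "/etc/envoy/envoy.yaml"]
      volumeMounts:
        - name: envoy-config
          mountPath: /etc/envoy
  volumes:
    - name: envoy-config
      configMap:
        name: my-app-envoy-config
```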
2. Advanced features of Envoy
2.1 Out-of-process architecture
What we want to highlight here are the benefits of Envoy's sidecar deployment mode. The introduction to this part comes from the Tetrate study materials, so I quote the original text (I comment on some unclear spots and mark my comments in italics):
Envoy is a standalone process designed to run with each application—that is, the Sidecar deployment mode we mentioned earlier. The centrally configured collection of Envoys forms a transparent service mesh.
The responsibility for routing and other networking capabilities is pushed to Envoy. The application (when DNS service discovery is performed) sends requests to a virtual address (localhost) instead of a real address (such as a public IP address or hostname), without knowing the network topology. The application no longer assumes responsibility for routing, because that task is delegated to an external process. (The application's traffic first reaches the port that Envoy listens on. The mechanism is that the iptables rules are modified at L4 to hijack traffic, redirecting outbound and inbound traffic to the ports Envoy listens on.)
Rather than letting each application manage its own network configuration, it is better to manage network configuration independently of the application, at the Envoy level. In an organization, this frees application developers to focus on the application's business logic.
Envoy is suitable for any programming language. You can write your application in Go, Java, C++, or any other language, and Envoy can bridge them. Envoy behaves the same, regardless of the programming language of the application or the operating system they run.
Envoy can also be deployed and upgraded transparently across the entire infrastructure. Compare this with upgrading a library in every individual application deployment, which can be very painful and time-consuming.
The out-of-process architecture is beneficial because it keeps behaviour consistent across different programming languages/application stacks, and we get independent application lifecycles and all of Envoy's networking capabilities for free, without having to solve these problems separately in each application.
"The centrally configured collection of Envoys forms a transparent service mesh." The core goal of a service mesh is to tame the complexity of inter-service communication in a microservice architecture, and interoperability in multi-language environments is one of its important application scenarios. Envoy fits this task well.
2.2 L3/L4 filter structure
Envoy is an L3/L4 network proxy that makes decisions based on IP addresses and TCP or UDP ports. It has a pluggable filter chain (like a pipeline on Linux), and you can write your own filters to perform different TCP/UDP tasks.
Envoy builds logic and behaviour by stacking the required filters into a filter chain. There are many ready-made filters that support tasks such as raw TCP proxying, UDP proxying, HTTP proxying, TLS client authentication, and so on. Envoy is also extensible, so we can write our own filters (especially now, combined with Wasm technology). A small filter-chain sketch follows.
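As a tiny illustration of stacking network filters, here is a sketch of a listener whose filter chain contains a single TCP proxy filter, forwarding raw TCP connections to a hypothetical `backend_tcp` cluster (assumed to be defined elsewhere in the same configuration):

```yaml
static_resources:
  listeners:
    - name: tcp_listener
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 9000
      filter_chains:
        - filters:
            - name: envoy.filters.network.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: tcp_passthrough
                cluster: backend_tcp   # upstream cluster defined elsewhere
```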
2.3 L7 filter structure
Envoy supports an additional HTTP L7 filter layer. We can insert HTTP filters into the HTTP Connection Manager subsystem (HCM) to perform different tasks such as caching, rate limiting, routing/forwarding, and so on.
2.4 HTTP-related support
HTTP2 support
Envoy supports both HTTP/1.1 and HTTP/2 and can operate as a transparent bidirectional proxy between the two. This means any combination of HTTP/1.1 and HTTP/2 clients and target servers can be bridged. Even if your legacy applications do not speak HTTP/2, they will end up communicating over HTTP/2 if you deploy them next to an Envoy proxy.
It is recommended to use HTTP/2 between all Envoys fronting services, creating a mesh of persistent connections over which requests and responses can be multiplexed.
HTTP routing
When operating in HTTP mode and using REST, Envoy supports a routing subsystem (in the HCM) that can route and redirect requests based on rules such as path, authority, content type, and runtime values.
This feature is very useful when building Envoy as an API gateway, and also when building a service mesh (sidecar deployment mode).
2.5 gRPC support
Envoy supports all of the HTTP/2 features required to serve as the routing and load-balancing layer underneath gRPC requests and responses.
gRPC is an open source remote procedure call (RPC) system that uses HTTP/2 for transport and Protocol Buffers as its interface description language (IDL). It provides features including authentication, bidirectional streaming and flow control, blocking/non-blocking bindings, and cancellation and timeouts.
2.6 Service discovery and dynamic configuration
We can use static configuration files to configure Envoy, which describe the communication between services, and we will learn more about it later.
For advanced scenarios where statically configuring Envoy is not realistic, Envoy supports dynamic configuration and automatically reloads configuration at runtime.
A group of discovery services collectively called xDS can be used to dynamically configure Envoy over the network, providing Envoy with information about hosts, clusters, HTTP routes, listening sockets, and encryption.
But this is tied to sidecar deployment and the service mesh. Simply put, a control plane dynamically collects information from the current microservice cluster, converts it into the various xDS resource types, and pushes it to each sidecar Envoy; a minimal bootstrap sketch follows.
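To give a rough feel for what dynamic configuration looks like, here is a minimal bootstrap sketch (the control-plane address `xds-server.example.com:18000` and the names are assumptions) that tells Envoy to fetch its listeners and clusters over gRPC from an xDS server:

```yaml
node:
  id: envoy-demo
  cluster: demo-cluster
dynamic_resources:
  lds_config:                     # listeners come from the control plane
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
        - envoy_grpc:
            cluster_name: xds_cluster
  cds_config:                     # clusters come from the control plane
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
        - envoy_grpc:
            cluster_name: xds_cluster
static_resources:
  clusters:
    - name: xds_cluster           # how to reach the control plane itself
      type: STRICT_DNS
      connect_timeout: 1s
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: xds_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: xds-server.example.com
                      port_value: 18000
```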
2.7 Health Check
Load balancers (LBs) have a feature that only routes traffic to healthy and available upstream services.
Envoy supports active health checking of upstream service clusters. There are many types of health check, including HTTP, TCP, gRPC, etc., all of which amount to sending heartbeat probes periodically.
(The backend needs to implement the corresponding heartbeat response logic. For example, with active HTTP health checks, the backend should expose a /health endpoint that responds with a 200 status.)
(However, if Envoy relies entirely on the service mesh EDS mechanism (by watching the state of Kubernetes Endpoints), and the Kubernetes readinessProbe accurately reflects the health of the application, Envoy basically does not need an active health check configured. We will come back to this later.)
Envoy then uses a combination of service discovery and health check information to determine healthy load balancing targets.
Envoy also supports passive health checking through the Outlier Detection subsystem.
Here is a simple summary table; it doesn't matter if you can't follow it all for now:
Scenario | Is an active Envoy check required? | Reason
---|---|---
Pure Kubernetes + EDS | No | Kubernetes probes are sufficient; Envoy routes based on EDS state
In-depth health checks required | Yes | Envoy checks are layered on top to cover what Kubernetes probes cannot
Service mesh (such as Istio) | No | Relies on Kubernetes probes + passive outlier detection; the control plane keeps EDS data consistent
Hybrid environment (K8s + non-K8s services) | Partially | Non-Kubernetes services need to be health checked by Envoy
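For example, an active HTTP health check plus passive outlier detection on a cluster could be sketched like this (the `/health` path and the thresholds are illustrative, matching the `/health` idea above):

```yaml
clusters:
  - name: hello_world_service
    type: STRICT_DNS
    connect_timeout: 1s
    # Active check: poll GET /health every 5s; mark unhealthy after 3 failures.
    health_checks:
      - timeout: 1s
        interval: 5s
        unhealthy_threshold: 3
        healthy_threshold: 2
        http_health_check:
          path: /health
    # Passive check: eject an endpoint after 5 consecutive 5xx responses.
    outlier_detection:
      consecutive_5xx: 5
      base_ejection_time: 30s
    load_assignment:
      cluster_name: hello_world_service
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 8000
```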
2.8 Advanced load balancing
Envoy supports automatic retry, circuit breaking, global rate limiting (using external rate limiting services), shadow requests (or traffic mirroring), exception point detection, and request hedging.
2.10 TLS Termination
In the mesh deployment model, decoupling the application from the proxy enables TLS termination (mutual TLS) between all services.
Envoy can act as the proxy that handles establishing, decrypting, and verifying TLS-encrypted connections, without the backend application participating in the TLS handshake or encrypted communication at all. This way the business code only has to handle plaintext traffic, and business developers do not need to manage certificates and keys inside the application.
In a service mesh (such as Istio), Envoy acts as the sidecar proxy, centrally managing TLS encryption and identity authentication.
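As a rough sketch of the mechanics: TLS termination is configured by attaching a `transport_socket` to a listener's filter chain (the certificate paths below are hypothetical); the filters in that chain then see decrypted traffic.

```yaml
# Added under an entry of filter_chains, next to its filters:
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
    common_tls_context:
      tls_certificates:
        - certificate_chain:
            filename: /etc/envoy/certs/server.crt
          private_key:
            filename: /etc/envoy/certs/server.key
```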
2.11 Observability
To ease observation, Envoy generates logs, metrics, and traces. Envoy currently supports statsd (and compatible providers) as the statistics backend for all subsystems. Thanks to its extensibility, we can also plug in different statistics providers when needed.
In a service mesh such as Istio, the information collected by Envoy is typically visualized with the classic Prometheus + Grafana combination.
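For instance, the statsd support mentioned above can be wired into the bootstrap configuration roughly like this (the statsd agent address is an assumption):

```yaml
stats_sinks:
  - name: envoy.stat_sinks.statsd
    typed_config:
      "@type": type.googleapis.com/envoy.config.metrics.v3.StatsdSink
      address:
        socket_address:
          address: 127.0.0.1
          port_value: 8125    # local statsd agent
```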
2.12 HTTP/3 (Alpha)
Envoy 1.19.0 added support for HTTP/3 upstream and downstream, translating in both directions between HTTP/1.1, HTTP/2, and HTTP/3.
3. Envoy architecture diagram
A rough Envoy architecture diagram is given below (this is for a sidecar deployment, where the control plane is the service mesh component).
For a specific introduction to the filter structure, please see the [Request Flow Module] in the next chapter.
4. Envoy's build module
According to the Tetrate website, Envoy's basic building blocks consist of four parts: listeners, routes, clusters, and endpoints.
But these building blocks are too abstract and not easy to understand on their own, so below I give a more concrete "three layers and four zones" abstraction:
Take a typical HTTP request as an example:
- The client initiates an HTTP request to the port that Envoy listens to
- Listening and receiving layer: the listener accepts the connection and selects a filter chain; a network filter recognizes the HTTP protocol
- Processing and routing layer: the HTTP Connection Manager (HCM) parses the request, the HTTP filter chain processes it (authentication, rate limiting, etc.), and the router filter determines the target cluster
- Upstream service layer: the load balancer in the cluster selects a healthy endpoint and forwards the request to it
- The backend service handles the request and returns a response, which travels back to the client along the original path
- Control and observation zone: throughout the whole process, configuration may be updated dynamically, and the request is logged, counted, and traced
We will focus mainly on the three layers and on static resources; how to configure dynamic resources will be introduced later.
Now let's introduce each building block, starting with the listener, following the learning route on the Tetrate website; the three-layer/four-zone abstraction is just a convenience for understanding.
4.1 Listener Listener
A listener exposed by Envoy is a named network location: an IP address and a port, or a Unix domain socket path. Envoy receives connections and requests through listeners. Consider the Envoy configuration below.
static_resources:
listeners:
- name: listener_0
address:
socket_address:
address: 0.0.0.0
port_value: 10000
filter_chains: [{}]
With the Envoy configuration above, we declare a listener named `listener_0` on address `0.0.0.0` and port `10000`. This means Envoy is listening for incoming requests at `0.0.0.0:10000`.
Each listener has various parts to configure, but the only required setting is the address. The configuration above is valid and you can run Envoy with it, although it is useless, because every connection will be closed: we left the network filter chain (`filter_chains`) empty, so nothing further happens after a packet is received.
To reach the next building block (the router), we need to create one or more entries in `filter_chains`, each with at least one filter.
4.2 FilterChain filter chain mechanism
Each request coming in through the listener can flow through multiple filters. We can write configuration that selects different filter chains based on the incoming request or connection properties. Filter chains process both the requests entering Envoy and the responses leaving it (note that traffic flows both ways; the filter chain does not only process requests).
Let's first introduce the filter categories in Envoy. Remember to relate them to the three-layer/four-zone diagram!
Envoy defines three categories of filters: listener filters, network filters, and HTTP filters. Note that the three categories form a layered request processing pipeline, not parallel components.
- Listener filters run immediately after a packet is received and usually operate on the packet's header information. Listener filters include:
  - the proxy protocol listener filter (extracts the PROXY protocol header);
  - the TLS inspector listener filter (checks whether the traffic is TLS and, if so, extracts data from the TLS handshake).
- Network filters usually operate on the payload of a packet, viewing and parsing it. For example, the PostgreSQL network filter parses the packet body and checks the kind of database operation or the result it carries.
  - A special, built-in network filter is the HTTP Connection Manager filter (HCM). The HCM filter converts raw bytes into HTTP-level messages. It can handle access logging, generate request IDs, manipulate headers, manage routing tables, and collect statistics. We will introduce the HCM in more detail in later notes.
- HTTP filters: just as we can define multiple network filters for each listener (one of which is the HCM), Envoy also supports defining multiple HTTP-level filters inside the HCM filter. These HTTP filters are defined under the `http_filters` field.
In general, it works like this:
- A listener has its own listener filters, and in addition a listener contains one or more FilterChains.
- A FilterChain contains one or more NetworkFilters.
- The HCM is one of the NetworkFilters, and inside the HCM the HTTPFilters form an HTTPFilterChain.
- The RouterFilter is one of the HTTPFilters and is placed at the end of the HTTPFilterChain.
Note that in the configuration file in 4.1 we only defined one listener (`listener_0`) and an empty filter chain (`filter_chains: [{}]`); that empty chain belongs to the network filter chain, not to the listener filters. Listener filters are defined in the configuration with the `listener_filters` field, as sketched below.
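For example, enabling the TLS inspector listener filter mentioned earlier would look roughly like this (a sketch; everything else about the listener stays as before):

```yaml
listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    listener_filters:            # runs before a filter chain is selected
      - name: envoy.filters.listener.tls_inspector
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains: [{}]
```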
4.3 Router
The last filter in the HTTP filter chain must be the router filter (`envoy.filters.http.router`). The router filter is responsible for performing routing tasks. This finally brings us to the second building block: routes.
Domain name matching
We define the routing configuration under the `route_config` field of the HCM filter. In the routing configuration we can match incoming requests by looking at their metadata (URI, headers, etc.) and, based on that, define where the traffic is sent.
The top-level element in the routing configuration is the virtual host (`virtual_hosts`). Each virtual host element has its own `name`, used when emitting statistics (not used for routing), and a set of domains (`domains`) that are routed to it.
Let's consider the following collection of routing configurations and domain names.
route_config:
name: my_route_config
virtual_hosts:
- name: cnblogs_hosts
domains: [""]
routes:
...
- name: test_hosts
domains: ["", ""]
routes:
...
If the destination of an incoming request is one of the domains listed for `cnblogs_hosts` (i.e. the `Host`/`Authority` header of the HTTP request is set to one of those values), the request will use the routes defined in the `cnblogs_hosts` virtual host.
Similarly, if the `Host`/`Authority` header contains one of the domains listed for `test_hosts`, the request will use the routes under the `test_hosts` virtual host. This way a single listener (`0.0.0.0:10000`) can handle multiple top-level domains.
If you specify multiple domains in the array, the search order is as follows:
- Exact domain names (e.g. `cnblogs.com`).
- Suffix domain wildcards (e.g. `*.cnblogs.com`).
- Prefix domain wildcards (e.g. `cnblogs.*`).
- The special wildcard that matches any domain (`*`).
Routing Matching
After Envoy matches the domain name, it's time for the `routes` field of the selected virtual host. We use this field to specify how a request is matched and what to do with it next (e.g. redirect, forward, rewrite, send a direct response, etc.).
Let's take a look at an example.
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: hello_world_service
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config:
            name: my_first_route
            virtual_hosts:
            - name: direct_response_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                direct_response:
                  status: 200
                  body:
                    inline_string: "envoy yay"
The top part of the configuration is the same as we've seen before. We added the HCM filter, a statistics prefix (`hello_world_service`), a single HTTP filter (the router), and a routing configuration.
Let's pull out the `route_config` part:
route_config:
name: my_first_route
virtual_hosts:
- name: direct_response_service
domains: ["*"]
routes:
- match:
prefix: "/"
direct_response:
status: 200
body:
inline_string: "envoy yay"
In the virtual host we match any domain name (specified by the wildcard `*`). Under `routes` we match on the prefix (`/`), and then we send a direct response.
When it comes to matching requests, we have multiple options.
Route matcher | Description | Example
---|---|---
`prefix` | The prefix must match the beginning of the `:path` header. | `/hello` matches `/hello`, `/helloworld`, and `/hello/v1`.
`path` | The path must exactly match the `:path` header. | `/hello` matches `/hello`, but not `/helloworld` or `/hello/v1`.
`safe_regex` | The provided regular expression must match the `:path` header. | `/\d{3}` matches any three-digit number after the `/`. For example, `/123` matches, but `/hello` or `/54321` do not.
`connect_matcher` | The matcher only matches CONNECT requests. |
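To illustrate, a `routes` list mixing these matchers might look like the sketch below (routes are evaluated in order, so the catch-all prefix goes last; the paths and responses are made up):

```yaml
routes:
  - match:
      path: "/status"              # exact match on :path
    direct_response:
      status: 200
      body:
        inline_string: "ok"
  - match:
      safe_regex:
        regex: "/\\d{3}"           # /123 matches, /hello does not
    direct_response:
      status: 200
      body:
        inline_string: "three digits"
  - match:
      prefix: "/"                  # everything else
    direct_response:
      status: 404
      body:
        inline_string: "not found"
```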
Once Envoy matches a request to a route, we can route it, redirect it, or return a direct response. In this example we use the `direct_response` configuration field to return a direct response.
During the Tetrate course they use a command-line tool called func-e, which lets us select and run different Envoy versions.
See the func-e official website for how to install the func-e CLI.
Of course, you can also download the Envoy binary directly and run it.
The more troublesome way is to compile the Envoy source code yourself. However, the build environment requirements are very demanding (it is recommended to use a build-environment image for this), and compilation takes a very, very long time...
Now save the above configuration to a YAML file (referred to below as `envoy.yaml`; use whatever name you actually saved it under). Once func-e is installed, we can run Envoy with this configuration:
func-e run -c envoy.yaml
If you downloaded the Envoy binary yourself, then:
envoy-1.27.0-linux-x86_64 -c envoy.yaml
Once Envoy is started, we can send a request to `localhost:10000` and get the direct response we configured.
$ curl localhost:10000
envoy yay
Similarly, if we add a different Host header with `-H`, we get the same response, because any host matches the wildcard domain defined in the virtual host.
4.4 Cluster and Endpoint
Sending a direct response straight from the configuration is a neat feature, but in most cases we have a set of endpoints or hosts that we want to route traffic to. The way to do this in Envoy is by defining clusters.
A cluster is a group of upstream hosts that accept traffic. It can be a list of hosts or IP addresses on which your service listens (similar to the relationship between a Service and its EndpointSlice in Kubernetes).
For example, suppose we have a hello world service listening on `127.0.0.1:8000`. We can then create a cluster with a single endpoint, like this:
static_resources:
clusters:
- name: hello_world_service
connect_timeout: 0.25s
type: STRICT_DNS
lb_policy: ROUND_ROBIN
hosts:
- socket_address:
address: 127.0.0.1
port_value: 8000
The cluster definition sits at the same level as the listener definition, under the `clusters` field. We use the cluster's name when referencing it from the routing configuration and when emitting statistics, so the cluster name must be unique across all clusters.
We can also define the endpoints and their load balancing inside the cluster; in the current v3 API this is done with `load_assignment`:
clusters:
- name: hello_world_service
load_assignment:
cluster_name: hello_world_service
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: 127.0.0.1
port_value: 8000
Under the `load_assignment` field we define the list of endpoints to load balance across, along with load-balancing policy settings. Each `lb_endpoints` entry contains one or more endpoints; endpoints appear again in the full example in 4.5.
Envoy supports multiple load balancing algorithms (round-robin, Maglev, least-request, random), which are driven by a combination of inputs: static bootstrap configuration, DNS, dynamic xDS (the CDS and EDS services), and active/passive health checks. If we don't explicitly set the algorithm with the `lb_policy` field, it defaults to round-robin.
(The treatment of Envoy load balancing here is just a first pass; there will be separate notes studying it in depth later.)
We can first get to know some of the features that endpoint-level load balancing supports. Typically a load balancer treats all endpoints equally, but the cluster definition allows a hierarchy to be established among the endpoints.
For example, an endpoint can have a weight property, which tells the load balancer to send more (or less) traffic to that endpoint than to the others.
Another hierarchy type is based on the locality property, commonly used to define a failover architecture; it is useful both for disaster recovery and for ordinary traffic optimization. This hierarchy lets us define geographically closer "preferred" endpoints, as well as "backup" endpoints to be used if the preferred ones become unhealthy. A sketch follows.
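A sketch of how weight and locality/priority appear inside `load_assignment` (the addresses, regions, and weights are hypothetical):

```yaml
load_assignment:
  cluster_name: hello_world_service
  endpoints:
    - priority: 0                          # preferred group, used while healthy
      locality:
        region: region-a
      lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 10.0.0.10
                port_value: 8000
          load_balancing_weight: 80        # gets more traffic than its peer
        - endpoint:
            address:
              socket_address:
                address: 10.0.0.11
                port_value: 8000
          load_balancing_weight: 20
    - priority: 1                          # failover group
      locality:
        region: region-b
      lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 10.1.0.10
                port_value: 8000
```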
We can also configure the following optional features on a cluster (a circuit-breaker sketch follows this list):
- active health checks (`health_checks`)
- circuit breakers (`circuit_breakers`)
- outlier detection (`outlier_detection`)
- additional protocol options for handling upstream HTTP requests
- an optional set of network filters applied to all outbound connections, and so on
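As one example from the list above, circuit-breaker thresholds on a cluster could be sketched like this (the limits are illustrative, not recommendations):

```yaml
clusters:
  - name: hello_world_service
    connect_timeout: 1s
    type: STRICT_DNS
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1024        # concurrent upstream connections
          max_pending_requests: 1024   # requests queued while waiting for a connection
          max_requests: 1024           # concurrent requests (HTTP/2)
          max_retries: 3               # concurrent retries
    load_assignment:
      cluster_name: hello_world_service
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 8000
```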
Like the listener's address, the endpoint address can be a socket address or a Unix domain socket.
4.5 Example combined with yaml
Let's see how these configurations are combined.
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: hello_world_service
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config:
            name: my_first_route
            virtual_hosts:
            - name: direct_response_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: hello_world_service
  clusters:
  - name: hello_world_service
    connect_timeout: 5s
    load_assignment:
      cluster_name: hello_world_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8000
Unlike the previous YAML, we added a cluster configuration, and instead of `direct_response` we use the `route` field and specify the cluster name.
To verify this configuration we run a Go hello-world service locally, listening on port 8000:
package main

import (
	"fmt"
	"net/http"
)

// helloWorld writes a plain-text greeting to every request.
func helloWorld(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "Hello, World!")
}

func main() {
	http.HandleFunc("/", helloWorld)
	fmt.Println("Starting server at port 8000")
	if err := http.ListenAndServe(":8000", nil); err != nil {
		fmt.Println(err)
	}
}
Then we can send a request to `127.0.0.1:8000` to check that we get the "Hello, World!" response.
Next, start Envoy with the Envoy configuration we saved earlier (again using `envoy.yaml` as the file name):
envoy-1.27.0-linux-x86_64 -c envoy.yaml
Once the Envoy proxy has started, send a request to `0.0.0.0:10000` so that Envoy proxies the request to the hello world endpoint.
$ curl -v 0.0.0.0:10000
* Trying 0.0.0.0:10000...
* Connected to 0.0.0.0 (0.0.0.0) port 10000
* using HTTP/
> GET / HTTP/1.1
> Host: 0.0.0.0:10000
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< date: Tue, 01 Apr 2025 12:23:00 GMT
< content-length: 13
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 0
< server: envoy
<
* Connection #0 to host 0.0.0.0 left intact
Hello, World!
From the lengthy output, notice the response headers set by the Envoy proxy: `x-envoy-upstream-service-time` and `server: envoy`.
5. Some tools needed to learn
- Lightweight load-testing tool hey: https://github.com/rakyll/hey
- docker and python3
- About GLIBC: my Envoy is version 1.27 and requires GLIBC 2.29 or above. For example, Tencent Cloud's CentOS servers generally only have 2.28, so pay attention to this.
- If you just want to get a feel for Envoy's features at this early stage, try not to waste time on manual compilation. Download the binary directly; my version is 1.27: https://github.com/envoyproxy/envoy/releases/tag/v1.27.0 (the latest is 1.33).