What is ElasticStack
ElasticStack was originally known as ELK.
ELK stands for three components:
- ElasticSearch: responsible for data storage and retrieval.
- Logstash: responsible for data collection, shipping the collected source data to ElasticSearch for storage.
- Kibana: responsible for data visualization, similar to Grafana.
Because Logstash is a heavyweight product (its installation package is over 300 MB) and many people only need log collection, lighter collectors such as Flume and Fluentd are often used in its place.
Elastic later recognized this and developed the Beats family of products, most notably Filebeat, Metricbeat, and Heartbeat.
For security and cloud environments, related components such as X-Pack were then introduced.
The name ELK Stack (the ELK technology stack) was used for a while; the company now promotes the name ElasticStack.
ElasticStack Architecture
ElasticStack Version
Elastic official website
The latest version is 8+; version 8 enables the HTTPS protocol by default. We first install version 7.17 and enable HTTPS manually, and afterwards practice installing version 8.
Choose an installation method on the Elastic site, then deploy Elasticsearch on Ubuntu.
Binary package deployment stand-alone es environment
deploy
1. Download the elk installation package
root@elk:~# cat install_elk.sh
#!/bin/bash
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-linux-x86_64.tar.gz.sha512
shasum -a 512 -c elasticsearch-7.17.28-linux-x86_64.tar.gz.sha512
tar -xzf elasticsearch-7.17.28-linux-x86_64.tar.gz -C /usr/local
cd elasticsearch-7.17.28/
2. Modify the configuration file
root@elk:~# vim /usr/local/elasticsearch-7.17.28/config/elasticsearch.yml
root@elk:~# egrep -v "^#|^$" /usr/local/elasticsearch-7.17.28/config/elasticsearch.yml
cluster.name: xu-elasticstack
path.data: /var/lib/es7
path.logs: /var/log/es7
network.host: 0.0.0.0
discovery.type: single-node
Related parameter descriptions:
http.port
The default port is 9200.
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
cluster.name
The name of the cluster.
path.data
The data storage path of ES.
path.logs
The log storage path of ES.
network.host
The address the ES service listens on.
# ES only allows local access by default.
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
#network.host: 192.168.0.1
discovery.seed_hosts / cluster.initial_master_nodes
# When deploying an ES cluster, the discovery.seed_hosts and cluster.initial_master_nodes parameters need to be configured.
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
discovery.type
Refers to the deployment type of the ES cluster; "single-node" means a single-node environment.
3. If you start Elasticsearch directly at this point, an error is reported
3.1 Test the error with the official startup command
Elasticsearch can be started from the command line as follows:
./bin/elasticsearch
root@elk:~# /usr/local/elasticsearch-7.17.28/bin/elasticsearch
# These errors are reported by the Java runtime
Mar 17, 2025 7:44:51 AM <clinit>
WARNING: COMPAT locale provider will be removed in a future release
[2025-03-17T07:44:53,125][ERROR][] [elk] uncaught exception in thread [main]
java.lang.RuntimeException: can not run elasticsearch as root
    (stack trace omitted)
uncaught exception in thread [main]
java.lang.RuntimeException: can not run elasticsearch as root # root is not allowed to start Elasticsearch directly
    (stack trace omitted)
For complete error details, refer to the log at /var/log/es7/
2025-03-17 07:44:53,713764 UTC [1860] INFO Parent process died - ML controller exiting
3.2 Create a startup user
root@elk:~# useradd -m elastic
root@elk:~# id elastic
uid=1001(elastic) gid=1001(elastic) groups=1001(elastic)
# Start as the elastic user; an error still occurs
root@elk:~# su - elastic -c "/usr/local/elasticsearch-7.17.28/bin/elasticsearch"
could not find java in bundled JDK at /usr/local/elasticsearch-7.17.28/jdk/bin/java
# The bundled JDK exists on the system, but the elastic user cannot find it. Switch to the elastic user to check.
root@elk:~# ll /usr/local/elasticsearch-7.17.28/jdk/bin/java
-rwxr-xr-x 1 root root 12328 Feb 20 09:09 /usr/local/elasticsearch-7.17.28/jdk/bin/java*
root@elk:~# su - elastic
$ pwd
/home/elastic
$ ls /usr/local/elasticsearch-7.17.28/jdk/bin/java
# The error is caused by permission denied: the elastic user has no permission to access the bundled JDK
ls: cannot access '/usr/local/elasticsearch-7.17.28/jdk/bin/java': Permission denied
# Checking the path level by level shows that /usr/local/elasticsearch-7.17.28/jdk/bin is not accessible to elastic, which causes the error
root@elk:~# chown elastic:elastic -R /usr/local/elasticsearch-7.17.28/
root@elk:~# ll -d /usr/local/elasticsearch-7.17.28/jdk/bin/
drwxr-x--- 2 elastic elastic 4096 Feb 20 09:09 /usr/local/elasticsearch-7.17.28/jdk/bin//
# Start again; a different error appears
# The paths we specified (path.data and path.logs) do not exist yet, so we need to create them manually
uncaught exception in thread [main]: Unable to access 'path.data' (/var/lib/es7)
Caused by: Unable to access 'path.data' (/var/lib/es7)
root@elk:~# install -d /var/{log,lib}/es7 -o elastic -g elastic
root@elk:~# ll -d /var/{log,lib}/es7
drwxr-xr-x 2 elastic elastic 4096 Mar 17 08:01 /var/lib/es7/
drwxr-xr-x 2 elastic elastic 4096 Mar 17 07:44 /var/log/es7/
# Restart the service; it now starts successfully, and the ports are listening
root@elk:~# su - elastic -c "/usr/local/elasticsearch-7.17.28/bin/elasticsearch"
root@elk:~# netstat -tunlp | egrep "9[2|3]00"
tcp6 0 0 :::9200 :::* LISTEN 2544/java
tcp6 0 0 :::9300 :::* LISTEN 2544/java
Access 9200 via browser
Elasticsearch also provides an API for viewing the nodes in the cluster
[root@zabbix ~]# curl 192.168.121.21:9200/_cat/nodes
172.16.1.21 40 97 0 0.11 0.29 0.20 cdfhilmrstw * elk
# Query from the command line; since this is a single-node deployment, only one node is listed
# So far we have started ES in the foreground, which has two problems:
1. It occupies the terminal.
2. It is awkward to stop ES. So we normally run it in the background.
The official way to run in the background: the -d parameter of elasticsearch
To run Elasticsearch as a daemon, specify -d on the command line, and record the process ID in a file using the -p option:
./bin/elasticsearch -d -p pid
root@elk:~# su - elastic -c '/usr/local/elasticsearch-7.17.28/bin/elasticsearch -d'
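To stop an instance running in the background, the official docs pair -d with -p so the process ID is written to a file; a minimal sketch, assuming ES was started with the -p pid option from the official example above (the command we ran did not pass -p):
# kill the process whose PID is recorded in the pid file
pkill -F pid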
# Common errors
Q1: The maximum virtual memory map is too small
bootstrap check failure [1] of [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
ERROR: Elasticsearch did not exit normally - check the logs at /var/log/es7/
root@elk:~# sysctl -q vm.max_map_count
vm.max_map_count = 65530
root@elk:~# echo "vm.max_map_count = 262144" >> /etc//
root@elk:~# sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144
root@elk:~# sysctl -q vm.max_map_count
vm.max_map_count = 262144
Q2: The es configuration file is written incorrectly, for example a mistyped setting line such as:
discovery.type: single-node
Q3: The word "lock" in the error means an ES instance is already running. Kill the existing process, then run the start command again.
java.lang.IllegalStateException: failed to obtain node locks, tried [[/var/lib/es7]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
Q5: The ES cluster deployment has a problem and the master role is missing.
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
Uninstall the environment
1. Stop elasticsearch
root@elk:~# kill `ps -ef | grep java | grep -v grep |awk '{print $2}'`
root@elk:~# ps -ef | grep java
root 4437 1435 0 09:21 pts/2 00:00:00 grep --color=auto java
2. Delete data directory, log directory, installation package, user
root@elk:~# rm -rf /usr/local/elasticsearch-7.17.28/ /var/{lib,log}/es7/
root@elk:~# userdel -r elastic
Install ES single point based on deb package
1. Install the deb package
root@elk:~# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb
2. Install es
root@elk:~# dpkg -i elasticsearch-7.17.28-amd64.deb
# ES installed from the deb package can be managed with systemctl
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch.service
### You can start elasticsearch service by executing
sudo systemctl start elasticsearch.service
Created elasticsearch keystore in /etc/elasticsearch/elasticsearch.keystore
3. Modify the es configuration file
root@elk:~# vim /etc/elasticsearch/elasticsearch.yml
root@elk:~# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: xu-es
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
discovery.type: single-node
4. Start es
systemctl enable elasticsearch --now
# Check the ES service unit file. The settings below are what we had to do by hand with the binary installation.
User=elasticsearch
Group=elasticsearch
ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet
cat /usr/share/elasticsearch/bin/systemd-entrypoint
#!/bin/sh
# This wrapper script allows SystemD to feed a file containing a passphrase into
# the main Elasticsearch startup script
if [ -n "$ES_KEYSTORE_PASSPHRASE_FILE" ] ; then
exec /usr/share/elasticsearch/bin/elasticsearch "$@" < "$ES_KEYSTORE_PASSPHRASE_FILE"
else
exec /usr/share/elasticsearch/bin/elasticsearch "$@"
fi
Common Terms
1. Index
The unit in which users read and write data.
2. Shard
An index has at least one shard. If an index has only one shard, all of its data lives on a single node and is not split up.
In other words, a shard is the smallest scheduling unit in an ES cluster.
The data of one index can also be spread across several shards, and those shards can be placed on different nodes, which gives distributed storage of the data.
3. Replica
Replicas belong to shards: a shard can have 0 or more replicas.
When the number of replicas is 0, only the primary shard exists; if the node holding the primary shard goes down, the data becomes inaccessible.
When the number of replicas is greater than 0, there are both primary and replica shards:
The primary shard handles reads and writes (read/write, rw).
Replica shards handle load balancing of reads (read only, ro).
4. Document
The data stored by the user. It contains metadata and source data.
Metadata:
Data that describes the source data.
Source data:
The data actually stored by the user.
5. Allocation
The process of assigning the shards of an index (both primary and replica shards) across the cluster.
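As a concrete illustration of these terms, the sketch below creates an index with an explicit shard and replica count and then lists its shards; the index name test_terms is made up for this example:
curl -X PUT '127.1:9200/test_terms' -H 'Content-Type: application/json' -d '{"settings": {"number_of_shards": 3, "number_of_replicas": 1}}'
# p = primary shard, r = replica shard, plus the node each shard was allocated to
curl '127.1:9200/_cat/shards/test_terms?v'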
Check cluster status
# ES provides the API /_cat/health
root@elk:~# curl 127.1:9200/_cat/health
1742210504 11:21:44 xu-es green 1 1 3 3 0 0 0 0 - 100.0%
root@elk:~# curl 127.1:9200/_cat/health?v
epoch timestamp cluster status shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1742210512 11:21:52 xu-es green 1 1 3 3 0 0 0 0 - 100.0%
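Besides _cat/health, the JSON endpoint _cluster/health returns the same information with field names, which is handier for scripts; a quick check against any node:
curl '127.1:9200/_cluster/health?pretty'
# "status" is green/yellow/red; "number_of_nodes" and "active_shards" match the _cat output above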
ES cluster environment deployment
1. Install es cluster service
root@elk1:~# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb
root@elk1:~# dpkg -i elasticsearch-7.17.28-amd64.deb
root@elk2:~# dpkg -i elasticsearch-7.17.28-amd64.deb
root@elk3:~# dpkg -i elasticsearch-7.17.28-amd64.deb
2. Configure es; all three machines use the same configuration
# (Note: cluster.initial_master_nodes is not configured here yet)
[root@elk1 ~]# grep -E "^(cluster|path|network|discovery|http)" /etc/elasticsearch/elasticsearch.yml
cluster.name: es-cluster
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["192.168.121.91", "192.168.121.92", "192.168.121.93"]
3. Start the service
systemctl enable elasticsearch --now
4. Test; the node marked with * is the master
root@elk:~# curl 127.1:9200/_cat/nodes
172.16.1.23 6 97 25 0.63 0.57 0.25 cdfhilmrstw - elk3
172.16.1.22 5 96 23 0.91 0.76 0.33 cdfhilmrstw - elk2
172.16.1.21 19 90 39 1.22 0.87 0.35 cdfhilmrstw * elk
root@elk:~# curl 127.1:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.16.1.23 9 83 2 0.12 0.21 0.18 cdfhilmrstw - elk3
172.16.1.22 8 96 3 0.16 0.28 0.24 cdfhilmrstw - elk2
172.16.1.21 22 97 3 0.09 0.30 0.25 cdfhilmrstw * elk
# Cluster deployment failure: the cluster has no UUID and no master
[root@elk3 ~]# curl http://192.168.121.92:9200/_cat/nodes?v
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
[root@elk3 ~]# curl 192.168.121.91:9200
{
"name" : "elk91",
"cluster_name" : "es-cluster",
"cluster_uuid" : "_na_",
...
}
[root@elk3 ~]#
[root@elk3 ~]# curl 10.0.0.92:9200
{
"name" : "elk92",
"cluster_name" : "es-cluster",
"cluster_uuid" : "_na_",
...
}
[root@elk3 ~]#
[root@elk3 ~]#
[root@elk3 ~]# curl 10.0.0.93:9200
{
"name" : "elk93",
"cluster_name" : "es-cluster",
"cluster_uuid" : "_na_",
...
}
[root@elk3 ~]#
[root@elk3 ~]# curl http://192.168.121.91:9200/_cat/nodes
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
# Solution
1. Stop the ES service of the cluster
[root@elk91 ~]# systemctl stop elasticsearch
[root@elk92 ~]# systemctl stop elasticsearch
[root@elk93 ~]# systemctl stop elasticsearch
2. Delete data, logs, and temporary data
[root@elk91 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
[root@elk92 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
[root@elk93 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
3. Add configuration items
[root@elk1 ~]# grep -E "^(cluster|path|network|discovery|http)" /etc/elasticsearch/elasticsearch.yml
cluster.name: es-cluster
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["192.168.121.91", "192.168.121.92", "192.168.121.93"]
cluster.initial_master_nodes: ["192.168.121.91", "192.168.121.92", "192.168.121.93"] # newly added
4. Restart the service
5. Test
es cluster master election process
1. On startup, a node checks whether the cluster already has a master; if it does, no election is initiated.
2. If not, every master-eligible node considers itself a candidate and sends its information (ClusterStateVersion, node ID, etc.) to the other nodes in the cluster.
3. Through a gossip-like protocol, each node obtains the list of nodes that can take part in the master election.
4. ClusterStateVersion is compared first: the node with the higher version is preferred as master.
5. If that does not decide it, the node IDs are compared: the node with the smaller ID is preferred.
6. The election completes once more than half of the nodes have taken part; with N nodes, (N/2)+1 confirmations are enough to elect the master.
After the election completes, the new master is announced to the cluster, and the election is finished.
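To see which node currently holds the master role (the node marked with * in _cat/nodes), the _cat/master API can be queried from any node; a minimal check:
curl '127.1:9200/_cat/master?v'
# Columns: id, host, ip and node name of the elected master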
DSL
- ES compared with MySQL
- MySQL is a relational database:
  Create, read, update and delete are based on SQL.
- ES is a document database, very similar to MongoDB:
  Create, read, update and delete use DSL statements, the query language specific to ES.
For fuzzy queries, MySQL cannot make full use of indexes and performs poorly, whereas ES handles fuzzy queries very efficiently.
Add a single document to ES
Testing with Postman
# Under the hood this is just curl
curl --location 'http://192.168.121.21:9200/test_linux/_doc' \
--header 'Content-Type: application/json' \
--data '{
    "name": "Sun Wukong",
    "hobby": [
        "Peach",
        "Zixia Fairy"
    ]
}'
curl --location '192.168.121.21:9200/_bulk' \
--header 'Content-Type: application/json' \
--data '{ "create" : { "_index" : "test_linux_ss", "_id" : "1001" } }
{ "name" : "Zhu Bajie","hobby": ["Monkey Brother","Gao Laozhuang"] }
{"create": {"_index":"test_linux_ss","_id":"1002"}}
{"name":"White Dragon Horse","hobby":["Carry Tang Monk","Eat Grass"]}
'
Query data
curl --location '192.168.121.22:9200/test_linux_ss/_doc/1001' \
--data ''
curl --location --request GET '192.168.121.22:9200/test_linux_ss/_search' \
--header 'Content-Type: application/json' \
--data '{
"query":{
"match":{
"name":"Zhu Bajie"
}
}
}'
Delete data
curl --location --request DELETE '192.168.121.22:9200/test_linux_ss/_doc/1001'
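Deleting a document only removes that one document; to clean up the whole test index afterwards, the index itself can be deleted (the index name test_linux_ss comes from the examples above):
curl --location --request DELETE '192.168.121.22:9200/test_linux_ss'
# Verify it is gone; this should now return an index_not_found_exception
curl --location '192.168.121.22:9200/test_linux_ss'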
kibana
Deploy kibana
Kibana is a visualization tool for ES. From here on, operations on ES can be performed through Kibana.
1. Download kibana
root@elk:~# wget https://artifacts.elastic.co/downloads/kibana/kibana-7.17.28-amd64.deb
2. Install kibana
root@elk:~# dpkg -i kibana-7.17.28-amd64.deb
3. Modify the configuration file
root@elk:~# vim /etc/kibana/kibana.yml
root@elk:~# grep -E "^(server|elasticsearch|i18n)" /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://192.168.121.21:9200","http://192.168.121.22:9200","http://192.168.121.23:9200"]
i18n.locale: "zh-CN"
4. Start kibana
root@elk:~# systemctl enable --now kibana
Synchronizing state of kibana.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable kibana
Created symlink /etc/systemd/system/multi-user.target.wants/kibana.service → /etc/systemd/system/kibana.service.
root@elk:~# netstat -tunlp | grep 5601
tcp 0 0 0.0.0.0:5601 0.0.0.0:* LISTEN 19392/node
Web access testing
Basic usage of KQL
Filter data
Filebeat
Deploy Filebeat
1. Download Filebeat
root@elk2:~# wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.28-amd64.deb
2. Install Filebeat
root@elk2:~# dpkg -i filebeat-7.17.28-amd64.deb
3. Write Filebeat configuration file
# Filebeat requires us to create the config directory ourselves and then write the configuration file
mkdir /etc/filebeat/config
vim /etc/filebeat/config/
# A Filebeat configuration file has two parts, Input and Output: Input says where to collect data from; Output says where to send it. Configure them according to the official documentation.
# No service is writing logs yet, so the Input source is set to a test file and the Output goes to the terminal (console)
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the file
  paths:
    - /tmp/
# Define where the data goes: the terminal (console)
output.console:
  pretty: true
4. Run filebeat instance
filebeat -e -c /etc/filebeat/config/
5. Create a file and write data
root@elk2:~# echo ABC > /tmp/
// Output prompt
{
"@timestamp": "2025-03-18T14:48:42.432Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.17.28"
},
"message": "ABC", // The detected content changes
"input": {
"type": "log"
},
"ecs": {
"version": "1.12.0"
},
"host": {
"name": "elk2"
},
"agent": {
"type": "filebeat",
"version": "7.17.28",
"hostname": "elk2",
"ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe",
"id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
"name": "elk2"
},
"log": {
"offset": 0, // Offset offset=0 means starting from 0
"file": {
"path": "/tmp/"
}
}
}
# Append write data to the file
root@elk2:~# echo 123 >> /tmp/
// Check filebeat output prompts
{
"@timestamp": "2025-03-18T14:51:17.449Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.17.28"
},
"log": {
"offset": 4, // The offset starts from 4
"file": {
"path": "/tmp/"
}
},
"message": "123", // Statistics 123
"input": {
"type": "log"
},
"ecs": {
"version": "1.12.0"
},
"host": {
"name": "elk2"
},
"agent": {
"id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
"name": "elk2",
"type": "filebeat",
"version": "7.17.28",
"hostname": "elk2",
"ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe"
}
}
Filebeat Features
# Append with echo -n (no trailing newline); at this point filebeat cannot collect the data
root@elk2:~# echo -n 456 >> /tmp/
root@elk2:~# cat /tmp/
ABC
123
456root@elk2:~#
root@elk2:~# echo -n abc >> /tmp/
root@elk2:~# cat /tmp/
ABC
123
456789abcroot@elk2:~#
# Write data with a trailing newline; now filebeat can collect it
root@elk2:~# echo haha >> /tmp/
root@elk2:~# cat /tmp/
ABC
123
456789abchaha
// View filebeat output information
{
"@timestamp": "2025-03-18T14:55:37.476Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.17.28"
},
"host": {
"name": "elk2"
},
"agent": {
"name": "elk2",
"type": "filebeat",
"version": "7.17.28",
"hostname": "elk2",
"ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe",
"id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf"
},
"log": {
"offset": 8, // Offset=8
"file": {
"path": "/tmp/"
}
},
"message": "456789abchaha", // The collected data is all data with silent output + non-silent output
"input": {
"type": "log"
},
"ecs": {
"version": "1.12.0"
}
}
From this we get the first property of Filebeat:
By default, filebeat collects data line by line; a line is only emitted once it is terminated by a newline.
# Now stop Filebeat, append to the file, and start it again: will Filebeat re-collect the whole file, or only the content added while it was stopped?
root@elk2:~# echo xixi >> /tmp/
# Restart Filebeat
// View Filebeat output information
{
"@timestamp": "2025-03-18T15:00:51.759Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.17.28"
},
"ecs": {
"version": "1.12.0"
},
"host": {
"name": "elk2"
},
"agent": {
"type": "filebeat",
"version": "7.17.28",
"hostname": "elk2",
"ephemeral_id": "81db6575-7f98-4ca4-a86f-4d0127c1e2a4",
"id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
"name": "elk2"
},
"log": {
"offset": 22, // The offset does not start from 0 either
"file": {
"path": "/tmp/"
}
},
"message": "xixi", // Only collect new content after Filebeat is stopped
"input": {
"type": "log"
}
}
// The log line below shows that when filebeat starts again, it loads the transaction log from the /var/lib/filebeat/registry/filebeat directory
2025-03-18T15:00:51.756Z INFO memlog/:124 Finished loading transaction log file for '/var/lib/filebeat/registry/filebeat'. Active transaction id=5
// The same load is attempted on the very first start, but the directory does not exist yet, so there is nothing to read.
// The JSON file under /var/lib/filebeat/registry/filebeat records the offset values; this is why filebeat does not re-read files from the beginning after a restart.
{"op":"set","id":1}
{"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","prev_id":"","timestamp":[431172511,1742309322],"ttl":-1,"identifier_name":"native","source":"/tmp/","offset":0,"type":"log","FileStateOS":{"inode":1441831,"device":64768}}}
{"op":"set","id":2}
{"k":"filebeat::logs::native::1441831-64768","v":{"prev_id":"","source":"/tmp/","type":"log","FileStateOS":{"inode":1441831,"device":64768},"id":"native::1441831-64768","offset":4,"timestamp":[434614328,1742309323],"ttl":-1,"identifier_name":"native"}}
{"op":"set","id":3}
{"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","identifier_name":"native","ttl":-1,"type":"log","FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","source":"/tmp/","offset":8,"timestamp":[450912955,1742309478]}}
{"op":"set","id":4}
{"k":"filebeat::logs::native::1441831-64768","v":{"type":"log","identifier_name":"native","offset":22,"timestamp":[478003874,1742309738],"source":"/tmp/","ttl":-1,"FileStateOS":{"inode":1441831,"device":64768},"id":"native::1441831-64768","prev_id":""}}
{"op":"set","id":5}
{"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","ttl":-1,"FileStateOS":{"device":64768,"inode":1441831},"identifier_name":"native","prev_id":"","source":"/tmp/","offset":22,"timestamp":[478003874,1742309738],"type":"log"}}
{"op":"set","id":6}
{"k":"filebeat::logs::native::1441831-64768","v":{"offset":22,"timestamp":[759162512,1742310051],"type":"log","FileStateOS":{"device":64768,"inode":1441831},"id":"native::1441831-64768","prev_id":"","identifier_name":"native","source":"/tmp/","ttl":-1}}
{"op":"set","id":7}
{"k":"filebeat::logs::native::1441831-64768","v":{"offset":22,"timestamp":[759368397,1742310051],"type":"log","FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","source":"/tmp/","ttl":-1,"identifier_name":"native","id":"native::1441831-64768"}}
{"op":"set","id":8}
{"k":"filebeat::logs::native::1441831-64768","v":{"ttl":-1,"identifier_name":"native","id":"native::1441831-64768","source":"/tmp/","timestamp":[761513338,1742310052],"FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","offset":27,"type":"log"}}
{"op":"set","id":9}
{"k":"filebeat::logs::native::1441831-64768","v":{"source":"/tmp/","timestamp":[795028411,1742310356],"FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","offset":27,"ttl":-1,"type":"log","identifier_name":"native","id":"native::1441831-64768"}}
This is the second property of Filebeat:
By default, filebeat records the offsets of the files it has collected under "/var/lib/filebeat", so the next run continues collecting from where it left off.
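The recorded offsets can also be inspected without restarting Filebeat by reading the registry transaction log directly; a minimal sketch, assuming the default registry location and the log.json file name used by the 7.x memlog backend:
# print the most recently recorded offsets (the log.json file name is an assumption)
grep -o '"offset":[0-9]*' /var/lib/filebeat/registry/filebeat/log.json | tail -n 3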
Filebeat writes to es
Write Filebeat's Output to es
View configuration according to official documentation
The Elasticsearch output sends events directly to Elasticsearch using the Elasticsearch HTTP API.
Example configuration:
output.elasticsearch:
hosts: ["https://myEShost:9200"]
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the file
  paths:
    - /tmp/
# Send the data to Elasticsearch
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
# Delete filebeat's json file
root@elk2:~# rm -rf /var/lib/filebeat
# Start filebeat instance
root@elk2:~# filebeat -e -c /etc/filebeat/config/
Data collected in kibana
View collected data
Set refresh frequency
Custom index
# We can define the index name ourselves. The official way is to set the index parameter:
output.elasticsearch:
  hosts: ["http://localhost:9200"]
  index: "%{[fields.log_type]}-%{[agent.version]}-%{+yyyy.MM.dd}"
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the file
  paths:
    - /tmp/
# Send the data to Elasticsearch
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # Custom index name
  index: "test_filebeat-%{+yyyy.MM.dd}"
# Start filebeat, an error will be reported at this time
root@elk2:~# filebeat -e -c /etc/filebeat/config/
2025-03-19T02:55:18.951Z INFO instance/:698 Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat] Hostfs Path: [/]
2025-03-19T02:55:18.958Z INFO instance/:706 Beat ID: a109c2d1-fbb6-4b82-9416-29f9488ccabc
# We must set setup.template.name and setup.template.pattern: if we want to customize the index name, both have to be set
2025-03-19T02:55:18.958Z ERROR instance/:1027 Exiting: setup.template.name and setup.template.pattern have to be set if index name is modified
Exiting: setup.template.name and setup.template.pattern have to be set if index name is modified
# setup.template.name and setup.template.pattern are described on the official website:
If you change this setting, you also need to configure the setup.template.name and setup.template.pattern options (see Elasticsearch index template).
# Official example
setup.template.name: The name of the template. The default is filebeat. The Filebeat version is always appended to the given name, so the final name is filebeat-%{[agent.version]}.
setup.template.pattern: The template pattern to apply to the default index settings. The default pattern is filebeat. The Filebeat version is always included in the pattern, so the final pattern is filebeat-%{[agent.version]}.
Example:
setup.template.name: "filebeat"
setup.template.pattern: "filebeat"
# You also need to set the default shard and replica settings:
setup.template.settings:
  index.number_of_shards: 1
  index.number_of_replicas: 1
# Configure our own index template (that is, the rules for creating indexes)
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the file
  paths:
    - /tmp/
# Send the data to Elasticsearch
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # Custom index name
  index: "test_filebeat-%{+yyyy.MM.dd}"
# Define the name of the index template (i.e. the rule used to create the index)
setup.template.name: "test_filebeat"
# Define the matching pattern of the index template, i.e. which indexes the template applies to
setup.template.pattern: "test_filebeat-*"
# Define the settings of the index template
setup.template.settings:
  # Number of shards
  index.number_of_shards: 3
  # Number of replicas per shard
  index.number_of_replicas: 0
# Start filebeat. It starts normally now, but in Kibana our custom index did not appear; check the startup log
root@elk2:~# filebeat -e -c /etc/filebeat/config/
# Roughly, this means ILM is enabled (auto). While ILM is enabled, all custom index settings are ignored, so we need to set ILM to false
2025-03-19T03:10:02.548Z INFO [index-management] idxmgmt/:260 Auto ILM enable success.
2025-03-19T03:10:02.558Z INFO [] ilm/:170 ILM policy filebeat exists already.
2025-03-19T03:10:02.559Z INFO [index-management] idxmgmt/:396 Set to '{filebeat-7.17.28 {now/d}-000001}' as ILM is enabled.
# Check the official website for index lifecycle management ILM configuration
When index lifecycle management (ILM) is enabled, the default index is "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}-%{index_num}", for example, "filebeat-8.17.3-2025-03-17-000001". Custom index settings are ignored when ILM is enabled. If you're sending events to a cluster that supports index lifecycle management, see Index lifecycle management (ILM) to learn how to change the index name.
# ilm is in auto mode by default; it supports true, false, and auto
Enables or disables index lifecycle management on any new indices created by Filebeat. Valid values are true, false, and auto. When auto (the default) is specified on version 7.0 and later, ...
setup.ilm.enabled: auto
# Add ilm configuration to our own configuration file
# Start filebeat
root@elk2:~# filebeat -e -c /etc/filebeat/config/
The index template has been created
# Now I want to change the shards and replicas
# Modify the configuration file directly: 5 shards, 0 replicas
The template still shows 3 shards and 0 replicas
# This is because setup.template.overwrite defaults to false, meaning the existing template is not overwritten:
A boolean that specifies whether to overwrite the existing template. The default is false. Do not enable this option if you start more than one instance of Filebeat at the same time. It can overload Elasticsearch by sending too many template update requests.
# Set setup.template.overwrite to true
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the file
  paths:
    - /tmp/
# Send the data to Elasticsearch
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # Custom index name
  index: "test_filebeat-%{+yyyy.MM.dd}"
# Disable index lifecycle management (ILM)
# If ILM is enabled, all custom index settings are ignored
setup.ilm.enabled: false
# Whether to overwrite the index template if it already exists; the default is false, set it to true when an update is explicitly needed.
# The docs advise against leaving it enabled permanently, because repeated template updates put extra load on Elasticsearch.
setup.template.overwrite: true
# Define the name of the index template (i.e. the rule used to create the index)
setup.template.name: "test_filebeat"
# Define the matching pattern of the index template, i.e. which indexes the template applies to
setup.template.pattern: "test_filebeat-*"
# Define the settings of the index template
setup.template.settings:
  # Number of shards
  index.number_of_shards: 5
  # Number of replicas per shard
  index.number_of_replicas: 0
# Start filebeat
# The shards and replicas have now been updated.
Filebeat collection nginx practice
1. Install nginx
root@elk2:~# apt install -y nginx
2. Start nginx
root@elk2:~# systemctl start nginx
root@elk2:~# netstat -tunlp | grep 80
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 17956/nginx: master
tcp6 0 0 :::80 :::* LISTEN 17956/nginx: master
3. Test access
root@elk2:~# curl 127.1
#Log Location
root@elk2:~# ll /var/log/nginx/
-rw-r----- 1 www-data adm 86 Mar 19 06:58 /var/log/nginx/
root@elk2:~# cat /var/log/nginx/
127.0.0.1 - - [19/Mar/2025:06:58:31 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0"
4. Write Filebeat instance
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the files
  paths:
    - /var/log/nginx/*
# Send the data to Elasticsearch
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # Custom index name
  index: "test_filebeat-%{+yyyy.MM.dd}"
# Disable index lifecycle management (ILM)
# If ILM is enabled, all custom index settings are ignored
setup.ilm.enabled: false
# Whether to overwrite the index template if it already exists; the default is false, set it to true when an update is explicitly needed.
# The docs advise against leaving it enabled permanently, because repeated template updates put extra load on Elasticsearch.
setup.template.overwrite: true
# Define the name of the index template (i.e. the rule used to create the index)
setup.template.name: "test_filebeat"
# Define the matching pattern of the index template, i.e. which indexes the template applies to
setup.template.pattern: "test_filebeat-*"
# Define the settings of the index template
setup.template.settings:
  # Number of shards
  index.number_of_shards: 5
  # Number of replicas per shard
  index.number_of_replicas: 0
5. Start filebeat
root@elk2:~# filebeat -e -c /etc/filebeat/config/
Filebeat analyzes nginx logs
filebeat modules
# filebeat supports many modules
# Official explanation of filebeat modules: they simplify parsing of common log formats
# Filebeat modules simplify the collection, parsing, and visualization of common log formats.
# By default these modules are disabled; we need to enable the ones we want.
root@elk2:~# ls -l /etc/filebeat/modules.d/
total 300
-rw-r--r-- 1 root root 484 Feb 13 16:58
-rw-r--r-- 1 root root 476 Feb 13 16:58
-rw-r--r-- 1 root root 281 Feb 13 16:58
-rw-r--r-- 1 root root 2112 Feb 13 16:58
. . .
root@elk2:~# ls -l /etc/filebeat/modules.d/ | wc -l
72
# Check which modules are enabled and disabled
root@elk2:~# filebeat modules list
# Start the module
root@elk2:~# filebeat modules enable apache nginx mysql redis
Enabled apache
Enabled nginx
Enabled mysql
Enabled redis
# Stop the module
root@elk2:~# filebeat modules disable apache mysql redis
Disabled apache
Disabled mysql
Disabled redis
Configure filebeat to monitor nginx
# The module loading has to be configured in the filebeat main configuration file; the reference snippet is given in /etc/filebeat/filebeat.yml:
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s
# Write the filebeat instance configuration (module_nginx)
root@elk2:~# cat /etc/filebeat/config/
# Configure modules
filebeat.config.modules:
  # Glob pattern for configuration loading: the path to load module configs from
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading: automatically reload the yml files under /etc/filebeat/modules.d/
  reload.enabled: true
  # Period on which files under path should be checked for changes
  #reload.period: 10s
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  index: "module_nginx-%{+yyyy.MM.dd}"
setup.ilm.enabled: false
setup.template.overwrite: true
setup.template.name: "module_nginx"
setup.template.pattern: "module_nginx-*"
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
# Prepare nginx log test cases
root@elk2:~# cat /var/log/nginx/
192.168.121.1 - - [19/Mar/2025:16:42:23 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 CrKey/1.54.248666"
1.168.121.1 - - [19/Mar/2025:16:42:26 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 CrKey/1.54.248666"
92.168.121.1 - - [19/Mar/2025:16:42:29 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1"
192.168.11.1 - - [19/Mar/2025:16:42:31 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 13; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36"
192.168.121.1 - - [19/Mar/2025:16:42:40 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15"
# Start filebeat instance
root@elk2:~# filebeat -e -c /etc/filebeat/config/
# The collected results include both the access log and the error log. If we only want the access log, we need to edit the nginx module file /etc/filebeat/modules.d/nginx.yml
- module: nginx
  # Access logs
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/nginx/"]
  # Error logs
  error:
    enabled: false # changed from true to false
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
  # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
  ingress_controller:
    enabled: false
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
Kibana analyzes PV
PV: page view, i.e. page visits
One request counts as one PV
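Counting PV therefore amounts to counting documents in the nginx access-log index; outside Kibana the same number can be read with the _count API (the index pattern module_nginx-* comes from the filebeat config above):
curl '192.168.121.21:9200/module_nginx-*/_count?pretty'
# "count" is the total number of collected requests, i.e. the PV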
Kibana Analytics IP
Kibana analyzes bandwidth
Kibana makes Dashboard
Kibana analysis equipment
Kibana analyzes the operating system proportion
Kibana analyzes the proportion of global users
filebeat collection tomcat logs
Deploy tomcat
[root@elk2 ~]# wget https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.5/bin/apache-tomcat-11.0.5.tar.gz
[root@elk2 ~]# tar xf apache-tomcat-11.0.5.tar.gz -C /usr/local
# Configure environment variables
# ES ships with its own JDK. We add that JDK to the environment variables so that tomcat can use it.
# The directory of the JDK bundled with es:
[root@elk2 ~]# ll /usr/share/elasticsearch/jdk/
# Add environment variables
[root@elk2 ~]# vim /etc//
[root@elk2 ~]# source /etc//
[root@elk2 ~]# cat /etc//
#!/bin/bash
export JAVA_HOME=/usr/share/elasticsearch/jdk
export TOMCAT_HOME=/usr/local/apache-tomcat-11.0.5
export PATH=$PATH:$JAVA_HOME/bin:$TOMCAT_HOME/bin
[root@elk3 ~]# java -version
openjdk version "22.0.2" 2024-07-16
OpenJDK Runtime Environment (build 22.0.2+9-70)
OpenJDK 64-Bit Server VM (build 22.0.2+9-70, mixed mode, sharing)
# The default tomcat access-log format records very little information, so we modify the tomcat configuration file to change the log format.
[root@elk3 ~]# vim /usr/local/apache-tomcat-11.0.5/conf/server.xml
...
<Host name="" appBase="webapps"
unpackWARs="true" autoDeploy="true">
<Valve className="" directory="logs"
prefix=".com_access_log" suffix=".json"
pattern="{"clientip":"%h","ClientUser":"%l","authenticated":"%u","AccessTime":"%t","request":"%r","s tattoos":%s","SendBytes":"%b","Query?string":"%q","partner":"%{Referer}i","http_user_agent":"%{User-Agent}i"}"/>
</Host>
# Start tomcat
[root@elk2 ~]# startup.sh
[root@elk2 ~]# netstat -tunlp | grep 8080
tcp6 0 0 :::8080 :::* LISTEN 98628/java
# Access Test
[root@elk2 ~]# cat /etc/hosts
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.121.92
[root@elk2 ~]# cat /usr/local/apache-tomcat-11.0.5/logs/.com_access_log.
{"clientip":"192.168.121.92","ClientUser":"-","authenticated":"-","AccessTime":"[23/Mar/2025:20:55:41 +0800]","request":"GET / HTTP/1.1","status":"200","SendBytes":"11235","Query?string":"","partner":"-","http_user_agent":"curl/7.81.0"}
Configure filebeat monitoring tomcat
# Start the tomcat module
[root@elk3 ~]# filebeat modules enable tomcat
Enabled tomcat
[root@elk3 ~]# ll /etc/filebeat/modules.d/tomcat.yml
-rw-r--r-- 1 root root 623 Feb 14 00:58 /etc/filebeat/modules.d/tomcat.yml
# Configure the tomcat module
[root@elk3 ~]# cat /etc/filebeat/modules.d/tomcat.yml
# Module: tomcat
# Docs: /guide/en/beats/filebeat/7.17/
- module: tomcat
  log:
    enabled: true
    # Set which input to use between udp (default), tcp or file.
    #var.input: udp
    var.input: file
    # var.syslog_host:
    # var.syslog_port: 8080
    # Set paths for the log files when file input is used.
    #var.paths:
    #  - /var/log/tomcat/*.log
    var.paths:
      - /usr/local/apache-tomcat-11.0.5/logs/.com_access_log.
    # Toggle output of non-ECS fields (default true).
    # var.rsa_fields: true
    # Set custom timezone offset.
    # "local" (default) for system timezone.
    # "+02:00" for GMT+02:00
    # var.tz_offset: local
# Configure filebeat
[root@elk3 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: test-modules-tomcat-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat"
setup.template.pattern: "test-modules-tomcat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
# Start filebeat
[root@elk3 ~]# filebeat -e -c /etc/filebeat/config/
filebeat processors
filebeat processor
/guide/en/beats/filebeat/7.17/
# When filebeat collects the tomcat logs, which we configured to be JSON-formatted, we want to extract the individual JSON fields; this requires further configuration through filebeat processors
# The decode_json_fields processor parses JSON-formatted fields
# Official configuration
processors:
- decode_json_fields:
fields: ["field1", "field2", ...]
process_array: false
max_depth: 1
target: ""
overwrite_keys: false
add_error_key: true
# fields: which fields to decode as JSON
# process_array: a boolean that specifies whether to process arrays; the default is false (optional)
# max_depth: the maximum parsing depth; the default is 1, which decodes the JSON objects in the fields listed under fields, while a value of 2 also decodes objects embedded in the fields of those parsed documents (optional)
# target: the field under which the decoded JSON is written. By default, the decoded JSON replaces the string field it was read from. To merge the decoded JSON into the root of the event, specify target with an empty string (target: ""). Note that a null value (target:) is treated as unset (optional)
# overwrite_keys: a boolean that specifies whether keys that already exist in the event are overwritten by keys from the decoded JSON object; the default is false (optional)
# add_error_key: if set to true and an error occurs while decoding JSON keys, an error field is added to the event with the error message; if set to false, no error field is added. The default is false (optional)
# Write filebeat configuration
[root@elk3 ~]# cat /etc/filebeat/config/,
cat: /etc/filebeat/config/,: No such file or directory
[root@elk3 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: test-modules-tomcat-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat"
setup.template.pattern: "test-modules-tomcat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
# Start filebeat
[root@elk3 ~]# filebeat -e -c /etc/filebeat/config/
# Delete a field through filebeat
processors:
  - drop_fields:
      when:
        condition
      fields: ["field1", "field2", ...]
      ignore_missing: false
The supported conditions are:
equals
contains
regexp
range
network
has_fields
or
and
not
# Drop fields from events whose status value is 404
[root@elk3 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
  - drop_fields:
      when:
        equals:
          status: "404"
      fields: [""]
      ignore_missing: false
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: test-modules-tomcat-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat"
setup.template.pattern: "test-modules-tomcat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
[root@elk3 ~]# filebeat -e -c /etc/filebeat/config/
filebeat collection es cluster log
# Start the module
[root@elk1 ~]# filebeat modules enable elasticsearch
Enabled elasticsearch
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: es-log-modules-eslog-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-log"
setup.template.pattern: "es-log-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
filebeat collects mysql logs
# Deploy mysql
[root@elk1 ~]# wget https://dev.mysql.com/get/Downloads/MySQL-8.4/mysql-8.4.4-linux-glibc2.28-x86_64.tar.xz
[root@elk1 ~]# tar xf mysql-8.4.4-linux-glibc2.28-x86_64.tar.xz -C /usr/local/
# Prepare to start the script and authorize it
[root@elk1 ~]# cp /usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/support-files/mysql.server /etc/init.d/mysql.server
[root@elk1 ~]# vim /etc/init.d/mysql.server
[root@elk1 ~]# grep -E "^(basedir=|datadir=)" /etc/init.d/mysql.server
basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
datadir=/var/lib/mysql
[root@elk1 ~]# useradd -m mysql
[root@elk1 ~]# install -d /var/lib/mysql -o mysql -g mysql
[root@elk1 ~]# ll -d /var/lib/mysql
drwxr-xr-x 2 mysql mysql 4096 Mar 25 17:05 /var/lib/mysql/
# Prepare the configuration file
[root@elk1 ~]# vim /etc/my.cnf
[root@elk1 ~]# cat /etc/my.cnf
[mysqld]
basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
datadir=/var/lib/mysql
socket=/tmp/mysql.sock
port=3306
[client]
socket=/tmp/mysql.sock
# Start the database
[root@elk1 ~]# vim /etc//
[root@elk1 ~]# cat /etc//
#!/bin/bash
export MYSQL_HOME=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
export PATH=$PATH:$MYSQL_HOME/bin
[root@elk1 ~]# source /etc//
[root@elk1 ~]# mysqld --initialize-insecure --user=mysql --datadir=/var/lib/mysql --basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64
2025-03-25T09:08:36.829914Z 0 [System] [MY-015017] [Server] MySQL Server Initialization - start.
2025-03-25T09:08:36.842773Z 0 [System] [MY-013169] [Server] /usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/bin/mysqld (mysqld 8.4.4) initializing of server in progress as process 7905
2025-03-25T09:08:36.918780Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2025-03-25T09:08:37.818933Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2025-03-25T09:08:42.504501Z 6 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
2025-03-25T09:08:46.909940Z 0 [System] [MY-015018] [Server] MySQL Server Initialization - end.
[root@elk1 ~]# /etc/init.d/mysql.server start
Starting mysql.server (via systemctl): mysql.server.service.
[root@elk1 ~]# netstat -tunlp | grep 3306
tcp6 0 0 :::3306 :::* LISTEN 8141/mysqld
tcp6 0 0 :::33060 :::* LISTEN 8141/mysqld
# Enable filebeat module
[root@elk1 ~]# filebeat modules enable mysql
Enabled mysql
# Configure filebeat
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: es-modules-mysql-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-modules-mysql"
setup.template.pattern: "es-modules-mysql-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
# Configure mysql modules
[root@elk1 ~]# cat /etc/filebeat/modules.d/mysql.yml
# Module: mysql
# Docs: /guide/en/beats/filebeat/7.17/
- module: mysql
  # Error logs
  error:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
    var.paths: ["/var/lib/mysql/"]
  # Slow logs
  slowlog:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
# Start filebeat instance
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/
filebeat collection redis
# Install redis
[root@elk1 ~]# apt install -y redis
# redis log file location
[root@elk1 ~]# cat /var/log/redis/
8618:C 25 Mar 2025 17:18:37.442 # WARNING supervised by systemd - you MUST set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
8618:C 25 Mar 2025 17:18:37.442 # oO0OoO0OoO0OoOo Redis is starting oO0OoO0Oo0Oo0Oo
8618:C 25 Mar 2025 17:18:37.442 # Redis version=6.0.16, bits=64, commit=00000000, modified=0, pid=8618, just started
8618:C 25 Mar 2025 17:18:37.442 # Configuration loaded
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 6.0.16 (00000000/0) 64 bit
.-`` .-````. ```\/ _.,_ ''-.
( ' , .-` | `, ) Running in standalone mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 6379
| `-._ `._ / _.-' | PID: 8618
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
8618:M 25 Mar 2025 17:18:37.446 # Server initialized
8618:M 25 Mar 2025 17:18:37.446 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
8618:M 25 Mar 2025 17:18:37.447 * Ready to accept connections
# Start redis modules
[root@elk1 ~]# filebeat modules enable redis
Enabled redis
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: es-modules-redis-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-modules-redis"
setup.template.pattern: "es-modules-redis-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
# Start filebeat instance
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/
Filebeat multi-line merging issue
# Manage multiline messages
#
parsers:
- multiline:
type: pattern
pattern: '^\['
negate: true
match: after
multiline.type: Defines which aggregation method to use. The default is pattern. The other option is count, which lets you aggregate a constant number of lines.
multiline.pattern: Specifies the regular expression pattern to match. Note that the regexp patterns supported by Filebeat differ somewhat from the patterns supported by Logstash. See Regular expression support for a list of supported regexp patterns. Depending on how you configure other multiline options, lines that match the specified regular expression are considered either continuations of a previous line or the start of a new multiline event. You can set the negate option to negate the pattern.
multiline.negate: Defines whether the pattern is negated. The default is false.
multiline.match: Specifies how Filebeat combines matching lines into an event. The settings are after or before. The behavior of these settings depends on what you specify for negate.
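As a small worked example of pattern: '^\[' with negate: true and match: after: any line that does not start with '[' is treated as a continuation and appended to the preceding line, so a Java stack trace becomes one event. The file path and log content below are made up for illustration:
cat > /tmp/multiline-demo.log <<'EOF'
[2025-03-25 10:00:00] ERROR request failed
java.lang.NullPointerException
    at com.example.Demo.run(Demo.java:42)
[2025-03-25 10:00:01] INFO next request ok
EOF
# With the parser above, the first three lines are merged into a single event,
# and the last line becomes a second event.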
Manage multiline redis log messages
# Optimize the collection of redis log messages with the multiline parser
# The filestream input type is the replacement for the older log input type
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.inputs:
- type: filestream
  paths:
    - /var/log/redis/*
  # Configure the parsers
  parsers:
    # Define multi-line matching
    - multiline:
        # Specify the matching type
        type: pattern
        # Define the matching pattern
        pattern: '^\d'
        # Reference: /guide/en/beats/filebeat/current/
        negate: true
        match: after
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: es-modules-redis-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-modules-redis"
setup.template.pattern: "es-modules-redis-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/
# Lines in the redis log that do not start with a digit are now merged into the preceding event
Manage multiline tomcat error log messages
# tomcat error log path : /usr/local/apache-tomcat-11.0.5/logs/catalina.*
[root@elk2 ~]# cat /etc/filebeat/config/
filebeat.inputs:
- type: filestream
  paths:
    - /usr/local/apache-tomcat-11.0.5/logs/catalina*
  parsers:
    - multiline:
        type: pattern
        pattern: '^\d'
        negate: true
        match: after
output.elasticsearch:
  hosts:
    - 192.168.121.91:9200
    - 192.168.121.92:9200
    - 192.168.121.93:9200
  index: test-modules-tomcat-elk2-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat-elk2"
setup.template.pattern: "test-modules-tomcat-elk2*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
filebeat multiple instances
1. Start instance 1
filebeat -e -c /etc/filebeat/config/ --path.data /tmp/xixi
2. Start instance 2
filebeat -e -c /etc/filebeat/config/ --path.data /tmp/haha
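Each instance needs its own data path because a running instance locks its registry directory; a quick way to confirm that both instances are up and keep separate registries (the paths /tmp/xixi and /tmp/haha come from the commands above):
ps -ef | grep "[f]ilebeat"                 # two filebeat processes should be listed
ls /tmp/xixi/registry /tmp/haha/registry   # each instance writes its own registry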
# Collect /var/log/syslog and /var/log/auth.log
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the file
  paths:
    - /var/log/syslog
# Send the data to Elasticsearch
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # Custom index name
  index: "test_syslog-%{+yyyy.MM.dd}"
# Disable index lifecycle management (ILM)
# If ILM is enabled, all custom index settings are ignored
setup.ilm.enabled: false
# Whether to overwrite the index template if it already exists; the default is false, set it to true when an update is explicitly needed.
# The docs advise against leaving it enabled permanently, because repeated template updates put extra load on Elasticsearch.
setup.template.overwrite: true
# Define the name of the index template (i.e. the rule used to create the index)
setup.template.name: "test_syslog"
# Define the matching pattern of the index template, i.e. which indexes the template applies to
setup.template.pattern: "test_syslog-*"
# Define the settings of the index template
setup.template.settings:
  # Number of shards
  index.number_of_shards: 5
  # Number of replicas per shard
  index.number_of_replicas: 0
root@elk2:~# cat /etc/filebeat/config/
# Define where the data comes from
filebeat.inputs:
  # The data source type is log, i.e. read data from a file
- type: log
  # Specify the path of the file
  paths:
    - /var/log/auth.log
# Send the data to Elasticsearch
output.elasticsearch:
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # Custom index name
  index: "test_auth-%{+yyyy.MM.dd}"
# Disable index lifecycle management (ILM)
# If ILM is enabled, all custom index settings are ignored
setup.ilm.enabled: false
# Whether to overwrite the index template if it already exists; the default is false, set it to true when an update is explicitly needed.
# The docs advise against leaving it enabled permanently, because repeated template updates put extra load on Elasticsearch.
setup.template.overwrite: true
# Define the name of the index template (i.e. the rule used to create the index)
setup.template.name: "test_auth"
# Define the matching pattern of the index template, i.e. which indexes the template applies to
setup.template.pattern: "test_auth-*"
# Define the settings of the index template
setup.template.settings:
  # Number of shards
  index.number_of_shards: 5
  # Number of replicas per shard
  index.number_of_replicas: 0
# Start filebeat as multiple instances
root@elk2:~# filebeat -e -c /etc/filebeat/config/ --path.data /tmp/xixi
root@elk2:~# filebeat -e -c /etc/filebeat/config/ --path.data /tmp/haha
EFK analysis web cluster
Deploy web clusters
1. Deploy the tomcat server
# 192.168.121.92 192.168.121.93 Deploy tomcat
Refer to the "filebeat collection tomcat logs" section to deploy tomcat
2. Deploy nginx
# 192.168.121.91 Deploy nginx
[root@elk1 ~]# apt install -y nginx
[root@elk1 ~]# vim /etc/nginx/
...
upstream es-web{
server 192.168.121.92:8080;
server 192.168.121.93:8080;
}
server {
server_name ;
location / {
proxy_pass http://es-web;
}
}
...
[root@elk1 ~]# nginx -t
[root@elk1 ~]# systemctl restart nginx
# Access Test
[root@elk1 ~]# curl
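To have some traffic for the collectors, a short loop can push requests through the nginx proxy; the sketch below targets the nginx node 192.168.121.91 directly, and the Host header value is an assumption because the server_name above is not shown:
for i in $(seq 1 20); do curl -s -o /dev/null -H 'Host: es-web.example.com' http://192.168.121.91/; done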
Collect web cluster logs
#91 Load nginx module
# 92 93 Loading the tomcat module
[root@elk1 ~]# filebeat modules enable nginx
Enabled nginx
[root@elk2 ~]# filebeat modules enable tomcat
Enabled tomcat
[root@elk3 ~]# filebeat modules enable tomcat
Enabled tomcat
1. Configure nginx module functions
[root@elk1 ~]# cat /etc/filebeat/modules.d/nginx.yml
# Module: nginx
# Docs: /guide/en/beats/filebeat/7.17/
- module: nginx
  # Access logs
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/nginx/"]
  # Error logs
  error:
    enabled: false
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
  # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
  ingress_controller:
    enabled: false
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
2. Configure the tomcat module function
[root@elk2 ~]# cat /etc/filebeat//
# Module: tomcat
# Docs: /guide/en/beats/filebeat/7.17/
- module: tomcat
log:
enabled: true
# Set which input to use between udp (default), tcp or file.
var.input: file
# var.syslog_host: localhost
# var.syslog_port: 9501
# Set paths for the log files when file input is used.
var.paths:
- /usr/local/apache-tomcat-11.0.5/logs/*.json
# Toggle output of non-ECS fields (default true).
# var.rsa_fields: true
# Set custom timezone offset.
# "local" (default) for system timezone.
# "+02:00" for GMT+02:00
# var.tz_offset: local
3. Configure the filebeat configuration file on 91
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${}//*.yml
  # Set to true to enable config reloading
  reload.enabled: true
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: es-web-nginx-%{+}
setup.ilm.enabled: false
setup.template.name: "es-web-nginx"
setup.template.pattern: "es-web-nginx-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
4. Configure the tomcat monitoring configuration file on 92
[root@elk2 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${}//*.yml
  reload.enabled: true
processors:
- decode_json_fields:
    fields: [""]
    process_array: false
    max_depth: 1
    target: ""
    overwrite_keys: false
    add_error_key: true
- drop_fields:
    when:
      equals:
        status: "404"
    fields: [""]
    ignore_missing: false
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat91-%{+}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat91"
setup.template.pattern: "test-modules-tomcat91-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
5. Configure the tomcat monitoring configuration file on 93
[root@elk3 ~]# cat /etc/filebeat/config/
filebeat.config.modules:
  path: ${}//*.yml
  reload.enabled: true
processors:
- decode_json_fields:
    fields: [""]
    process_array: false
    max_depth: 1
    target: ""
    overwrite_keys: false
    add_error_key: true
- drop_fields:
    when:
      equals:
        status: "404"
    fields: [""]
    ignore_missing: false
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat93-%{+}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat93"
setup.template.pattern: "test-modules-tomcat93-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
# Start filebeat
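As in the earlier filebeat runs in this document, each node starts filebeat against its own configuration file; a sketch (the config file name is a placeholder):
filebeat -e -c /etc/filebeat/config/<your-config>.yml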
Change field type
We want to sum the bandwidth, but it cannot be aggregated at this point
because the value is stored as a string.
# Modify field types through processors
# Official website configuration
# Supported types
# The supported types include: integer, long, float, double, string, boolean, and ip.
processors:
  - convert:
      fields:
        - {from: "src_ip", to: "", type: "ip"}
        - {from: "src_port", to: "", type: "integer"}
      ignore_missing: true
      fail_on_error: false
# Configure filebeat configuration file
filebeat.config.modules:
  path: ${}//*.yml
  reload.enabled: true
processors:
- decode_json_fields:
    fields: [""]
    process_array: false
    max_depth: 1
    target: ""
    overwrite_keys: false
    add_error_key: true
- convert:
    fields:
      - {from: "SendBytes", type: "long"}
- drop_fields:
    when:
      equals:
        status: "404"
    fields: [""]
    ignore_missing: false
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat91-%{+}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat91"
setup.template.pattern: "test-modules-tomcat91-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
Ansible deploys EFK cluster
[root@ansible efk]# cat set_es.sh
#!/bin/bash
ansible-playbook
ansible-playbook
ansible-playbook
ansible-playbook
ansible-playbook
[root@ansible efk]# bash set_es.sh
[root@ansible efk]# cat
---
- name: Install es cluster
hosts: all
tasks:
- name: get es deb package
get_url:
url: /downloads/elasticsearch/elasticsearch-7.17.
dest: /root/
- name: Install es
shell:
cmd: dpkg -i /root/elasticsearch-7.17. | cat
- name: Configure es
copy:
src: conf/
dest: /etc/elasticsearch/
- name: start es
systemd:
name: elasticsearch
state: started
enabled: yes
[root@ansible efk]# cat
---
- name: Install kibana
hosts: elk1
tasks:
- name: Get kibana deb package
get_url:
url: /downloads/kibana/kibana-7.17.
dest: /root
- name: Install kibana
shell:
cmd: dpkg -i kibana-7.17. | cat
- name: Config kibana
copy:
src: conf/
dest: /etc/kibana/
- name: Start kibana
systemd:
name: kibana
state: started
enabled: yes
[root@ansible efk]# cat
---
- name: Install filebeat
hosts: elk
tasks:
- name: Get filebeat code
get_url:
url: /downloads/beats/filebeat/filebeat-7.17.
dest: /root
- name: Install filebeat
shell:
cmd: dpkg -i filebeat-7.17. | cat
- name: Configure filebeat
file:
path: /etc/filebeat/config
state: directory
[root@ansible efk]# cat
---
- name: Set nginx
hosts: elk1
tasks:
- name: Install nginx
shell:
cmd: apt install -y nginx | cat
- name: config nginx
copy:
src: conf/
dest: /etc/nginx/
- name: start nginx
systemd:
name: nginx
state: started
enabled: yes
- name: Configure hosts
copy:
content: 192.168.121.91
dest: /etc/hosts
- name: Set tomcat
hosts: elk2,elk3
tasks:
- name: Get tomcat code
get_url:
url: /tomcat/tomcat-11/v11.0.5/bin/apache-tomcat-11.0.
dest: /root/
- name: unarchive tomcat code
unarchive:
src: /root/apache-tomcat-11.0.
dest: /usr/local
remote_src: yes
- name: Configure jdk PATH
copy:
src: conf/
dest: /etc//
- name: reload profile
shell:
cmd: source /etc// | cat
- name: Configure tomcat
copy:
src: conf/
dest: /usr/local/apache-tomcat-11.0.5/conf/
- name: start tomcat
shell:
cmd: start |cat
[root@ansible efk]# cat
---
- name: configure filebeat
hosts: elk1
tasks:
- name: enable nginx modules
shell:
cmd: filebeat modules enable nginx | cat
- name: configure nginx modules
copy:
src: conf/
dest: /etc/filebeat//
- name: configure filebeat
copy:
src: conf/
dest: /etc/filebeat/config/
- name: configure filebeat
hosts: elk2,elk3
tasks:
- name: enable tomcat modules
shell:
cmd: filebeat modules enable tomcat | cat
- name: configure tomcat modules
copy:
src: conf/
dest: /etc/filebeat//
- name: configure filebeat
template:
src: conf/.j2
dest: /etc/filebeat/config/
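The playbooks above address the hosts elk1, elk2, elk3 and the group elk, but the inventory file itself is not shown. A minimal /etc/ansible/hosts sketch, assuming the node IPs used earlier in this document:
[elk]
elk1 ansible_host=192.168.121.91
elk2 ansible_host=192.168.121.92
elk3 ansible_host=192.168.121.93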
logstash
Install and configure logstash
1. Deploy logstash
[root@elk3 ~]# wget /downloads/logstash/logstash-7.17.
[root@elk3 ~]# dpkg -i logstash-7.17.
2. Create a symbolic link and add the Logstash command to the PATH environment variable
[root@elk3 ~]# ln -svf /usr/share/logstash/bin/logstash /usr/local/bin/
'/usr/local/bin/logstash' -> '/usr/share/logstash/bin/logstash'
3. Start the instance based on the command line and use the -e option to specify configuration information (not recommended)
[root@elk3 ~]# logstash -e "input { stdin { type => stdin } } output { stdout { codec => rubydebug } }" -- warn
...
The stdin plugin is now waiting for input:
111111111111111111111111111111111111111
{
"@timestamp" => 2025-03-13T06:51:32.821Z,
"type" => "stdin",
"message" => "111111111111111111111111111111111111111",
"host" => "elk93",
"@version" => "1"
}
4. Start Logstash based on configuration file
[root@elk3 ~]# vim /etc/logstash//
[root@elk3 ~]# cat /etc/logstash//
input {
stdin {
type => stdin
}
}
output {
stdout {
codec => rubydebug
}
}
[root@elk3 ~]# logstash -f /etc/logstash//
...
333333333333333333333333333333
{
"type" => "stdin",
"message" => "333333333333333333333333333333333333333333333333333333333,
"host" => "elk93",
"@timestamp" => 2025-03-13T06:54:20.223Z,
"@version" => "1"
}
# /guide/en/logstash/7.17/
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/tmp/"
}
}
output {
stdout {
codec => rubydebug
}
}
[WARN ] 2025-03-26 09:40:52.788 [[main]<file] plain - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
{
"path" => "/tmp/",
"@version" => "1",
"@timestamp" => 2025-03-26T01:40:52.879Z,
"message" => "aaaddd",
"host" => "elk3"
}
Logstash text log collection strategy
Logstash's collection strategy is similar to filebeat's:
1. Collection is line-oriented: the newline character marks the end of each event.
2. Logstash also tracks offsets, similar to filebeat (stored in the sincedb file).
[root@elk3 ~]# ll /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
-rw-r--r-- 1 root root 53 Mar 26 09:57 /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
[root@elk3 ~]# cat /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
408794 0 64768 12 1742955373.9715059 /tmp/ # 12 is the offset
[root@elk3 ~]# cat /tmp/
ABC
2025def
[root@elk3 ~]# ll -i /tmp/
408794 -rw-r--r-- 1 root root 12 Mar 26 09:45 /tmp/
# You can modify the offset directly to start collecting from a specific position. Here we change the offset to 8 and check the result
{
"@timestamp" => 2025-03-26T02:20:50.776Z,
"host" => "elk3",
"message" => "def",
"path" => "/tmp/",
"@version" => "1"
}
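One way to rewrite the offset (a sketch; stop logstash first, the sincedb file name is environment-specific, and the byte offset is the fourth column of the line shown above):
# set the offset to 8, then restart logstash
awk '{ $4 = 8; print }' /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd > /tmp/sincedb.new
mv /tmp/sincedb.new /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd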
start_position
With filebeat, deleting its registry json file makes the next run collect from the beginning. Logstash does not behave this way.
[root@elk3 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
[root@elk3 ~]# logstash -f /etc/logstash//
# By default it still only collects newly appended data
[root@elk3 ~]# echo 123 >> /tmp/
[root@elk3 ~]# cat /tmp/
ABC
2025def
123
{
"@version" => "1",
"@timestamp" => 2025-03-26T02:26:17.008Z,
"message" => "123",
"host" => "elk3",
"path" => "/tmp/"
}
// This is where the start_position parameter is needed
start_position
Value can be any of: beginning, end
Default value is "end"
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/tmp/"
start_position => "beginning"
}
}
output {
stdout {
codec => rubydebug
}
}
[root@elk3 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
[root@elk3 ~]# logstash -f /etc/logstash//
{
"@version" => "1",
"host" => "elk3",
"message" => "2025def",
"path" => "/tmp/",
"@timestamp" => 2025-03-26T02:31:50.020Z
}
{
"@version" => "1",
"host" => "elk3",
"message" => "ABC",
"path" => "/tmp/",
"@timestamp" => 2025-03-26T02:31:49.813Z
}
{
"@version" => "1",
"host" => "elk3",
"message" => "123",
"path" => "/tmp/",
"@timestamp" => 2025-03-26T02:31:50.037Z
}
filter plugins
# The logstash output contains many fields; unwanted ones can be removed with filter plugins
# Remove @version field
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/tmp/"
start_position => "beginning"
}
}
filter {
mutate {
remove_field => [ "@version" ]
}
}
output {
stdout {
codec => rubydebug
}
}
# Start logstash with the -r option to enable automatic reload of the config file
[root@elk3 ~]# logstash -r -f /etc/logstash//
{
"@timestamp" => 2025-03-26T03:01:02.078Z,
"host" => "elk3",
"message" => "111",
"path" => "/tmp/"
}
logstash architecture
logstash multiple instances
Start Example 1:
[root@elk93 ~]# logstash -f /etc/logstash//
Start Example 2:
[root@elk93 ~]# logstash -rf /etc/logstash// -- /tmp/logstash-multiple
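Each Logstash instance also needs its own data directory; the option elided above is presumably logstash's path.data flag. A sketch with placeholder config paths:
logstash -f /etc/logstash/conf.d/01-first.conf
logstash -rf /etc/logstash/conf.d/02-second.conf --path.data /tmp/logstash-multiple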
logstash and pipeline relationship
- A Logstash instance can have multiple pipelines. If the pipeline id is not defined, the default is main pipeline.
- Each pipeline consists of three components, where the filter plugin is an optional component:
- input:
Where does the data come from?
- filter:
What plugins are used to process the data? This component is an optional component.
- output:
Where does the data go?
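A minimal pipelines.yml sketch of this relationship (pipeline ids and config paths are placeholders; the pipeline section later in this document configures the same thing on a real host):
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/01-stdin-to-stdout.conf"
- pipeline.id: nginx
  path.config: "/etc/logstash/conf.d/02-nginx-to-es.conf"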
logstash collects nginx logs
1. Install nginx
[root@elk3 ~]# apt install -y nginx
2. Collect the nginx logs
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
mutate {
remove_field => [ "@version" ]
}
}
output {
stdout {
codec => rubydebug
}
}
[root@elk3 ~]# logstash -r -f /etc/logstash//
{
"host" => "elk3",
"message" => "127.0.0.1 - - [26/Mar/2025:14:43:58 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
"@timestamp" => 2025-03-26T06:45:24.375Z,
"path" => "/var/log/nginx/"
}
{
"host" => "elk3",
"message" => "127.0.0.1 - - [26/Mar/2025:14:43:57 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
"@timestamp" => 2025-03-26T06:45:24.293Z,
"path" => "/var/log/nginx/"
}
{
"host" => "elk3",
"message" => "127.0.0.1 - - [26/Mar/2025:14:43:58 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
"@timestamp" => 2025-03-26T06:45:24.373Z,
"path" => "/var/log/nginx/"
}
grok plugins
# Extracts fields based on regular expressions
Logstash ships with about 120 patterns by default. You can find them here: /logstash-plugins/logstash-patterns-core/tree/master/patterns.
[root@elk3 ~]# cat /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.3.4/patterns/legacy/httpd
HTTPDUSER %{EMAILADDRESS}|%{USER}
HTTPDERROR_DATE %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}
# Log formats
HTTPD_COMMONLOG %{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes})
HTTPD_COMBINEDLOG %{HTTPD_COMMONLOG} %{QS:referrer} %{QS:agent}
# Error logs
HTTPD20_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[%{LOGLEVEL:loglevel}\] (?:\[client %{IPORHOST:clientip}\] ){0,1}%{GREEDYDATA:message}
HTTPD24_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[(?:%{WORD:module})?:%{LOGLEVEL:loglevel}\] \[pid %{POSINT:pid}(:tid %{NUMBER:tid})?\]( \(%{POSINT:proxy_errorcode}\)%{DATA:proxy_message}:)?( \[client %{IPORHOST:clientip}:%{POSINT:clientport}\])?( %{DATA:errorcode}:)? %{GREEDYDATA:message}
HTTPD_ERRORLOG %{HTTPD20_ERRORLOG}|%{HTTPD24_ERRORLOG}
# Deprecated
COMMONAPACHELOG %{HTTPD_COMMONLOG}
COMBINEDAPACHELOG %{HTTPD_COMBINEDLOG}
# Configure logstash
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
mutate {
remove_field => [ "@version" ]
}
# Extract fields from arbitrary text with regular expressions, using the built-in pattern set
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
}
output {
stdout {
codec => rubydebug
}
}
[root@elk3 ~]# logstash -r -f /etc/logstash//
{
"message" => "192.168.121.1 - - [26/Mar/2025:14:52:06 +0800] \"GET / HTTP/1.1\" 200 396 \"-\" \"Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36\"",
"path" => "/var/log/nginx/",
"request" => "/",
"clientip" => "192.168.121.1",
"host" => "elk3",
"timestamp" => "26/Mar/2025:14:52:06 +0800",
"auth" => "-",
"verb" => "GET",
"response" => "200",
"ident" => "-",
"httpversion" => "1.1",
"@timestamp" => 2025-03-26T06:52:07.342Z,
"bytes" => "396"
}
useragent plugins
Used to extract user's device information
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
mutate {
remove_field => [ "@version" ]
}
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
useragent {
# Specify which field to parse user device information
source => 'message'
# Store the parsed results in a specific field. If not specified, it will be placed in the top-level field by default.
target => "xu-ua"
}
}
output {
stdout {
codec => rubydebug
}
}
[root@elk3 ~]# logstash -r -f /etc/logstash//
{
"message" => "192.168.121.1 - - [26/Mar/2025:16:45:10 +0800] \"GET / HTTP/1.1\" 200 396 \"-\" \"Mozilla/5.0 (Linux; Android 13; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36\"",
"clientip" => "192.168.121.1",
"timestamp" => "26/Mar/2025:16:45:10 +0800",
"request" => "/",
"bytes" => "396",
"verb" => "GET",
"httpversion" => "1.1",
"@timestamp" => 2025-03-26T08:45:11.587Z,
"host" => "elk3",
"auth" => "-",
"xu-ua" => {
"name" => "Chrome Mobile",
"version" => "134.0.0.0",
"os" => "Android",
"os_name" => "Android",
"os_version" => "13",
"device" => "Samsung SM-G981B",
"os_full" => "Android 13",
"minor" => "0",
"os_major" => "13",
"patch" => "0",
"major" => "134"
},
"ident" => "-",
"path" => "/var/log/nginx/",
"response" => "200"
}
geoip plugins
Resolves latitude/longitude coordinates from public IP addresses
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
mutate {
remove_field => [ "@version" ]
}
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
useragent {
source => 'message'
target => "xu-ua"
}
geoip {
source => "clientip"
}
}
output {
stdout {
codec => rubydebug
}
}
[root@elk3 ~]# logstash -r -f /etc/logstash//
"geoip" => {
"longitude" => -119.705,
"country_code2" => "US",
"region_name" => "Oregon",
"timezone" => "America/Los_Angeles",
"ip" => "52.222.36.125",
"continent_code" => "NA",
"country_code3" => "US",
"latitude" => 45.8401,
"country_name" => "United States",
"dma_code" => 810,
"postal_code" => "97818",
"region_code" => "OR",
"location" => {
"lat" => 45.8401,
"lon" => -119.705
},
"city_name" => "Boardman"
}
date plugins
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
mutate {
remove_field => [ "@version" ]
}
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
useragent {
source => 'message'
target => "xu-ua"
}
geoip {
source => "clientip"
}
date {
# Match the date field and convert it to a date type before storing it in ES; the format string follows the official examples.
# /guide/en/logstash/7.17/#plugins-filters-date-match
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
# Write the parsed date into the specified target field; if target is not defined, "@timestamp" is overwritten by default.
target => "xu-timestamp"
}
}
output {
stdout {
codec => rubydebug
}
}
[root@elk3 ~]# logstash -r -f /etc/logstash//
"xu-timestamp" => 2025-03-26T09:17:18.000Z,
mutate plugins
If we want to sum the bandwidth we will find "bytes" => "396",
a string value that cannot be aggregated, so we use the mutate plugin to convert its type.
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
mutate {
convert => {
"bytes" => "integer"
}
remove_field => [ "@version" ]
}
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
useragent {
source => 'message'
target => "xu-ua"
}
geoip {
source => "clientip"
}
date {
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
target => "xu-timestamp"
}
}
output {
stdout {
codec => rubydebug
}
}
logstash collects logs and outputs to es
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
useragent {
source => "message"
target => "xu_user_agent"
}
geoip {
source => "clientip"
}
date {
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
target => "xu-timestamp"
}
# Convert the specified fields
mutate {
# Convert the specified field to the type we need to convert
convert => {
"bytes" => "integer"
}
remove_field => [ "@version","host","message" ]
}
}
output {
stdout {
codec => rubydebug
}
elasticsearch {
# List of corresponding ES cluster hosts
hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
# The index name of the corresponding ES cluster
index => "xu-elk-nginx"
}
}
Existing problems:
Failed (timed out waiting for connection to open). Sleeping for 0.02
Problem description:
In ElasticStack version 7.17.28, Logstash cannot write to ES.
TODO:
Investigate whether upstream changed something that makes writes fail without additional parameter configuration.
Temporary solutions:
- Roll back to version 7.17.23.
- Comment out the geoip configuration.
Solving the geoip plugin's long lookup time when writing to es
By checking the official website, we can see that the geoip module can specify the database. We solve this problem by specifying the database.
1. Check out Logstash local default geoip plugin
[root@elk3 ~]# tree /usr/share/logstash/data/plugins/filters/geoip/1742980310/
/usr/share/logstash/data/plugins/filters/geoip/1742980310/
├──
├──
├──
├──
├──
├──
├──
└──
0 directories, 8 files
2. Configure logstash
[root@elk3 ~]# cat /etc/logstash//
input {
file {
path => "/var/log/nginx/"
start_position => "beginning"
}
}
filter {
mutate {
convert => {
"bytes" => "integer"
}
remove_field => [ "@version" ]
}
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
useragent {
source => 'message'
target => "xu-ua"
}
geoip {
source => "clientip"
database => "/usr/share/logstash/data/plugins/filters/geoip/1742980310/"
}
date {
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
target => "xu-timestamp"
}
}
output {
stdout {
codec => rubydebug
}
elasticsearch {
index => "xu-logstash"
hosts => ["http://192.168.121.91:9200","http://192.168.121.92:9200","http://192.168.121.93:9200"]
}
}
[root@elk3 ~]#
Solve the problem of incorrect data type
At this point the latitude and longitude are mapped as float instead of geo_point, so the map visualization cannot be drawn.
Create an index template in kibana to set the correct mapping.
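The same mapping can also be created from the Dev Tools console instead of the Kibana UI; a sketch assuming the ES 7 legacy template API and the xu-logstash index used above (template name and pattern are examples):
PUT _template/xu-logstash-geoip
{
  "index_patterns": ["xu-logstash*"],
  "mappings": {
    "properties": {
      "geoip": {
        "properties": {
          "location": { "type": "geo_point" }
        }
      }
    }
  }
}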
ELFK architecture
json plugin case
graph LR
filebeat--->|Send|logstash
Logstash receives the data collected by filebeat, so logstash must be started before filebeat.
In other words, the logstash input plugin type is beats.
# Configure logstash
[root@elk3 ~]# grep -v "^#" /etc/logstash//
input {
beats {
port => 5044
}
}
filter {
mutate {
remove_field => [ "@version","host","agent","ecs","tags","input","log" ]
}
json {
source => "message"
}
}
output {
stdout {
codec => rubydebug
}
}
# Configure filebeat
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/
output.logstash:
  hosts: ["192.168.121.93:5044"]
# Start logstash first, then start filebeat
[root@elk3]# logstash -rf
[root@elk3 ~]# netstat -tunlp | grep 5044
tcp6 0 0 :::5044 :::* LISTEN 120181/java
# Start filebeat
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/
// Prepare test data
{
"name":"aaa",
"hobby":["Writing novels", "Singing"]
}
{
"name":"bbb",
"hobby":["Fitness", "Billiards", "Dou Dou"]
}
{
"name":"ccc",
"hobby":["Table tennis", "Swimming", "Game"]
}
{
"name": "ddd",
"hobby": ["playing games", "playing basketball"]
}
# Check the collection results. Since filebeat collects line by line, each JSON document we prepared was split into several events.
"message" => " \"name\": \"dd\",",
"message" => " \"hobby\": [\"play game\",\"play basketball\"]",
...
# filebeat's multiline merging is needed to handle this
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/
  parsers:
    - multiline:
        type: count
        count_lines: 4
output.logstash:
  hosts: ["192.168.121.93:5044"]
# Check the data collection situation
{
"message" => "{\n \"name\":\"aaa\",\n \"hobby\":[\"Writing novels\",\"Singing\"]\n}",
"name" => "aaa",
"hobby" => [
[0] "Writing a novel",
[1] "Singing"
],
"@timestamp" => 2025-03-27T07:46:14.390Z
}
Write to es
[root@elk3 ~]# grep -v "^#" /etc/logstash//
input {
beats {
port => 5044
}
}
filter {
mutate {
remove_field => [ "@version","host","agent","ecs","tags","input","log" ]
}
json {
source => "message"
}
}
output {
stdout {
codec => rubydebug
}
elasticsearch {
hosts => ["http://192.168.121.91:9200"]
}
}
ELFK architecture: e-commerce metrics analysis project case
1. Generate test data
[root@elk1 ~]# cat
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# @author : Jason Yin
import datetime
import random
import logging
import time
import sys
LOG_FORMAT = "%(levelname)s %(asctime)s [.%(module)s] - %(message)s "
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"
# Configure the basic settings of the root logger; the log file path comes from the first command-line argument
logging.basicConfig(level=logging.INFO, format=LOG_FORMAT, datefmt=DATE_FORMAT, filename=sys.argv[1], filemode='a')
actions = ["Browse page", "Comment item", "Add to favorites", "Add to cart", "Submit order", "Use coupons", "Receive coupons",
           "Search", "View Order", "Pay", "Clear Shopping Cart"]
while True:
    time.sleep(random.randint(1, 5))
    user_id = random.randint(1, 10000)
    # Keep 2 decimal places for the generated floating point number
    price = round(random.uniform(15000, 30000), 2)
    action = random.choice(actions)
    svip = random.choice([0, 1, 2])
    logging.info("DAU|{0}|{1}|{2}|{3}".format(user_id, action, svip, price))
[root@elk1 ~]# python3 /tmp/
2. View data content
[root@elk1 ~]# tail -f /tmp/
...
INFO 2025-03-27 17:03:10 [-log] - DAU|7973|Add to Cart|0|19300.65
INFO 2025-03-27 17:03:13 [-log] - DAU|8617|Add to Cart|2|19720.57
INFO 2025-03-27 17:03:14 [-log] - DAU|6879|Search|2|24774.85
INFO 2025-03-27 17:03:19 [-log] - DAU|804| Reading|2|21352.22
INFO 2025-03-27 17:03:22 [-log] - DAU|3014|Clear the shopping cart|0|19908.62
...
# Start logstash instance
[root@elk3]# cat 06-beats_apps
input {
beats {
port => 9999
}
}
filter {
mutate {
split => { "message" => "|" }
add_field => {
"other" => "%{[message][0]}"
"userId" => "%{[message][1]}"
"action" => "%{[message][2]}"
"svip" => "%{[message][3]}"
"price" => "%{[message][4]}"
}
}
mutate{
split => { "other" => " " }
add_field => {
datetime => "%{[other][1]} %{[other][2]}"
}
convert => {
"price" => "float"
}
remove_field => [ "@version","host","agent","ecs","tags","input","log","message","other"]
}
}
output {
# stdout {
# codec => rubydebug
# }
elasticsearch {
index => "linux96-logstash-elfk-apps"
hosts => ["http://192.168.121.91:9200","http://192.168.121.92:9200","http://192.168.121.93:9200"]
}
}
# Start filebeat instance
[root@elk1 ~]# cat /etc/filebeat/config/
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/
output.logstash:
  hosts: ["192.168.121.93:9999"]
ELK architecture
logstash if statement
Logstash supports if statements: with multiple inputs, different filters and different outputs can be selected via if conditions.
# Configure logstash if
[root@elk3 ~]# cat /etc/logstash//
input {
beats {
port => 9999
type => "xu-filebeat"
}
file {
path => "/var/log/nginx/"
start_position => "beginning"
type => "xu-file"
}
tcp {
port => 8888
type => "xu-tcp"
}
}
filter {
if [type] == "xu-tcp" {
mutate {
add_field => {
school => "school1"
class => "one"
}
remove_field => [ "@version","port"]
}
} else if [type] == "xu-filebeat" {
mutate {
split => { "message" => "|" }
add_field => {
"other" => "%{[message][0]}"
"userId" => "%{[message][1]}"
"action" => "%{[message][2]}"
"svip" => "%{[message][3]}"
"price" => "%{[message][4]}"
"address" => "1.1.1.1"
}
}
mutate {
split => { "other" => " " }
add_field => {
datetime => "%{[other][1]} %{[other][2]}"
}
convert => {
"price" => "float"
}
remove_field => [ "@version","host","agent","ecs","tags","input","log","message","other"]
}
date {
match => [ "datetime", "yyyy-MM-dd HH:mm:ss" ]
}
} else {
grok {
match => { "message" => "%{HTTPD_COMMONLOG}" }
}
useragent {
source => "message"
target => "xu_user_agent"
}
geoip {
source => "clientip"
database => "/usr/share/logstash/data/plugins/filters/geoip/CC/"
}
date {
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
target => "xu-timestamp"
}
mutate {
convert => {
"bytes" => "integer"
}
add_field => {
office => ""
}
remove_field => [ "@version","host","message" ]
}
}
}
output {
if [type] == "xu-filebeat" {
elasticsearch {
index => "xu-logstash-if-filebeat"
hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
}
} else if [type] == "xu-tcp" {
elasticsearch {
index => "xu-logstash-if-tcp"
hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
}
}else {
elasticsearch {
index => "xu-logstash-if-file"
hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
}
}
}
[root@elk3 ~]#
pipeline
# pipeline configuration file location
[root@elk3 ~]# ll /etc/logstash/
-rw-r--r-- 1 root root 285 Feb 18 18:52 /etc/logstash/
# Modify the pipeline configuration file
[root@elk3 ~]# tail -4 /etc/logstash/
- pipeline.id: xixi
  path.config: "/etc/logstash//"
- pipeline.id: haha
  path.config: "/etc/logstash//"
# Start logstash, you can start directly through logstash -r without specifying the configuration file
[root@elk3 ~]# logstash -r
# This fails immediately: ERROR: Failed to read pipelines yaml file. Location: /usr/share/logstash/config/
# logstash looks for this file under /usr/share/logstash/config/ by default, so we create a symlink.
# Configure soft links
[root@elk3 ~]# mkdir /usr/share/logstash/config/
[root@elk3 ~]# ln -svf /etc/logstash/ /usr/share/logstash/config/
'/usr/share/logstash/config/' -> '/etc/logstash/'
[root@elk3 ~]# logstash -r
...
[INFO ] 2025-03-29 10:16:50.372 [[xixi]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"xixi"}
[INFO ] 2025-03-29 10:16:54.380 [[haha]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"haha"}
...
ES cluster security
Basic-auth-based encryption
es cluster encryption
# It can be accessed normally before configuring encryption
[root@elk1 ~]# curl 127.1:9200/_cat/nodes
192.168.121.92 6 94 0 0.05 0.03 0.00 cdfhilmrstw - elk2
192.168.121.91 22 95 4 0.25 0.27 0.25 cdfhilmrstw * elk1
192.168.121.93 29 94 6 0.02 0.25 0.48 cdfhilmrstw - elk3
1 Generate certificate file
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert -out /etc/elasticsearch/elastic-certificates.p12 -pass ""
...
Certificates written to /etc/elasticsearch/elastic-certificates.p12
This file should be properly secured as it contains the private key for
your instance.
This file is a self contained file and can be copied and used 'as is'
For each Elastic product that you wish to configure, you should copy
this '.p12' file to the relevant configuration directory
and then follow the SSL configuration instructions in the product guide.
2. Copy the certificate file to other nodes
[root@elk1 ~]# chmod 640 /etc/elasticsearch/elastic-certificates.p12
[root@elk1 ~]# scp -r /etc/elasticsearch/elastic-certificates.p12 192.168.121.92:/etc/elasticsearch/elastic-certificates.p12
[root@elk1 ~]# scp -r /etc/elasticsearch/elastic-certificates.p12 192.168.121.93:/etc/elasticsearch/elastic-certificates.p12
3. Modify the configuration file of the ES cluster and synchronize it to all nodes
[root@elk1 ~]# tail -5 /etc/elasticsearch/
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
[root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.92:/etc/elasticsearch/
[root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.93:/etc/elasticsearch/
4. Restart es
[root@elk1 ~]# systemctl restart
# You can't access directly at this time
[root@elk1 ~]# curl 127.1:9200
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}
5. Generate random passwords
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto
Changed password for user apm_system
PASSWORD apm_system = aBsQ3WI9ydUVTx2hk2JT
Changed password for user kibana_system
PASSWORD kibana_system = xoMBWbFyYmadDyrYcwyI
Changed password for user kibana
PASSWORD kibana = xoMBWbFyYmadDyrYcwyI
Changed password for user logstash_system
PASSWORD logstash_system = fWx19jXFHinpraglh8E
Changed password for user beats_system
PASSWORD beats_system = NgKipgH0LfnFGFAazun6
Changed password for user remote_monitoring_user
PASSWORD remote_monitoring_user = Af4hu6PrhPYvn2S5zcEj
Changed password for user elastic
PASSWORD elastic = 0Nj2dpMTSNYurPqQHInA
[root@elk1 ~]# curl -u elastic:MSfRhWKA3lRhufYpxF9u 127.1:9200/_cat/nodes
192.168.121.91 40 96 22 0.62 0.74 0.53 cdfhilmrstw - elk1
192.168.121.92 17 96 20 0.44 0.67 0.36 cdfhilmrstw * elk2
192.168.121.93 23 96 32 0.54 1.00 0.73 cdfhilmrstw - elk3
Kibana connects to es
6.1 Modify the kibana configuration file
[root@elk1 ~]# tail -2 /etc/kibana/
: "kibana_system"
: "47UD4ZOypuWO100QciH4"
6.2 Restart kibana
[root@elk1 ~]# systemctl restart
6.3 Access kibana from the web
Reset es password
The es cluster has a superuser role similar to the root user. We can create a user with the superuser role and use it to change the elastic password.
1. Create a user with the superuser role
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-users useradd xu -p 123456 -r superuser
2. Change the elastic password with this administrator user
[root@elk1 ~]# curl -s --user xu:123456 -XPUT "http://localhost:9200/_xpack/security/user/elastic/_password?pretty" -H 'Content-Type: application/json' -d'
{
"password" : "654321"
}'
[root@elk1 ~]# curl -uelastic:654321 127.1:9200/_cat/nodes
192.168.121.91 35 96 7 0.38 0.43 0.52 cdfhilmrstw - elk1
192.168.121.92 20 96 2 0.20 0.20 0.25 cdfhilmrstw * elk2
192.168.121.93 27 97 5 0.10 0.18 0.38 cdfhilmrstw - elk3
filebeat connects to es with authentication
[root@elk1 ~]# cat /etc/filebeat/config/07-tcp-to-es_tls.yaml
filebeat.inputs:
- type: tcp
  host: "0.0.0.0:9000"
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  # Specify the user name to connect to the ES cluster
  username: "elastic"
  # Specify the password to connect to the ES cluster
  password: "654321"
  index: xu-es-tls-filebeat
setup.ilm.enabled: false
setup.template.name: "xu-es-tls-filebeat"
setup.template.pattern: "xu-es-tls-filebeat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
logstash connects to es with authentication
[root@elk3 ~]# cat /etc/logstash//09-tcp-to-es_tls.conf
input {
tcp {
port => 8888
}
}
output {
elasticsearch {
hosts => ["192.168.121.91:9200","192.168.121.92:9200","192.168.121.93:9200"]
index => "oldboyedu-logstash-tls-es"
user => elastic
password => "654321"
}
}
api-key
Why enable api-key
Username/password authentication requires storing user credentials in every client configuration, which exposes them.
ElasticSearch also supports api-key authentication, which is safer: an api-key cannot be used to log in to kibana, and fine-grained permission control can be attached to it.
By default, elasticsearch does not enable the api-key function; it must be turned on in the configuration file.
Start the es api function
[root@elk1 ~]# tail /etc/elasticsearch/
# Enable the api_key function
xpack.security.authc.api_key.enabled: true
# Specify the API key hashing algorithm
xpack.security.authc.api_key.hashing.algorithm: pbkdf2
# How long API keys are cached
xpack.security.authc.api_key.cache.ttl: 1d
# Upper limit on the number of cached API keys
xpack.security.authc.api_key.cache.max_keys: 10000
# Hash algorithm for API key credentials cached in memory
xpack.security.authc.api_key.cache.hash_algo: ssha256
[root@elk1 ~]# !scp
scp /etc/elasticsearch/ 192.168.121.93:/etc/elasticsearch/
[email protected]'s password:
100% 4270 949.6KB/s 00:00
[root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.92:/etc/elasticsearch/
[email protected]'s password:
[root@elk1 ~]# systemctl restart
Create an api-key
# Decode the api-key (it is the base64 encoding of id:api_key)
[root@elk1 ~]# echo "TzBCTzY1VUJiWUdnVHlBNjZRTXc6eE9JWW9wT3dTT09Sam1UNE5RYnRjUQ==" | base64 -d ;echo
O0BO65UBbYGgTyA66QMw:xOIYopOwSOORjmT4NQbtcQ
# Configure filebeat
[root@elk1 ~]# cat /etc/filebeat/config/07-tcp-to-es_tls.yaml
filebeat.inputs:
- type: tcp
  host: "0.0.0.0:9000"
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  #username: "elastic"
  #password: "654321"
  api_key: zvWA4JUBqFmHNaf3P8bM:d-goeFONRPelMuRxSr2Bxg
  index: xu-es-tls-filebeat
setup.ilm.enabled: false
setup.template.name: "xu-es-tls-filebeat"
setup.template.pattern: "xu-es-tls-filebeat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/07-tcp-to-es_tls.yaml
Create api-key and implement permission management based on ES
Reference link:
/guide/en/beats/filebeat/7.17/
/guide/en/elasticsearch/reference/7.17/#privileges-list-cluster
/guide/en/elasticsearch/reference/7.17/#privileges-list-indices
1. Create api-key
# Send a request
POST /_security/api_key
{
"name": "jasonyin2020",
"role_descriptors": {
"filebeat_monitoring": {
"cluster": ["all"],
"index": [
{
"names": ["xu-es-apikey*"],
"privileges": ["create_index", "create"]
}
]
}
}
}
# Return data
{
"id" : "0vXs4ZUBqFmHNaf3s8Zn",
"name" : "jasonyin2020",
"api_key" : "y1Vi5fL6RfGy_B47YWBXcw",
"encoded" : "MHZYczRaVUJxRm1ITmFmM3M4Wm46eTFWaTVmTDZSZkd5X0I0N1lXQlhjdw=="
}
#Analysis
[root@elk1 ~]# echo MHZYczRaVUJxRm1ITmFmM3M4Wm46eTFWaTVmTDZSZkd5X0I0N1lXQlhjdw== | base64 -d ;echo
0vXs4ZUBqFmHNaf3s8Zn:y1Vi5fL6RfGy_B47YWBXcw
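To verify the key, the encoded value can be passed in an Authorization header; a sketch against one of the ES nodes used in this document (the request succeeds because the role above grants the "all" cluster privilege):
curl -H "Authorization: ApiKey MHZYczRaVUJxRm1ITmFmM3M4Wm46eTFWaTVmTDZSZkd5X0I0N1lXQlhjdw==" http://192.168.121.91:9200/_cat/nodes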
https
es cluster configuration https
1. Create a self-signed CA certificate
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil ca --out /etc/elasticsearch/elastic-stack-ca.p12 --pass ""
[root@elk1 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12
-rw------- 1 root elasticsearch 2672 Mar 29 20:44 /etc/elasticsearch/elastic-stack-ca.p12
2. Create ES certificate based on CA certificate
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca /etc/elasticsearch/elastic-stack-ca.p12 --out /etc/elasticsearch/elastic-certificates-https.p12 --pass "" --days 3650 --ca-pass ""
[root@elk1 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12
-rw------- 1 root elasticsearch 2672 Mar 29 20:44 /etc/elasticsearch/elastic-stack-ca.p12
[root@elk1 ~]# ll /etc/elasticsearch/elastic-certificates-https.p12
-rw------- 1 root elasticsearch 3596 Mar 29 20:48 /etc/elasticsearch/elastic-certificates-https.p12
3. Modify the configuration file
[root@elk1 ~]# tail -2 /etc/elasticsearch/
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: elastic-certificates-https.p12
[root@elk1 ~]# chmod 640 /etc/elasticsearch/elastic-certificates-https.p12
[root@elk1 ~]# scp -rp /etc/elasticsearch/elastic{-certificates-https.p12,} 192.168.121.92:/etc/elasticsearch/
[email protected]'s password:
elastic-certificates-https.p12 100% 3596 1.6MB/s 00:00
100% 4378 6.0MB/s 00:00
[root@elk1 ~]# scp -rp /etc/elasticsearch/elastic{-certificates-https.p12,} 192.168.121.93:/etc/elasticsearch/
[email protected]'s password:
elastic-certificates-https.p12 100% 3596 894.2KB/s 00:00
4. Restart the ES cluster
[root@elk1 ~]# systemctl restart
[root@elk2 ~]# systemctl restart
[root@elk3 ~]# systemctl restart
[root@elk1 ~]# curl https://127.1:9200/_cat/nodes -u elastic:654321 -k
192.168.121.92 16 94 63 1.88 0.92 0.35 cdfhilmrstw - elk2
192.168.121.91 14 96 30 0.79 0.90 0.55 cdfhilmrstw * elk1
192.168.121.93 8 97 53 1.22 0.71 0.33 cdfhilmrstw - elk3
5. Modify the configuration of kibana to skip self-built certificate verification
[root@elk1 ~]# vim /etc/kibana/
...
# The ES cluster addresses now use the https protocol
elasticsearch.hosts: ["https://192.168.121.91:9200","https://192.168.121.92:9200","https://192.168.121.93:9200"]
# Skip certificate verification
elasticsearch.ssl.verificationMode: none
[root@elk1 ~]# systemctl restart
filebeat connects to es over https
# Write filebeat configuration file
[root@elk92 filebeat]# cat
filebeat.inputs:
- type: tcp
  host: "0.0.0.0:9000"
output.elasticsearch:
  hosts:
  - https://192.168.121.91:9200
  - https://192.168.121.92:9200
  - https://192.168.121.93:9200
  api_key: "m1wPlJUBrDbi_DeiIc-1:RcEw7Mk2QQKH_CGhMBnfbg"
  index: xu-es-apikey-tls-2025
  # Configure tls for the es cluster and skip certificate verification here. The default value is: full
  # Reference link:
  # /guide/en/beats/filebeat/7.17/#client-verification-mode
  ssl.verification_mode: none
setup.ilm.enabled: false
setup.template.name: "xu"
setup.template.pattern: "xu*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
logstash connects to es over https
[root@elk93 logstash]# cat 13-tcp-to-es_api
input {
tcp {
port => 8888
}
}
output {
elasticsearch {
hosts => ["192.168.121.91:9200","192.168.121.92:9200","192.168.121.93:9200"]
index => "xu-api-key"
#user => elastic
#password => "123456"xu
# Specify the api-key authentication method
api_key => "oFwZlJUBrDbi_DeiLc9O:HWBj0LC2RWiUNTudV-6CBw"
# Use api-key to start ssl
ssl => true
# Skip SSL certificate verification
ssl_certificate_verification => false
}
}
[root@elk93 logstash]#
[root@elk93 logstash]# logstash -rf 13-tcp-to-es_api
Implementing RBAC based on kibana
Reference link:
/guide/en/elasticsearch/reference/7.17/
Create a role
Create a user
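Roles and users can also be created through the security API instead of the Kibana UI; a hedged sketch from Dev Tools (role name, user name, password, and index pattern are examples):
POST /_security/role/xu_web_logs
{
  "cluster": ["monitor"],
  "indices": [
    {
      "names": ["es-web-*"],
      "privileges": ["read", "view_index_metadata"]
    }
  ]
}
POST /_security/user/xu_viewer
{
  "password": "ChangeMe-123",
  "roles": ["xu_web_logs"],
  "full_name": "Read-only log viewer"
}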
ES8 deployment
Single-node deployment of ES8
Environmental preparation:
192.168.121.191 elk191
192.168.121.192 elk192
192.168.121.193 elk193
1. Get the installation package and install es8
[root@elk191 ~]# wget /downloads/elasticsearch/elasticsearch-8.17.
[root@elk191 ~]# dpkg -i elasticsearch-8.17.
# es8 supports https by default
--------------------------- Security autoconfiguration information ------------------------------
Authentication and authorization are enabled.
TLS for the transport and HTTP layers is enabled and configured.
The generated password for the elastic built-in superuser is: P0-MRYuCOTFj*4*rGNZk # The built-in elastic superuser password
If this node should join an existing cluster, you can reconfigure this with
'/usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <token-here>'
After creating an enrollment token on your existing cluster.
You can complete the following actions at any time:
Reset the password of the elastic built-in superuser with
'/usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic'.
Generate an enrollment token for Kibana instances with
'/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana'.
Generate an enrollment token for Elasticsearch nodes with
'/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node'.
-------------------------------------------------------------------------------------------------
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
sudo systemctl daemon-reload
sudo systemctl enable
### You can start elasticsearch service by executing
sudo systemctl start
2. Start es8
[root@elk191 ~]# systemctl enable --now
Created symlink /etc/systemd/system// → /lib/systemd/system/.
[root@elk191 ~]# netstat -tunlp | grep -E "9[2|3]00"
tcp6 0 0 127.0.0.1:9300 :::* LISTEN 1669/java
tcp6 0 0 ::1:9300 :::* LISTEN 1669/java
tcp6 0 0 :::9200 :::* LISTEN 1669/java
3. Test access
[root@elk191 ~]# curl -u elastic:NVPLcMy0_n8aGL=UGAGc https://127.1:9200 -k
{
"name" : "elk191",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "-cw1TGvZSau0J2x-ThOJsg",
"version" : {
"number" : "8.17.3",
"build_flavor" : "default",
"build_type" : "deb",
"build_hash" : "a091390de485bd4b127884f7e565c0cad59b10d2",
"build_date" : "2025-02-28T10:07:26.089129809Z",
"build_snapshot" : false,
"lucene_version" : "9.12.0",
"minimum_wire_compatibility_version" : "7.17.0",
"minimum_index_compatibility_version" : "7.0.0"
},
"tagline" : "You Know, for Search"
}
[root@elk191 ~]# curl -u elastic:NVPLcMy0_n8aGL=UGAGc https://127.1:9200/_cat/nodes -k
127.0.0.1 9 97 13 0.35 0.59 0.31 cdfhilmrstw * elk191
Deploy kibana8
1. Get the installation package and install kibana
[root@elk191 ~]# wget /downloads/kibana/kibana-8.17.
[root@elk191 ~]# dpkg -i kibana-8.17.
2. Configure kibana
[root@elk191 ~]# grep -vE "^$|^#" /etc/kibana/
server.port: 5601
server.host: "0.0.0.0"
logging:
  appenders:
    file:
      type: file
      fileName: /var/log/kibana/
      layout:
        type: json
  root:
    appenders:
      - default
      - file
pid.file: /run/kibana/
i18n.locale: "zh-CN"
3. Start kibana
[root@elk191 ~]# systemctl enable --now
[root@elk191 ~]# ss -ntl | grep 5601
LISTEN 0 511 0.0.0.0:5601 0.0.0.0:*
4. Generate a kibana dedicated token
[root@elk191 ~]# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana
eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiZmNjMWI3MzJlNzIwMzMzMjI0ZDc5Zjk1YTUyZjIzZmUyNjMzMzYwZDIxY2Q0NzY3YjQ2ZjExZDhiOGYxZTFlZiIsImtleSI6IjdjNTk3SlVCeEI5S3NHd1ZPWVQ5OmYtN0FRWkhEUTVtMnlCZXdiMnJLbXcifQ==
The server obtains the verification code
[root@elk191 ~]# /usr/share/kibana/bin/kibana-verification-code
Your verification code is: 414 756
es8 cluster deployment
1. Copy the configuration file to other nodes
[root@elk191 ~]# scp elasticsearch-8.17. 10.0.0.192:~
[root@elk191 ~]# scp elasticsearch-8.17. 10.0.0.193:~
2. Install ES8 software packages from other nodes
[root@elk192 ~]# dpkg -i elasticsearch-8.17.
[root@elk193 ~]# dpkg -i elasticsearch-8.17.
# Configure es8
[root@elk191 ~]# grep -Ev "^$|^#" /etc/elasticsearch/
cluster.name: xu-application
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
discovery.seed_hosts: ["192.168.121.191","192.168.121.192","192.168.121.193"]
cluster.initial_master_nodes: ["192.168.121.191","192.168.121.192","192.168.121.193"]
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
http.host: 0.0.0.0
3. Generate token token file in any node of the existing cluster
[root@elk191 ~]# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node
4. Use the token to reconfigure the new nodes that are to join the cluster
[root@elk192 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiMzIwODY0YzMxNmEyMDQ4YmIwYzVjNDNhY2FlZjQ4MTg2OTM3MmVhNTg2NjdiYTAwMjBjN2Y2ZTczN2YzNWU0MCIsImtleSI6IkE3RTY4SlVCU1BhTWhMRFN0VWdlOmdaM0dIS0RNUndld3o3ZWM0Qk1ySEEifQ==
[root@elk193 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiMzIwODY0YzMxNmEyMDQ4YmIwYzVjNDNhY2FlZjQ4MTg2OTM3MmVhNTg2NjdiYTAwMjBjN2Y2ZTczN2YzNWU0MCIsImtleSI6IkE3RTY4SlVCU1BhTWhMRFN0VWdlOmdaM0dIS0RNUndld3o3ZWM0Qk1ySEEifQ==
5. Synchronize configuration files
[root@elk191 ~]# scp /etc/elasticsearch/ 192.168.121.192:/etc/elasticsearch/
[root@elk191 ~]# scp /etc/elasticsearch/ 192.168.121.193:/etc/elasticsearch/
6. Start the client es
[root@elk192 ~]# systemctl enable --now
[root@elk193 ~]# systemctl enable --now
7. Access Test
[root@elk193 ~]# curl -u elastic:123456 -k https://192.168.121.191:9200/_cat/nodes
192.168.121.191 17 97 10 0.61 0.55 0.72 cdfhilmrstw * elk191
192.168.121.193 15 97 55 1.72 1.05 0.49 cdfhilmrstw - elk193
192.168.121.192 13 97 4 0.25 0.45 0.52 cdfhilmrstw - elk192
Common Errors
1. Common error handling Q1:
ERROR: Aborting enrolling to cluster. Unable to remove existing secure settings. Error was: Aborting enrolling to cluster. Unable to remove existing security configuration, did not contain expected setting [autoconfiguration.password_hash]., with exit code 74
Problem analysis:
The local security configuration has already been generated; delete the previously generated security configuration file.
Solution:
rm -f /etc/elasticsearch/
Common error handling Q2:
ERROR: Skipping security auto configuration because this node is configured to bootstrap or to join a multi-node cluster, which is not supported., with exit code 80
Solution:
export IS_UPGRADE=false
Common error handling Q3:
ERROR: Aborting enrolling to cluster. This node doesn't appear to be auto-configured for security. Expected configuration is missing from ., with exit code 64
Error analysis:
Checking the configuration file shows that the security-related configuration is missing, probably because synchronization failed.
Solution:
Modify "/etc/elasticsearch/" and add security configuration. You can manually copy the configuration of the elk192 node to the node.
If it still cannot be solved, you can compare the differences between the configuration of the elk191 node and the configuration file of the elk192, and then copy the corresponding configuration. I'm testing here that I'm missing the certs directory.
[root@elk191 ~]# scp -rp /etc/elasticsearch/certs/ 10.0.0.192:/etc/elasticsearch/
[root@elk191 ~]# scp /etc/elasticsearch/ 10.0.0.192:/etc/elasticsearch/
[root@elk191 ~]# scp /etc/elasticsearch/ 10.0.0.192:/etc/elasticsearch/
[root@elk191 ~]# scp -rp /etc/elasticsearch/ 10.0.0.192:/etc/elasticsearch/
ERROR: Aborting enrolling to cluster. Unable to remove existing secure settings. Error was: Aborting enrolling to cluster. Unable to remove existing security configuration, did not contain expected setting [.secure_password]., with exit code 74
The difference between es8 and es7
- Comparison of ES8 and ES7 deployment
1. ES8 enables https by default and supports authentication and related features;
2. ES8 adds the 'elasticsearch-reset-password' script, which makes it easier to reset the elastic user's password;
3. ES8 adds the 'elasticsearch-create-enrollment-token' script, which creates enrollment tokens for components such as kibana;
4. ES8's kibana adds the 'kibana-verification-code' script to generate a verification code;
5. Kibana supports more languages: English (default) "en", Chinese "zh-CN", Japanese "ja-JP", French "fr-FR";
6. The web UI is richer, supporting AI assistants, manual index creation and other functions;
7. When deploying an ES8 cluster, new nodes join the existing cluster via the 'elasticsearch-reconfigure-node' script; the default configuration is a single-node cluster.
ES7 JVM Tuning
1. By default, the JVM heap consumes half of the physical machine's memory
[root@elk91 ~]# ps -ef | grep java | grep Xms
elastic+ 10045 1 2 Mar14 ? 00:56:32 /usr/share/elasticsearch/jdk/bin/java ... -Xms1937m -Xmx1937m ...
2. About the tuning principle of ES cluster
- 1. The JVM heap of each node should be half of the physical memory, but no more than 32GB;
- 2. For example, with 32GB of memory the default heap is 16GB, but a 128GB machine would also default to half (64GB), so it must be manually capped at 32GB;
3. Set the ES heap to 256MB
[root@elk1 ~]# vim /etc/elasticsearch/
[root@elk1 ~]# egrep "^-Xm[s|x]" /etc/elasticsearch/
-Xms256m
-Xmx256m
4. Copy the configuration file and scroll to restart the ES7 cluster
[root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.92:/etc/elasticsearch/
100% 3474 2.7MB/s 00:00
[root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.93:/etc/elasticsearch/
[root@elk1 ~]# systemctl restart
[root@elk2 ~]# systemctl restart
[root@elk3 ~]# systemctl restart
5. Test verification
[root@elk1 ~]# free -h
total used free shared buff/cache available
Mem: 3.8Gi 1.1Gi 1.9Gi 1.0Mi 800Mi 2.4Gi
Swap: 3.8Gi 26Mi 3.8Gi
[root@elk1 ~]# ps -ef | grep java | grep Xms
-Xms256m -Xmx256m
curl -k -u elastic:123456 https://127.1:9200/_cat/nodes
192.168.121.92 68 67 94 4.01 2.12 0.96 cdfhilmrstw * elk2
192.168.121.91 59 56 42 1.72 0.87 0.43 cdfhilmrstw - elk1
192.168.121.93 63 61 92 3.30 2.26 1.14 cdfhilmrstw - elk3