
ElasticStack from Beginner to Mastery


What is ElasticStack

ElasticStack was formerly known as ELK.

ELK stands for three components:

  • ElasticSearch
    Responsible for data storage and retrieval.
  • Logstash
    Responsible for data collection, shipping source data to ElasticSearch for storage.
  • Kibana
    Responsible for displaying data. Similar to Grafana.

Since Logstash is heavyweight (its installation package exceeds 300 MB) and many users only need log collection, lighter collection tools such as Flume and Fluentd are often used in its place.

Elastic later recognized this problem as well and developed the Beats family of products, typified by Filebeat, Metricbeat, Heartbeat, etc.

Later, for security, related components such as X-Pack were launched, along with cloud offerings.

The name then became ELK Stack (the ELK technology stack), and the company later promoted it as ElasticStack.

ElasticStack Architecture

ElasticStack Version

Elastic official website

The latest version is 8+; in version 8 the HTTPS protocol is enabled by default. We first install version 7.17 and enable HTTPS manually later.

Next, practice installing version 8

Choose the elastic installation method, and then deploy elastic on Ubuntu

Binary package deployment stand-alone es environment

deploy

1. Download the elk installation package
 root@elk:~# cat install_elk.sh
 #!/bin/bash
 wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-linux-x86_64.tar.gz
 wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-linux-x86_64.tar.gz.sha512
 shasum -a 512 -c elasticsearch-7.17.28-linux-x86_64.tar.gz.sha512
 tar -xzf elasticsearch-7.17.28-linux-x86_64.tar.gz -C /usr/local
 cd elasticsearch-7.17.28/


 2. Modify the configuration file
 root@elk:~# vim /usr/local/elasticsearch-7.17.28/config/elasticsearch.yml
 root@elk:~# egrep -v "^#|^$" /usr/local/elasticsearch-7.17.28/config/elasticsearch.yml
 cluster.name: xu-elasticstack
 path.data: /var/lib/es7
 path.logs: /var/log/es7
 network.host: 0.0.0.0
 discovery.type: single-node
 Related parameter description:
	 http.port
		 The default port is 9200.
		 # By default Elasticsearch listens for HTTP traffic on the first free port it
		 # finds starting at 9200. Set a specific HTTP port here:
		 #
		 #http.port: 9200

	 cluster.name
		 The name of the cluster.

	 path.data
		 The data storage path of ES.

	 path.logs
		 The log storage path of ES.

	 network.host
		 # By default ElasticStack only allows access from the local host
		 # By default Elasticsearch is only accessible on localhost. Set a different
		 # address here to expose this node on the network:
		 #
		 #network.host: 192.168.0.1

		 The address the ES service listens on.

	 discovery.type
		 # When deploying an ES cluster, the discovery.seed_hosts and cluster.initial_master_nodes parameters need to be configured
		 # Pass an initial list of hosts to perform discovery when this node is started:
		 # The default list of hosts is ["127.0.0.1", "[::1]"]
		 #
		 #discovery.seed_hosts: ["host1", "host2"]
		 #
		 # Bootstrap the cluster using an initial set of master-eligible nodes:
		 #
		 #cluster.initial_master_nodes: ["node-1", "node-2"]
		 Refers to the deployment type of the ES cluster. Here "single-node" means a single-node environment.
 3. If you start elastic directly at this time, an error will be reported
 3.1 Test the error with the official startup command
 Elasticsearch can be started from the command line as follows:
 ./bin/elasticsearch
 root@elk:~# /usr/local/elasticsearch-7.17.28/bin/elasticsearch
 # These errors are reported by the JVM
 Mar 17, 2025 7:44:51 AM sun.util.locale.provider.LocaleProviderAdapter <clinit>
 WARNING: COMPAT locale provider will be removed in a future release
 [2025-03-17T07:44:53,125][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [elk] uncaught exception in thread [main]
 org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
	 at ... ~[elasticsearch-7.17.28.jar:7.17.28]
 Caused by: java.lang.RuntimeException: can not run elasticsearch as root
	 at ... ~[elasticsearch-7.17.28.jar:7.17.28]
	 ... 6 more
 uncaught exception in thread [main]
 java.lang.RuntimeException: can not run elasticsearch as root # root is not allowed to start elasticsearch directly
	 at ...
 For complete error details, refer to the log at /var/log/es7/
 2025-03-17 07:44:53,713764 UTC [1860] INFO Main.cc@111 Parent process died - ML controller exiting

 3.2 Create a startup user
 root@elk:~# useradd -m elastic
 root@elk:~# id elastic
 uid=1001(elastic) gid=1001(elastic) groups=1001(elastic)
 # Start as the elastic user; an error still occurs at this point
 root@elk:~# su - elastic -c "/usr/local/elasticsearch-7.17.28/bin/elasticsearch"
 could not find java in bundled JDK at /usr/local/elasticsearch-7.17.28/jdk/bin/java
 # The bundled Java exists on the system, but the elastic user cannot find it. Switch to the elastic user to investigate.
 root@elk:~# ll /usr/local/elasticsearch-7.17.28/jdk/bin/java
 -rwxr-xr-x 1 root root 12328 Feb 20 09:09 /usr/local/elasticsearch-7.17.28/jdk/bin/java*
 root@elk:~# su - elastic
 $ pwd
 /home/elastic
 $ ls /usr/local/elasticsearch-7.17.28/jdk/bin/java
 # The error is a permission problem: the elastic user does not have permission to access the bundled java binary
 ls: cannot access '/usr/local/elasticsearch-7.17.28/jdk/bin/java': Permission denied
 # Checking the path level by level, the /usr/local/elasticsearch-7.17.28/jdk/bin directory turns out to be inaccessible to the elastic user, which causes the error
 root@elk:~# chown elastic:elastic -R /usr/local/elasticsearch-7.17.28/
 root@elk:~# ll -d /usr/local/elasticsearch-7.17.28/jdk/bin/
 drwxr-x--- 2 elastic elastic 4096 Feb 20 09:09 /usr/local/elasticsearch-7.17.28/jdk/bin//

 # Then start again to test, and another error appears
 # The data and log paths we specified do not exist yet; we need to create them manually
 : Unable to access 'path.data' (/var/lib/es7)
 : : Unable to access 'path.data' (/var/lib/es7)
 root@elk:~# install -d /var/{log,lib}/es7 -o elastic -g elastic
 root@elk:~# ll -d /var/{log,lib}/es7
 drwxr-xr-x 2 elastic elastic 4096 Mar 17 08:01 /var/lib/es7/
 drwxr-xr-x 2 elastic elastic 4096 Mar 17 07:44 /var/log/es7/

 # Now start the service again; it starts successfully, and the ports are listening
 root@elk:~# su - elastic -c "/usr/local/elasticsearch-7.17.28/bin/elasticsearch"
 root@elk:~# netstat -tunlp | egrep "9[2|3]00"
 tcp6 0 0 :::9200 :::* LISTEN 2544/java
 tcp6 0 0 :::9300 :::* LISTEN 2544/java

Access 9200 via browser

Elasticsearch also provides an API for viewing the nodes currently in the cluster

[root@zabbix ~]# curl 192.168.121.21:9200/_cat/nodes
 172.16.1.21 40 97 0 0.11 0.29 0.20 cdfhilmrstw * elk
 # Accessed from the command line; since es is currently deployed as a single node, only one node is listed

 # So far we have been starting es in the foreground, which has two problems:
 1. It occupies the terminal
 2. It is inconvenient to stop es, so we usually run it in the background instead.
 The official way to run it in the background:
 -d parameter of elasticsearch
 To run Elasticsearch as a daemon, specify -d on the command line, and record the process ID in a file using the -p option:
 ./bin/elasticsearch -d -p pid
 root@elk:~# su - elastic -c '/usr/local/elasticsearch-7.17.28/bin/elasticsearch -d'

 # Common errors
 Q1: The maximum virtual memory map is too small
 bootstrap check failure [1] of [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
 ERROR: Elasticsearch did not exit normally - check the logs at /var/log/es7/
 root@elk:~# sysctl -q vm.max_map_count
 vm.max_map_count = 65530
 root@elk:~# echo "vm.max_map_count = 262144" >> /etc//
 root@elk:~# sysctl -w vm.max_map_count=262144
 vm.max_map_count = 262144
 root@elk:~# sysctl -q vm.max_map_count
 vm.max_map_count = 262144


 Q2: es configuration file is written incorrectly
 discovery.type: single-node


 Q3: The word "lock" in the error indicates that an ES instance is already running. Kill the existing process and then run the startup command again.
 java.lang.IllegalStateException: failed to obtain node locks, tried [[/var/lib/es7]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?

 Q5: There is a problem with the deployment of ES cluster and the master role is missing.
 {"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

Uninstall the environment

1. Stop elasticsearch
 root@elk:~# kill `ps -ef | grep java | grep -v grep |awk '{print $2}'`
 root@elk:~# ps -ef | grep java
 root 4437 1435 0 09:21 pts/2 00:00:00 grep --color=auto java

 2. Delete data directory, log directory, installation package, user
 root@elk:~# rm -rf /usr/local/elasticsearch-7.17.28/ /var/{lib,log}/es7/
 root@elk:~# userdel -r elastic

Install ES single point based on deb package

1. Install the deb package
 root@elk:~# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb

 2. Install es
 root@elk:~# dpkg -i elasticsearch-7.17.28-amd64.deb
 # es installed from the deb package can be managed with systemctl
 ### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
  sudo systemctl daemon-reload
  sudo systemctl enable elasticsearch.service
 ### You can start elasticsearch service by executing
  sudo systemctl start elasticsearch.service
 Created elasticsearch keystore in /etc/elasticsearch/elasticsearch.keystore


 3. Modify the es configuration file
 root@elk:~# vim /etc/elasticsearch/elasticsearch.yml
 root@elk:~# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
 cluster.name: xu-es
 path.data: /var/lib/elasticsearch
 path.logs: /var/log/elasticsearch
 network.host: 0.0.0.0
 discovery.type: single-node

 4. Start es
 systemctl enable elasticsearch --now
 # Check the es service unit file. The following settings are what we had to do by hand in the binary installation.
 User=elasticsearch
 Group=elasticsearch
 ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet

 cat /usr/share/elasticsearch/bin/systemd-entrypoint
 #!/bin/sh

 # This wrapper script allows SystemD to feed a file containing a passphrase into
 # the main Elasticsearch startup script

 if [ -n "$ES_KEYSTORE_PASSPHRASE_FILE" ] ; then
   exec /usr/share/elasticsearch/bin/elasticsearch "$@" < "$ES_KEYSTORE_PASSPHRASE_FILE"
 else
   exec /usr/share/elasticsearch/bin/elasticsearch "$@"
 fi

Common Terms

1. Index (index)
	 The unit in which users read and write data.
 2. Shard (shard)
	 An index must have at least one shard. If an index has only one shard, all of that index's data is stored in full on a single node; the data is not split across shards on different nodes.
	 In other words, the shard is the smallest scheduling unit in an ES cluster.
	 An index's data can also be spread across multiple shards, and those shards can be placed on different nodes, achieving distributed storage of the data.
 3. Replica (replica)
	 Replicas are defined per shard; a shard can have 0 or more replicas.
	 When the number of replicas is 0, only the primary shard exists; if the node holding the primary shard goes down, the data becomes inaccessible.
	 When the number of replicas is greater than 0, there are both primary shards and replica shards (see the example after this list):
		 The primary shard handles reads and writes (read write, rw)
		 Replica shards handle read load balancing (read only, ro)
 4. Document (document)
	 Refers to the data stored by the user. It contains metadata and source data.
	 Metadata:
		 Data that describes the source data.
	 Source data:
		 The data actually stored by the user.
 5. Allocation (allocation)
	 The process of assigning the shards of an index (both primary and replica shards) across the cluster.
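To make these terms concrete, here is a minimal sketch using curl (assuming a node reachable at 127.0.0.1:9200; the index name test_terms is made up for illustration): it creates an index with 3 primary shards and 1 replica per shard, writes one document, and then shows how the shards are allocated across nodes.

 # Create an index with 3 primary shards and 1 replica per shard (hypothetical index name)
 curl -X PUT '127.0.0.1:9200/test_terms' \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"number_of_shards": 3, "number_of_replicas": 1}}'

 # Write one document; ES stores the source data together with metadata such as _index, _id and _version
 curl -X POST '127.0.0.1:9200/test_terms/_doc' \
  -H 'Content-Type: application/json' \
  -d '{"name": "test"}'

 # Show how the primary (p) and replica (r) shards of the index are allocated to nodes
 curl '127.0.0.1:9200/_cat/shards/test_terms?v'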

Check cluster status

#es provides api /_cat/health
 root@elk:~# curl 127.1:9200/_cat/health
 1742210504 11:21:44 xu-es green 1 1 3 3 0 0 0 0 - 100.0%
 root@elk:~# curl 127.1:9200/_cat/health?v
 epoch timestamp cluster status shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
 1742210512 11:21:52 xu-es green 1 1 3 3 0 0 0 0 - 100.0%

ES cluster environment deployment

1. Install the es cluster service
 root@elk1:~# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb
 root@elk1:~# dpkg -i elasticsearch-7.17.28-amd64.deb
 root@elk2:~# dpkg -i elasticsearch-7.17.28-amd64.deb
 root@elk3:~# dpkg -i elasticsearch-7.17.28-amd64.deb


 2. Configure es; all three machines use the same configuration
 # No extra configuration is required
 [root@elk1 ~]# grep -E "^(cluster|path|network|discovery|http)" /etc/elasticsearch/elasticsearch.yml
 cluster.name: es-cluster
 path.data: /var/lib/elasticsearch
 path.logs: /var/log/elasticsearch
 network.host: 0.0.0.0
 http.port: 9200
 discovery.seed_hosts: ["192.168.121.91", "192.168.121.92", "192.168.121.93"]


 3. Start the service
 systemctl enable elasticsearch --now

 4. Test; the master node is marked with *
 root@elk:~# curl 127.1:9200/_cat/nodes
 172.16.1.23 6 97 25 0.63 0.57 0.25 cdfhilmrstw - elk3
 172.16.1.22 5 96 23 0.91 0.76 0.33 cdfhilmrstw - elk2
 172.16.1.21 19 90 39 1.22 0.87 0.35 cdfhilmrstw * elk
 root@elk:~# curl 127.1:9200/_cat/nodes?v
 ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
 172.16.1.23 9 83 2 0.12 0.21 0.18 cdfhilmrstw - elk3
 172.16.1.22 8 96 3 0.16 0.28 0.24 cdfhilmrstw - elk2
 172.16.1.21 22 97 3 0.09 0.30 0.25 cdfhilmrstw * elk

 # Cluster deployment failure: no cluster uuid, cluster missing master

 [root@elk3 ~]# curl http://192.168.121.92:9200/_cat/nodes?v
 {"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
 [root@elk3 ~]# curl 192.168.121.91:9200
 {
   "name" : "elk91",
   "cluster_name" : "es-cluster",
   "cluster_uuid" : "_na_",
   ...
 }
 [root@elk3 ~]#
 [root@elk3 ~]# curl 10.0.0.92:9200
 {
   "name" : "elk92",
   "cluster_name" : "es-cluster",
   "cluster_uuid" : "_na_",
   ...
 }
 [root@elk3 ~]#
 [root@elk3 ~]#
 [root@elk3 ~]# curl 10.0.0.93:9200
 {
   "name" : "elk93",
   "cluster_name" : "es-cluster",
   "cluster_uuid" : "_na_",
   ...
 }
 [root@elk3 ~]#
 [root@elk3 ~]# curl http://192.168.121.91:9200/_cat/nodes
 {"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}


 # Solution
 1. Stop the ES service of the cluster
 [root@elk91 ~]# systemctl stop elasticsearch.service
 [root@elk92 ~]# systemctl stop elasticsearch.service
 [root@elk93 ~]# systemctl stop elasticsearch.service


 2. Delete data, logs, and temporary data
 [root@elk91 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
 [root@elk92 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
 [root@elk93 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*

 3. Add configuration items
 [root@elk1 ~]# grep -E "^(cluster|path|network|discovery|http)" /etc/elasticsearch/elasticsearch.yml
 cluster.name: es-cluster
 path.data: /var/lib/elasticsearch
 path.logs: /var/log/elasticsearch
 network.host: 0.0.0.0
 http.port: 9200
 discovery.seed_hosts: ["192.168.121.91", "192.168.121.92", "192.168.121.93"]
 cluster.initial_master_nodes: ["192.168.121.91", "192.168.121.92", "192.168.121.93"] ######

 4. Restart the service

 5. Test



 es cluster master election process
 1. At startup, each node checks whether the cluster already has a master; if one exists, no election is initiated.
 2. Otherwise, every node starts as a master candidate and sends its information (ClusterStateVersion, node ID, etc.) to the other nodes in the cluster.
 3. Based on a gossip-like protocol, each node obtains the list of all nodes eligible to take part in the master election.
 4. ClusterStateVersion is compared first; the node with the highest version has priority to become master.
 5. If that does not decide it, the node IDs are compared; the node with the smaller ID is preferred.
 6. The election completes once more than half of the nodes in the cluster have taken part; with N nodes, only (N/2)+1 nodes are needed to confirm the master.
 7. After the election, the latest master node is announced to the cluster, and the election is complete (a quick check of the result is shown below).
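To verify which node was elected, the _cat APIs can be queried directly (a quick check, assuming a node reachable at 127.0.0.1:9200):

 # Show the elected master node (node id, host, ip, node name)
 curl 127.0.0.1:9200/_cat/master?v
 # In _cat/nodes the master node is marked with * in the master column
 curl 127.0.0.1:9200/_cat/nodes?v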

DSL

  • ES compared to MySQL
    • MySQL is a relational database:
      Add, delete, modify and search: Based on SQL
    • ES is a document-oriented database, very similar to MongoDB.
      Add, delete, modify and search: via DSL statements, ES's own query language.
      For fuzzy queries, MySQL cannot make full use of indexes and performs poorly; ES handles fuzzy queries very efficiently instead (see the sketch below).
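As a small illustration of a non-exact DSL query, the sketch below runs a match query with fuzziness enabled, so a slightly misspelled term still hits (it assumes the test_linux_ss index created in the bulk example further down; the host and index name are only illustrative):

 curl --location --request GET '192.168.121.21:9200/test_linux_ss/_search' \
  --header 'Content-Type: application/json' \
  --data '{
      "query": {
          "match": {
              "name": {
                  "query": "Zhu Baji",
                  "fuzziness": "AUTO"
              }
          }
      }
  }'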

Add a single request to es

Testing with postman

# In essence, curl is used
 curl --location 'http://192.168.121.21:9200/test_linux/doc' \
 --header 'Content-Type: application/json' \
 --data '{
     "name": "Sun Wukong",
     "hobby": [
         "Peach",
         "Zixia Fairy"
     ]
 }'



 curl --location '192.168.121.21:9200/_bulk' \
 --header 'Content-Type: application/json' \
 --data '{ "create" : { "_index" : "test_linux_ss", "_id" : "1001" } }
 { "name" : "Zhu Bajie","hobby": ["Monkey Brother","Gao Laozhuang"] }
 {"create": {"_index":"test_linux_ss","_id":"1002"}}
 {"name":"White Dragon Horse","hobby":["Carry Tang Monk","Eat Grass"]}
 '

Query data

curl --location '192.168.121.22:9200/test_linux_ss/_doc/1001' \
 --data ''



 curl --location --request GET '192.168.121.22:9200/test_linux_ss/_search' \
 --header 'Content-Type: application/json' \
 --data '{
     "query":{
         "match":{
             "name":"Zhu Bajie"
         }
     }
 }'

Delete data

curl --location --request DELETE '192.168.121.22:9200/test_linux_ss/_doc/1001'

kibana

Deploy kibana

Kibana is a visualization tool for ES. From here on, operations on ES can be performed in Kibana.

1. Download kibana
 root@elk:~# wget https://artifacts.elastic.co/downloads/kibana/kibana-7.17.28-amd64.deb

 2. Install kibana
 root@elk:~# dpkg -i kibana-7.17.28-amd64.deb

 3. Modify the configuration file
 root@elk:~# vim /etc/kibana/kibana.yml
 root@elk:~# grep -E "^(elasticsearch|i18n|server)" /etc/kibana/kibana.yml
 server.port: 5601
 server.host: "0.0.0.0"
 elasticsearch.hosts: ["http://192.168.121.21:9200","http://192.168.121.22:9200","http://192.168.121.23:9200"]
 i18n.locale: "zh-CN"

 4. Start kibana
 root@elk:~# systemctl enable --now kibana
 Synchronizing state of kibana.service with SysV service script with /lib/systemd/systemd-sysv-install.
 Executing: /lib/systemd/systemd-sysv-install enable kibana
 Created symlink /etc/systemd/system/multi-user.target.wants/kibana.service → /etc/systemd/system/kibana.service.
 root@elk:~# netstat -tunlp | grep 5601
 tcp 0 0 0.0.0.0:5601 0.0.0.0:* LISTEN 19392/node

Web access testing

Basic usage of KQL

Filter data

Filebeat

Deploy Filebeat

1. Download Filebeat
 root@elk2:~# wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.28-amd64.deb

 2. Install Filebeat
 root@elk2:~# dpkg -i filebeat-7.17.28-amd64.deb

 3. Write Filebeat configuration file
 # Filebeat requires us to create the config directory ourselves and then write the configuration file
 mkdir /etc/filebeat/config
 vim /etc/filebeat/config/
 # A Filebeat configuration file has two parts, Input and Output: Input says where to collect data from, Output says where to send it. Configure them according to the official documentation.
 # There is no service producing logs yet, so the Input source is a test file and the Output is the terminal (console)

 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from a file
 - type: log
   # Specify the path to the file
   paths:
     - /tmp/

 # Send the data to the terminal
 output.console:
   pretty: true

 4. Run filebeat instance
 filebeat -e -c /etc/filebeat/config/

 5. Create a file and write data
 root@elk2:~# echo ABC > /tmp/

 // Output prompt
 {
   "@timestamp": "2025-03-18T14:48:42.432Z",
   "@metadata": {
     "beat": "filebeat",
     "type": "_doc",
     "version": "7.17.28"
   },
   "message": "ABC", // The detected content changes
   "input": {
     "type": "log"
   },
   "ecs": {
     "version": "1.12.0"
   },
   "host": {
     "name": "elk2"
   },
   "agent": {
     "type": "filebeat",
     "version": "7.17.28",
     "hostname": "elk2",
     "ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe",
     "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
     "name": "elk2"
   },
   "log": {
     "offset": 0, // Offset offset=0 means starting from 0
     "file": {
       "path": "/tmp/"
     }
   }
 }


 # Append write data to the file
 root@elk2:~# echo 123 >> /tmp/

 // Check filebeat output prompts
 {
   "@timestamp": "2025-03-18T14:51:17.449Z",
   "@metadata": {
     "beat": "filebeat",
     "type": "_doc",
     "version": "7.17.28"
   },
   "log": {
     "offset": 4, // The offset starts from 4
     "file": {
       "path": "/tmp/"
     }
   },
   "message": "123", // Statistics 123
   "input": {
     "type": "log"
   },
   "ecs": {
     "version": "1.12.0"
   },
   "host": {
     "name": "elk2"
   },
   "agent": {
     "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
     "name": "elk2",
     "type": "filebeat",
     "version": "7.17.28",
     "hostname": "elk2",
     "ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe"
   }
 }

Filebeat Features

# Append with echo -n (no trailing newline); at this point filebeat cannot collect the data
 root@elk2:~# echo -n 456 >> /tmp/
 root@elk2:~# cat /tmp/
 ABC
 123
 456root@elk2:~#
 root@elk2:~# echo -n abc >> /tmp/
 root@elk2:~# cat /tmp/
 ABC
 123
 456789abcroot@elk2:~#

 # Write data with a trailing newline; now filebeat can collect it
 root@elk2:~# echo haha >> /tmp/
 root@elk2:~# cat /tmp/
 ABC
 123
 456789abchaha


 // View filebeat output information
 {
   "@timestamp": "2025-03-18T14:55:37.476Z",
   "@metadata": {
     "beat": "filebeat",
     "type": "_doc",
     "version": "7.17.28"
   },
   "host": {
     "name": "elk2"
   },
   "agent": {
     "name": "elk2",
     "type": "filebeat",
     "version": "7.17.28",
     "hostname": "elk2",
     "ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe",
     "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf"
   },
   "log": {
     "offset": 8, // Offset=8
     "file": {
       "path": "/tmp/"
     }
   },
   "message": "456789abchaha", // The collected data is all data with silent output + non-silent output
   "input": {
     "type": "log"
   },
   "ecs": {
     "version": "1.12.0"
   }
 }

From this we can derive the first characteristic of Filebeat:

filebeat collects data line by line by default; a line is only collected once it is terminated by a newline.

# Now stop Filebeat and append to the file. When collection is started again, will Filebeat collect the entire file, or only the content added while it was stopped?
 root@elk2:~# echo xixi >> /tmp/

 # Restart Filebeat

 // View Filebeat output information
 {
   "@timestamp": "2025-03-18T15:00:51.759Z",
   "@metadata": {
     "beat": "filebeat",
     "type": "_doc",
     "version": "7.17.28"
   },
   "ecs": {
     "version": "1.12.0"
   },
   "host": {
     "name": "elk2"
   },
   "agent": {
     "type": "filebeat",
     "version": "7.17.28",
     "hostname": "elk2",
     "ephemeral_id": "81db6575-7f98-4ca4-a86f-4d0127c1e2a4",
     "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
     "name": "elk2"
   },
   "log": {
     "offset": 22, // The offset does not start from 0 either
     "file": {
       "path": "/tmp/"
     }
   },
   "message": "xixi", // Only collect new content after Filebeat is stopped
   "input": {
     "type": "log"
   }
 }


 // There is a hint in the log above: when filebeat starts again, it loads the json file in the /var/lib/filebeat/registry/filebeat directory
 2025-03-18T15:00:51.756Z INFO memlog/store.go:124 Finished loading transaction log file for '/var/lib/filebeat/registry/filebeat'. Active transaction id=5
 // The very first start also tries to load from /var/lib/filebeat/registry/filebeat, but that directory does not exist yet, so there is naturally no information in it.

 // Look at the json file under /var/lib/filebeat/registry/filebeat: the offset value is recorded in it. This is why filebeat does not re-read a file from scratch after being stopped and restarted.
 {"op":"set","id":1}
 {"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","prev_id":"","timestamp":[431172511,1742309322],"ttl":-1,"identifier_name":"native","source":"/tmp/","offset":0,"type":"log","FileStateOS":{"inode":1441831,"device":64768}}}
 {"op":"set","id":2}
 {"k":"filebeat::logs::native::1441831-64768","v":{"prev_id":"","source":"/tmp/","type":"log","FileStateOS":{"inode":1441831,"device":64768},"id":"native::1441831-64768","offset":4,"timestamp":[434614328,1742309323],"ttl":-1,"identifier_name":"native"}}
 {"op":"set","id":3}
 {"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","identifier_name":"native","ttl":-1,"type":"log","FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","source":"/tmp/","offset":8,"timestamp":[450912955,1742309478]}}
 {"op":"set","id":4}
 {"k":"filebeat::logs::native::1441831-64768","v":{"type":"log","identifier_name":"native","offset":22,"timestamp":[478003874,1742309738],"source":"/tmp/","ttl":-1,"FileStateOS":{"inode":1441831,"device":64768},"id":"native::1441831-64768","prev_id":""}}
 {"op":"set","id":5}
 {"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","ttl":-1,"FileStateOS":{"device":64768,"inode":1441831},"identifier_name":"native","prev_id":"","source":"/tmp/","offset":22,"timestamp":[478003874,1742309738],"type":"log"}}
 {"op":"set","id":6}
 {"k":"filebeat::logs::native::1441831-64768","v":{"offset":22,"timestamp":[759162512,1742310051],"type":"log","FileStateOS":{"device":64768,"inode":1441831},"id":"native::1441831-64768","prev_id":"","identifier_name":"native","source":"/tmp/","ttl":-1}}
 {"op":"set","id":7}
 {"k":"filebeat::logs::native::1441831-64768","v":{"offset":22,"timestamp":[759368397,1742310051],"type":"log","FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","source":"/tmp/","ttl":-1,"identifier_name":"native","id":"native::1441831-64768"}}
 {"op":"set","id":8}
 {"k":"filebeat::logs::native::1441831-64768","v":{"ttl":-1,"identifier_name":"native","id":"native::1441831-64768","source":"/tmp/","timestamp":[761513338,1742310052],"FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","offset":27,"type":"log"}}
 {"op":"set","id":9}
 {"k":"filebeat::logs::native::1441831-64768","v":{"source":"/tmp/","timestamp":[795028411,1742310356],"FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","offset":27,"ttl":-1,"type":"log","identifier_name":"native","id":"native::1441831-64768"}}

This is the second characteristic of Filebeat:

filebeat records the collected file offsets under the "/var/lib/filebeat" directory by default, so that the next collection continues from that position;
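One practical consequence (used again later in this article when the output is switched to es): if you want filebeat to collect a file from the very beginning again, stop filebeat and delete the registry data before restarting it.

 # With filebeat stopped, drop the recorded offsets so collection starts from offset 0 again
 rm -rf /var/lib/filebeat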

Filebeat writes to es

Write Filebeat's Output to es
 Check the configuration in the official documentation:
 The Elasticsearch output sends events directly to Elasticsearch using the Elasticsearch HTTP API.

 Example configuration:

 output.elasticsearch:
   hosts: ["https://myEShost:9200"]

 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from a file
 - type: log
   # Specify the path to the file
   paths:
     - /tmp/

 # Send the data to es
 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
 # Delete filebeat's registry data so the file is collected from the start again
 root@elk2:~# rm -rf /var/lib/filebeat

 # Start filebeat instance
 root@elk2:~# filebeat -e -c /etc/filebeat/config/

Data collected in kibana

View collected data

Set refresh frequency

Custom index

# We can define the index name ourselves. The official way is to set the index parameter:
 output.elasticsearch:
   hosts: ["http://localhost:9200"]
   index: "%{[fields.log_type]}-%{[agent.version]}-%{+yyyy.MM.dd}"
  
 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from a file
 - type: log
   # Specify the path to the file
   paths:
     - /tmp/

 # Send the data to es
 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
   # Custom index name
   index: "test_filebeat-%{+yyyy.MM.dd}"

 # Start filebeat; an error is reported at this point
 root@elk2:~# filebeat -e -c /etc/filebeat/config/
 2025-03-19T02:55:18.951Z INFO instance/beat.go:698 Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat] Hostfs Path: [/]
 2025-03-19T02:55:18.958Z INFO instance/beat.go:706 Beat ID: a109c2d1-fbb6-4b82-9416-29f9488ccabc
 # If we want to customize the index name, we must also set setup.template.name and setup.template.pattern
 2025-03-19T02:55:18.958Z ERROR instance/beat.go:1027 Exiting: setup.template.name and setup.template.pattern have to be set if index name is modified
 Exiting: setup.template.name and setup.template.pattern have to be set if index name is modified

 # The official website also gives a hint for setup.template.name and setup.template.pattern:
  If you change this setting, you also need to configure the setup.template.name and setup.template.pattern options (see Elasticsearch index template).

 # Official example

	 setup.template.name
	 The name of the template. The default is filebeat. The Filebeat version is always appended to the given name, so the final name is filebeat-%{[agent.version]}.

	 setup.template.pattern
	 The template pattern to apply to the default index settings. The default pattern is filebeat. The Filebeat version is always included in the pattern, so the final pattern is filebeat-%{[agent.version]}.

 Example:

 setup.template.name: "filebeat"
 setup.template.pattern: "filebeat"

 # You also need to set the defaults for shards and replicas, via setup.template.settings
 setup.template.settings:
   index.number_of_shards: 1
   index.number_of_replicas: 1

 # Configure our own index template (that is, the rules for creating the index)
 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from a file
 - type: log
   # Specify the path to the file
   paths:
     - /tmp/

 # Send the data to es
 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
   # Custom index name
   index: "test_filebeat-%{+yyyy.MM.dd}"

 # Define the name of the index template (that is, the rule for creating the index)
 setup.template.name: "test_filebeat"
 # Define the matching pattern of the index template, i.e. which indexes the template applies to
 setup.template.pattern: "test_filebeat-*"
 # Define the settings of the index template
 setup.template.settings:
   # Number of primary shards
   index.number_of_shards: 3
   # Number of replicas per shard
   index.number_of_replicas: 0


 # Start filebeat. It starts normally, but in kibana we find the index was not created according to our settings; check the startup log
 root@elk2:~# filebeat -e -c /etc/filebeat/config/
 # The messages below roughly mean that ILM is set to auto; when ILM is enabled, all custom index settings are ignored, so we need to set ILM to false
 2025-03-19T03:10:02.548Z INFO [index-management] idxmgmt/std.go:260 Auto ILM enable success.
 2025-03-19T03:10:02.558Z INFO [index-management.ilm] ilm/std.go:170 ILM policy filebeat exists already.
 2025-03-19T03:10:02.559Z INFO [index-management] idxmgmt/std.go:396 Set setup.template.name to '{filebeat-7.17.28 {now/d}-000001}' as ILM is enabled.

 # Check the official website for the index lifecycle management (ILM) configuration
 When index lifecycle management (ILM) is enabled, the default index is "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}-%{index_num}", for example, "filebeat-8.17.3-2025-03-17-000001". Custom index settings are ignored when ILM is enabled. If you’re sending events to a cluster that supports index lifecycle management, see Index lifecycle management (ILM) to learn how to change the index name.

 # ilm is in auto mode by default; it supports true, false, and auto
 Enables or disables index lifecycle management on any new indices created by Filebeat. Valid values are true, false, and auto. When auto (the default) is specified on version 7.0 and later
 setup.ilm.enabled: auto

 # Add the ilm setting (setup.ilm.enabled: false) to our own configuration file

 # Start filebeat
 root@elk2:~# filebeat -e -c /etc/filebeat/config/

The index template has been created

# At this point I want to modify my shards and replicas
 # Modify the configuration file directly and change it to 5 shards and 0 replicas

It is still 3 shards and 0 replicas

# This is because setup.template.overwrite defaults to false, which means the existing template is not overwritten

 A boolean that specifies whether to overwrite the existing template. The default is false. Do not enable this option if you start more than one instance of Filebeat at the same time. It can overload Elasticsearch by sending too many template update requests.


 # Set setup.template.overwrite to true
 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from a file
 - type: log
   # Specify the path to the file
   paths:
     - /tmp/

 # Send the data to es
 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
   # Custom index name
   index: "test_filebeat-%{+yyyy.MM.dd}"

 # Disable index lifecycle management (ILM)
 # If ILM is enabled, all custom index settings are ignored
 setup.ilm.enabled: false
 # Whether to overwrite the index template if it already exists; the default is false, set it to true explicitly when needed.
 # The official docs advise against enabling it when running more than one Filebeat instance, since too many template update requests can overload Elasticsearch.
 setup.template.overwrite: true
 # Define the name of the index template (that is, the rule for creating the index)
 setup.template.name: "test_filebeat"
 # Define the matching pattern of the index template, i.e. which indexes the template applies to
 setup.template.pattern: "test_filebeat-*"
 # Define the settings of the index template
 setup.template.settings:
   # Number of primary shards
   index.number_of_shards: 5
   # Number of replicas per shard
   index.number_of_replicas: 0

 # Start filebeat
 # Now the shards and replicas have been changed.

Filebeat collection nginx practice

1. Install nginx
 root@elk2:~# apt install -y nginx

 2. Start nginx
 root@elk2:~# systemctl start nginx
 root@elk2:~# netstat -tunlp | grep 80
 tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 17956/nginx: master
 tcp6 0 0 :::80 :::* LISTEN 17956/nginx: master

 3. Test access
 root@elk2:~# curl 127.1
 #Log Location
 root@elk2:~# ll /var/log/nginx/access.log
 -rw-r----- 1 www-data adm 86 Mar 19 06:58 /var/log/nginx/access.log
 root@elk2:~# cat /var/log/nginx/access.log
 127.0.0.1 - - [19/Mar/2025:06:58:31 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0"


 4. Write the Filebeat instance configuration
 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from a file
 - type: log
   # Specify the path to the file
   paths:
     - /var/log/nginx/*

 # Send the data to es
 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
   # Custom index name
   index: "test_filebeat-%{+yyyy.MM.dd}"

 # Disable index lifecycle management (ILM)
 # If ILM is enabled, all custom index settings are ignored
 setup.ilm.enabled: false
 # Whether to overwrite the index template if it already exists; the default is false, set it to true explicitly when needed.
 # The official docs advise against enabling it when running more than one Filebeat instance, since too many template update requests can overload Elasticsearch.
 setup.template.overwrite: true
 # Define the name of the index template (that is, the rule for creating the index)
 setup.template.name: "test_filebeat"
 # Define the matching pattern of the index template, i.e. which indexes the template applies to
 setup.template.pattern: "test_filebeat-*"
 # Define the settings of the index template
 setup.template.settings:
   # Number of primary shards
   index.number_of_shards: 5
   # Number of replicas per shard
   index.number_of_replicas: 0


 5. Start filebeat
 root@elk2:~# filebeat -e -c /etc/filebeat/config/

Filebeat analysis nginx log

filebeat modules

# filebeat supports many modules
 # Official explanation of filebeat modules: they simplify parsing the log formats that filebeat collects
 # Filebeat modules simplify the collection, parsing, and visualization of common log formats.
 # By default these modules are disabled; we need to enable the ones we want.
 root@elk2:~# ls -l /etc/filebeat/modules.d/
 total 300
 -rw-r--r-- 1 root root 484 Feb 13 16:58
 -rw-r--r-- 1 root root 476 Feb 13 16:58
 -rw-r--r-- 1 root root 281 Feb 13 16:58
 -rw-r--r-- 1 root root 2112 Feb 13 16:58
 .  .  .
 root@elk2:~# ls -l /etc/filebeat/modules.d/ | wc -l
 72

 # Check which modules are enabled and disabled
 root@elk2:~# filebeat modules list


 # Start the module
 root@elk2:~# filebeat modules enable apache nginx mysql redis
 Enabled apache
 Enabled nginx
 Enabled mysql
 Enabled redis

 # Stop the module
 root@elk2:~# filebeat modules disable apache mysql redis
 Disabled apache
 Disabled mysql
 Disabled redis

Configure filebeat monitoring nginx

# The module functionality is configured in the filebeat configuration file. The configuration method is shown in the /etc/filebeat/filebeat.yml file
 filebeat.config.modules:
   # Glob pattern for configuration loading
   path: ${path.config}/modules.d/*.yml

   # Set to true to enable config reloading
   reload.enabled: false

   # Period on which files under path should be checked for changes
   #reload.period: 10s

 # Write the filebeat instance configuration (module_nginx)
 root@elk2:~# cat /etc/filebeat/config/
 # Configure modules
 filebeat.config.modules:
   # Glob pattern for configuration loading: specify the path to load
   path: ${path.config}/modules.d/*.yml

   # Set to true to enable config reloading, i.e. automatically reload the yml files under /etc/filebeat/modules.d/
   reload.enabled: true

   # Period on which files under path should be checked for changes
   #reload.period: 10s

 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
   index: "module_nginx-%{+yyyy.MM.dd}"

 setup.ilm.enabled: false
 setup.template.overwrite: true
 setup.template.name: "module_nginx"
 setup.template.pattern: "module_nginx-*"
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0


 # Prepare nginx log test cases
 root@elk2:~# cat /var/log/nginx/access.log
 192.168.121.1 - - [19/Mar/2025:16:42:23 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 CrKey/1.54.248666"
 1.168.121.1 - - [19/Mar/2025:16:42:26 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 CrKey/1.54.248666"
 92.168.121.1 - - [19/Mar/2025:16:42:29 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1"
 192.168.11.1 - - [19/Mar/2025:16:42:31 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 13; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36"
 192.168.121.1 - - [19/Mar/2025:16:42:40 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15"



 # Start filebeat instance
 root@elk2:~# filebeat -e -c /etc/filebeat/config/





 # The collected results include both access logs and error logs. If we only want to collect the access logs, we need to adjust the nginx module configuration file /etc/filebeat/modules.d/nginx.yml
 - module: nginx
   # Access logs
   access:
     enabled: true

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
     : ["/var/log/nginx/"]

   # Error logs
   error:
     enabled: false # Change from true to false

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
      #var.paths:

   # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
   ingress_controller:
     enabled: false

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
      #var.paths:

Kibana analysis PV

PV: page view, i.e. page visits
 1 request counts as 1 PV
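Besides the Kibana visualization, the PV can also be estimated quickly from the command line, since every collected access-log line becomes one document (a sketch, assuming the module_nginx-* index name used above):

 # The document count of the index approximates the PV
 curl '192.168.121.21:9200/module_nginx-*/_count?pretty'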

Kibana Analytics IP

Kibana analyzes bandwidth

Kibana makes Dashboard

Kibana analyzes devices

Kibana analyzes the operating system proportion

Kibana analyzes the proportion of global users

filebeat collection tomcat logs

Deploy tomcat

[root@elk2 ~]# wget https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.5/bin/apache-tomcat-11.0.5.tar.gz
 [root@elk2 ~]# tar xf apache-tomcat-11.0.5.tar.gz -C /usr/local

 # Configure environment variables
 # es itself ships with a jdk. We add es's jdk to the environment variables and let tomcat use it
 # es' jdk environment directory
 [root@elk2 ~]# ll /usr/share/elasticsearch/jdk/
 # Add environment variables
 [root@elk2 ~]# vim /etc//
 [root@elk2 ~]# source /etc//
 [root@elk2 ~]# cat /etc//
 #!/bin/bash
 export JAVA_HOME=/usr/share/elasticsearch/jdk
 export TOMCAT_HOME=/usr/local/apache-tomcat-11.0.5
 export PATH=$PATH:$JAVA_HOME/bin:$TOMCAT_HOME/bin
 [root@elk3 ~]# java -version
 openjdk version "22.0.2" 2024-07-16
 OpenJDK Runtime Environment (build 22.0.2+9-70)
 OpenJDK 64-Bit Server VM (build 22.0.2+9-70, mixed mode, sharing)

 # Because tomcat's default log format contains very little information, we modify the tomcat configuration file and change the log format.
 [root@elk3 ~]# vim /usr/local/apache-tomcat-11.0.5/conf/server.xml
 ...
           <Host name="" appBase="webapps"
                 unpackWARs="true" autoDeploy="true">

		 <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
             prefix=".com_access_log" suffix=".json"
             pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;request&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;http_user_agent&quot;:&quot;%{User-Agent}i&quot;}"/>

           </Host>



 # Start tomcat
 [root@elk2 ~]# catalina.sh start
 [root@elk2 ~]# netstat -tunlp | grep 8080
 tcp6 0 0 :::8080 :::* LISTEN 98628/java


 # Access Test
 [root@elk2 ~]# cat /etc/hosts
 127.0.0.1 localhost

 # The following lines are desirable for IPv6 capable hosts
 ::1 ip6-localhost ip6-loopback
 fe00::0 ip6-localnet
 ff00::0 ip6-mcastprefix
 ff02::1 ip6-allnodes
 ff02::2 ip6-allrouters


 192.168.121.92

 [root@elk2 ~]# cat /usr/local/apache-tomcat-11.0.5/logs/.com_access_log.
 {"clientip":"192.168.121.92","ClientUser":"-","authenticated":"-","AccessTime":"[23/Mar/2025:20:55:41 +0800]","request":"GET / HTTP/1.1","status":"200","SendBytes":"11235","Query?string":"","partner":"-","http_user_agent":"curl/7.81.0"}

Configure filebeat monitoring tomcat

# Start the tomcat module
 [root@elk3 ~]# filebeat modules enable tomcat
 Enabled tomcat
 [root@elk3 ~]# ll /etc/filebeat/modules.d/tomcat.yml
 -rw-r--r-- 1 root root 623 Feb 14 00:58 /etc/filebeat/modules.d/tomcat.yml



 # Configure the tomcat module
 [root@elk3 ~]# cat /etc/filebeat/modules.d/tomcat.yml
 # Module: tomcat
 # Docs: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-module-tomcat.html

 - module: tomcat
   log:
     enabled: true

     # Set which input to use between udp (default), tcp or file.
     #var.input: udp
     var.input: file
     # var.syslog_host:
     # var.syslog_port: 8080

     # Set paths for the log files when file input is used.
     #var.paths:
     # - /var/log/tomcat/*.log
     var.paths:
       - /usr/local/apache-tomcat-11.0.5/logs/.com_access_log.

     # Toggle output of non-ECS fields (default true).
     # var.rsa_fields: true

     # Set custom timezone offset.
     # "local" (default) for system timezone.
     # "+02:00" for GMT+02:00
     # var.tz_offset: local

 # Configure filebeat
 [root@elk3 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: test-modules-tomcat-%{+yyyy.MM.dd}
 setup.ilm.enabled: false
 setup.template.name: "test-modules-tomcat"
 setup.template.pattern: "test-modules-tomcat-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0


 # Start filebeat
 [root@elk3 ~]# filebeat -e -c /etc/filebeat/config/

filebeat processors

filebeat processor

/guide/en/beats/filebeat/7.17/

# When we collect tomcat logs with filebeat, the tomcat logs are in json format under our settings. To extract the individual fields from that json, we need to configure filebeat processors.
 # The decode_json_fields processor can parse the json format
 # Official configuration
 processors:
   - decode_json_fields:
       fields: ["field1", "field2", ...]
       process_array: false
       max_depth: 1
       target: ""
       overwrite_keys: false
       add_error_key: true
 # fields: Specifies which fields to parse as json
 # process_array: A boolean that specifies whether to process arrays. The default is false. Optional.
 # max_depth: The maximum parsing depth; the default is 1. A value of 1 decodes the JSON objects in the fields listed under fields; a value of 2 also decodes the objects embedded in the fields of those parsed documents. Optional.
 # target: The field to which the decoded JSON will be written. By default, the decoded JSON object replaces the string field it was read from. To merge the decoded JSON fields into the root of the event, specify target as an empty string (target: ""). Note that a null value (target:) is treated as unset. Optional.
 # overwrite_keys: A boolean that specifies whether keys already present in the event are overwritten by keys from the decoded JSON object. The default is false. Optional.
 # add_error_key: If set to true and an error occurs while decoding JSON keys, an error field becomes part of the event with the error message. If set to false, no error is added to the event. The default is false. Optional.

 # Write filebeat configuration
 [root@elk3 ~]# cat /etc/filebeat/config/,
 cat: /etc/filebeat/config/,: No such file or directory
 [root@elk3 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true
 processors:
   - decode_json_fields:
       fields: ["message"]
       process_array: false
       max_depth: 1
       target: ""
       overwrite_keys: false
       add_error_key: true
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: test-modules-tomcat-%{+yyyy.MM.dd}
 setup.ilm.enabled: false
 setup.template.name: "test-modules-tomcat"
 setup.template.pattern: "test-modules-tomcat-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0

 # Start filebeat
 [root@elk3 ~]# filebeat -e -c /etc/filebeat/config/



 # Delete a field through filebeat
 processors:
   - drop_fields:
       when:
         condition
       fields: ["field1", "field2", ...]
       ignore_missing: false
      
 The supported conditions are:

 equals
 contains
 regexp
 range
 network
 has_fields
 or
 and
 not

 # Drop the specified field when the status value is 404
 [root@elk3 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true
 processors:
   - decode_json_fields:
       fields: ["message"]
       process_array: false
       max_depth: 1
       target: ""
       overwrite_keys: false
       add_error_key: true
   - drop_fields:
       when:
         equals:
           status: "404"
       fields: [""]
       ignore_missing: false
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: test-modules-tomcat-%{+yyyy.MM.dd}
 setup.ilm.enabled: false
 setup.template.name: "test-modules-tomcat"
 setup.template.pattern: "test-modules-tomcat-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0
 [root@elk3 ~]# filebeat -e -c /etc/filebeat/config/

filebeat collection es cluster log

# Start the module
 [root@elk1 ~]# filebeat modules enable elasticsearch
 Enabled elasticsearch

 [root@elk1 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true

 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: es-log-modules-eslog-%{+yyyy.MM.dd}
 setup.ilm.enabled: false
 setup.template.name: "es-log"
 setup.template.pattern: "es-log-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0

filebeat collects mysql logs

# Deploy mysql
 [root@elk1 ~]# wget https://dev.mysql.com/get/Downloads/MySQL-8.4/mysql-8.4.4-linux-glibc2.28-x86_64.tar.xz
 [root@elk1 ~]# tar xf mysql-8.4.4-linux-glibc2.28-x86_64.tar.xz -C /usr/local/


 # Prepare the startup script and authorize it
 [root@elk1 ~]# cp /usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/support-files/mysql.server /etc/init.d/mysql.server
 [root@elk1 ~]# vim /etc/init.d/mysql.server
 [root@elk1 ~]# grep -E "^(basedir=|datadir=)" /etc/init.d/mysql.server
 basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
 datadir=/var/lib/mysql
 [root@elk1 ~]# useradd -m mysql
 [root@elk1 ~]# install -d /var/lib/mysql -o mysql -g mysql
 [root@elk1 ~]# ll -d /var/lib/mysql
 drwxr-xr-x 2 mysql mysql 4096 Mar 25 17:05 /var/lib/mysql/

 # Prepare the configuration file
 [root@elk1 ~]# vim /etc/my.cnf
 [root@elk1 ~]# cat /etc/my.cnf
 [mysqld]
 basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
 datadir=/var/lib/mysql
 socket=/tmp/mysql.sock
 port=3306

 [client]
 socket=/tmp/mysql.sock

 # Start the database
 [root@elk1 ~]# vim /etc//
 [root@elk1 ~]# cat /etc//
 #!/bin/bash
 export MYSQL_HOME=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
 export PATH=$PATH:$MYSQL_HOME/bin
 [root@elk1 ~]# source /etc//
 [root@elk1 ~]# mysqld --initialize-insecure --user=mysql --datadir=/var/lib/mysql --basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64
 2025-03-25T09:08:36.829914Z 0 [System] [MY-015017] [Server] MySQL Server Initialization - start.
 2025-03-25T09:08:36.842773Z 0 [System] [MY-013169] [Server] /usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/bin/mysqld (mysqld 8.4.4) initializing of server in progress as process 7905
 2025-03-25T09:08:36.918780Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
 2025-03-25T09:08:37.818933Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
 2025-03-25T09:08:42.504501Z 6 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
 2025-03-25T09:08:46.909940Z 0 [System] [MY-015018] [Server] MySQL Server Initialization - end.
 [root@elk1 ~]# /etc/init.d/mysql.server start
 Starting (via systemctl): .
 [root@elk1 ~]# netstat -tunlp | grep 3306
 tcp6 0 0 :::3306 :::* LISTEN 8141/mysqld
 tcp6 0 0 :::33060 :::* LISTEN 8141/mysqld



 # Enable filebeat module
 [root@elk1 ~]# filebeat modules enable mysql
 Enabled mysql

 # Configure filebeat
 [root@elk1 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true

 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: es-modules-mysql-%{+yyyy.MM.dd}
 setup.ilm.enabled: false
 setup.template.name: "es-modules-mysql"
 setup.template.pattern: "es-modules-mysql-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0
 # Configure the mysql module
 [root@elk1 ~]# cat /etc/filebeat/modules.d/mysql.yml
 # Module: mysql
 # Docs: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-module-mysql.html

 - module: mysql
   # Error logs
   error:
     enabled: true

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
     #var.paths:
     var.paths: ["/var/lib/mysql/"]

   # Slow logs
   slowlog:
     enabled: true

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
     #:

 # Start filebeat instance
 [root@elk1 ~]# filebeat -e -c /etc/filebeat/config/

filebeat collection redis

# Install redis
 [root@elk1 ~]# apt install -y redis

 # redis log file location
 [root@elk1 ~]# cat /var/log/redis/redis-server.log
 8618:C 25 Mar 2025 17:18:37.442 # WARNING supervised by systemd - you MUST set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
 8618:C 25 Mar 2025 17:18:37.442 # oO0OoO0OoO0OoOo Redis is starting oO0OoO0Oo0Oo0Oo
 8618:C 25 Mar 2025 17:18:37.442 # Redis version=6.0.16, bits=64, commit=00000000, modified=0, pid=8618, just started
 8618:C 25 Mar 2025 17:18:37.442 # Configuration loaded
                 _._
            _.-``__ ''-._
       _.-`` `. `_. ''-._ Redis 6.0.16 (00000000/0) 64 bit
   .-`` .-````. ```\/ _.,_ ''-.
  ( ' , .-` | `, ) Running in standalone mode
  |`-._`-...-` __...-.``-._|'` _.-'| Port: 6379
  | `-._ `._ / _.-' | PID: 8618
   `-._ `-._ `-./ _.-' _.-'
  |`-._`-._ `-.__.-' _.-'_.-'|
  | `-._`-._ _.-'_.-' |
   `-._ `-._`-.__.-'_.-' _.-'
  |`-._`-._ `-.__.-' _.-'_.-'|
  | `-._`-._ _.-'_.-' |
   `-._ `-._`-.__.-'_.-' _.-'
       `-._ `-.__.-' _.-'
           `-._ _.-'
               `-.__.-'

 8618:M 25 Mar 2025 17:18:37.446 # Server initialized
 8618:M 25 Mar 2025 17:18:37.446 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
 8618:M 25 Mar 2025 17:18:37.447 * Ready to accept connections

 # Start redis modules
 [root@elk1 ~]# filebeat modules enable redis
 Enabled redis

 [root@elk1 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true

 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: es-modules-redis-%{+yyyy.MM.dd}
 setup.ilm.enabled: false
 setup.template.name: "es-modules-redis"
 setup.template.pattern: "es-modules-redis-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0

 # Start filebeat instance
 [root@elk1 ~]# filebeat -e -c /etc/filebeat/config/

Filebeat multi-line merging issue

# Manage multiline messages
# 
parsers:
- multiline:
    type: pattern
    pattern: '^\['
    negate: true
    match: after
    

	multiline.type
	Defines which aggregation method to use. The default is pattern. The other option is count which lets you aggregate constant number of lines.

	multiline.pattern
	Specifies the regular expression pattern to match. Note that the regexp patterns supported by Filebeat differ somewhat from the patterns supported by Logstash. See Regular expression support for a list of supported regexp patterns. Depending on how you configure other multiline options, lines that match the specified regular expression are considered either continuations of a previous line or the start of a new multiline event. You can set the negate option to negate the pattern.

	multiline.negate
	Defines whether the pattern is negated. The default is false.

	multiline.match
	Specifies how Filebeat combines matching lines into an event. The settings are after or before. The behavior of these settings depends on what you specify for negate:

Manage multiline redis log messages

# Optimize the redis log collection rules through multiline message handling
 # type: filestream is the replacement for the older log input type
 [root@elk1 ~]# cat /etc/filebeat/config/
 filebeat.inputs:
 - type: filestream
   paths:
     - /var/log/redis/*
   # Configure the parsers
   parsers:
     # Define multi-line matching
   - multiline:
       # Specify the matching type
       type: pattern
       # Define the matching pattern
       pattern: '^\d'
       # Reference official website: /guide/en/beats/filebeat/current/
       negate: true
       match: after
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: es-modules-redis-%{+yyyy.MM.dd}
 setup.ilm.enabled: false
 setup.template.name: "es-modules-redis"
 setup.template.pattern: "es-modules-redis-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0
 [root@elk1 ~]# filebeat -e -c /etc/filebeat/config/
 # The ASCII-art banner lines in the redis log are no longer collected as separate events; they are merged with the preceding line

Manage multiline tomcat error log messages

# tomcat error log path : /usr/local/apache-tomcat-11.0.5/logs/catalina.*
[root@elk2 ~]# cat /etc/filebeat/config/
filebeat.inputs:
- type: filestream
  paths:
    - /usr/local/apache-tomcat-11.0.5/logs/catalina*
  parsers:
  - multiline:
      type: pattern
      pattern: '^\d'
      negate: true
      match: after
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat-elk2-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat-elk2"
setup.template.pattern: "test-modules-tomcat-elk2*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

filebeat multiple instances

1. Start instance 1
 filebeat -e -c /etc/filebeat/config/ --path.data /tmp/xixi


 2. Start instance 2
 filebeat -e -c /etc/filebeat/config/ --path.data /tmp/haha

 # Collect /var/log/syslog and /var/log/auth.log
 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from a file
 - type: log
   # Specify the path to the file
   paths:
     - /var/log/syslog

 # Send the data to es
 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
   # Custom index name
   index: "test_syslog-%{+yyyy.MM.dd}"

 # Disable index lifecycle management (ILM)
 # If ILM is enabled, all custom index settings are ignored
 setup.ilm.enabled: false
 # Whether to overwrite the index template if it already exists; the default is false, set it to true explicitly when needed.
 # The official docs advise against enabling it when running more than one Filebeat instance, since too many template update requests can overload Elasticsearch.
 setup.template.overwrite: true
 # Define the name of the index template (that is, the rule for creating the index)
 setup.template.name: "test_syslog"
 # Define the matching pattern of the index template, i.e. which indexes the template applies to
 setup.template.pattern: "test_syslog-*"
 # Define the settings of the index template
 setup.template.settings:
   # Number of primary shards
   index.number_of_shards: 5
   # Number of replicas per shard
   index.number_of_replicas: 0
 root@elk2:~# cat /etc/filebeat/config/
 # Define where the data comes from
 filebeat.inputs:
   # The type of the data source is log, which means reading data from the file
 - type: log
   # Specify the path to the file
   paths:
     - /var/log/

 # Define data to terminal
 output.elasticsearch:
   hosts:
     - 192.168.121.21:9200
     - 192.168.121.22:9200
     - 192.168.121.23:9200
   # Custom index name
   index: "test_auth-%{+}"

 # Disable index lifecycle management (ILM)
 # If ILM is enabled, all the custom index settings below are ignored
 setup.ilm.enabled: false
 # Whether to overwrite an existing index template; the default is false, set it to true explicitly if needed.
 # However, it is recommended to keep it false, since re-pushing the template on every start creates extra connections and consumes resources.
 setup.template.overwrite: true
 # Define the name of the index template (that is, the rule for creating the index)
 setup.template.name: "test_auth"
 # Define the matching pattern of the index template, i.e. which indexes the template applies to
 setup.template.pattern: "test_auth-*"
 # Define the settings for the index template
 setup.template.settings:
   # Number of shards
   index.number_of_shards: 5
   # How many copies are there for each shard
   index.number_of_replicas: 0


 # Start filebeat as two separate instances
 root@elk2:~# filebeat -e -c /etc/filebeat/config/ --path.data /tmp/xixi
 root@elk2:~# filebeat -e -c /etc/filebeat/config/ --path.data /tmp/haha
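
 # Once both instances are running, a quick sanity check (a sketch, assuming the ES addresses used above) is to list the indices and confirm that both custom indices are being written:
 root@elk2:~# curl -s 192.168.121.21:9200/_cat/indices | grep test_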

EFK analysis web cluster

Deploy web clusters

1. Deploy the tomcat server
 # 192.168.121.92 192.168.121.93 Deploy tomcat
 Refer to the earlier section on collecting tomcat logs with filebeat for the tomcat deployment steps


 2. Deploy nginx
 # 192.168.121.91 Deploy nginx
 [root@elk1 ~]# apt install -y nginx
 [root@elk1 ~]# vim /etc/nginx/
 ...
 upstream es-web{
     server 192.168.121.92:8080;
     server 192.168.121.93:8080;
 }
 server {
     server_name ;
     location / {
         proxy_pass http://es-web;
     }
 }
 ...
 [root@elk1 ~]# nginx -t
 [root@elk1 ~]# systemctl restart nginx
 # Access Test
 [root@elk1 ~]# curl

Collect web cluster logs

#91 Load nginx module
 # 92 93 Loading the tomcat module
 [root@elk1 ~]# filebeat modules enable nginx
 Enabled nginx
 [root@elk2 ~]# filebeat modules enable tomcat
 Enabled tomcat
 [root@elk3 ~]# filebeat modules enable tomcat
 Enabled tomcat
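
 # To confirm which modules ended up enabled on each node, filebeat can list them (output trimmed):
 [root@elk1 ~]# filebeat modules list
 Enabled:
 nginx
 Disabled:
 ...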


 1. Configure nginx module functions
 [root@elk1 ~]# cat /etc/filebeat//
 # Module: nginx
 # Docs: /guide/en/beats/filebeat/7.17/

 - module: nginx
   # Access logs
   access:
     enabled: true

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
     : /var/log/nginx/

   # Error logs
   error:
     enabled: false

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
     #:

   # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
   ingress_controller:
     enabled: false

     # Set custom paths for the log files. If left empty,
     # Filebeat will choose the paths depending on your OS.
     #:
 2. Configure the tomcat module function
 [root@elk2 ~]# cat /etc/filebeat//
 # Module: tomcat
 # Docs: /guide/en/beats/filebeat/7.17/

 - module: tomcat
   log:
     enabled: true

     # Set which input to use between udp (default), tcp or file.
     var.input: file
     # var.syslog_host: localhost
     # var.syslog_port: 9501

     # Set paths for the log files when file input is used.
     var.paths:
       - /usr/local/apache-tomcat-11.0.5/logs/*.json

     # Toggle output of non-ECS fields (default true).
     # var.rsa_fields: true

     # Set custom timezone offset.
     # "local" (default) for system timezone.
     # "+02:00" for GMT+02:00
     # var.tz_offset: local
 3. Configure the 91filebeat configuration file
 [root@elk1 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   # Glob pattern for configuration loading
   path: ${path.config}/modules.d/*.yml

   # Set to true to enable config reloading
   reload.enabled: true
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: es-web-nginx-%{+}
 setup.ilm.enabled: false
 setup.template.name: "es-web-nginx"
 setup.template.pattern: "es-web-nginx-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0
 4.92 Configuring monitoring tomcat configuration file
 [root@elk2 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true
 processors:
   - decode_json_fields:
       fields: [""]
       process_array: false
       max_depth: 1
       target: ""
       overwrite_keys: false
       add_error_key: true
   - drop_fields:
        when:
         equals:
           status: "404"
       fields: [""]
       ignore_missing: false
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: test-modules-tomcat91-%{+}
 setup.ilm.enabled: false
 setup.template.name: "test-modules-tomcat91"
 setup.template.pattern: "test-modules-tomcat91-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0
 5.93 Configuration monitoring tomcat configuration file
 [root@elk3 ~]# cat /etc/filebeat/config/
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true
 processors:
   - decode_json_fields:
       fields: [""]
       process_array: false
       max_depth: 1
       target: ""
       overwrite_keys: false
       add_error_key: true
   - drop_fields:
        when:
         equals:
           status: "404"
       fields: [""]
       ignore_missing: false
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: test-modules-tomcat93-%{+}
 setup.ilm.enabled: false
 setup.template.name: "test-modules-tomcat93"
 setup.template.pattern: "test-modules-tomcat93-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0

 # Start filebeat

Change field type

We want to sum the bandwidth, but at this point the field cannot be aggregated



 The value is stored as a string type



 # Modify field types through processors
 # Official website configuration
 # Supported types
 # The supported types include: integer, long, float, double, string, boolean, and ip.

 processors:
   - convert:
       fields:
         - {from: "src_ip", to: "", type: "ip"}
         - {from: "src_port", to: "", type: "integer"}
       ignore_missing: true
       fail_on_error: false

 # Configure filebeat configuration file
 filebeat.config.modules:
   path: ${path.config}/modules.d/*.yml
   reload.enabled: true
 processors:
   - decode_json_fields:
       fields: [""]
       process_array: false
       max_depth: 1
       target: ""
       overwrite_keys: false
       add_error_key: true
   - convert:
       fields:
         - {from: "SendBytes", type: "long"}
   - drop_fields:
        when:
         equals:
           status: "404"
       fields: [""]
       ignore_missing: false
 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   index: test-modules-tomcat91-%{+}
 setup.ilm.enabled: false
 setup.template.name: "test-modules-tomcat91"
 setup.template.pattern: "test-modules-tomcat91-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 5
   index.number_of_replicas: 0
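
 # After restarting filebeat, the new index should map SendBytes as a numeric type. A quick check (a sketch, assuming the index name above and that the cluster is reachable without auth at this point):
 [root@elk2 ~]# curl -s '192.168.121.91:9200/test-modules-tomcat91-*/_mapping/field/SendBytes'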

Ansible deploys EFK cluster

[root@ansible efk]# cat set_es.sh
#!/bin/bash
ansible-playbook 
ansible-playbook 
ansible-playbook 
ansible-playbook 
ansible-playbook 

[root@ansible efk]# bash set_es.sh

[root@ansible efk]# cat  
---
- name: Install es cluster
  hosts: all
  tasks:
    - name: get es deb package
      get_url:
        url: /downloads/elasticsearch/elasticsearch-7.17.
        dest: /root/
    - name: Install es
      shell: 
        cmd: dpkg -i /root/elasticsearch-7.17. | cat
    - name: Configure es
      copy:
        src: conf/
        dest: /etc/elasticsearch/
    - name: start es
      systemd:
        name: elasticsearch
        state: started
        enabled: yes


[root@ansible efk]# cat  
---
- name: Install kibana
  hosts: elk1
  tasks:
    - name: Get kibana deb package
      get_url:
        url: /downloads/kibana/kibana-7.17.
        dest: /root
    - name: Install kibana
      shell:
        cmd: dpkg -i kibana-7.17. | cat
    - name: Config kibana
      copy:
        src: conf/
        dest: /etc/kibana/
    - name: Start kibana
      systemd: 
        name: kibana
        state: started
        enabled: yes
[root@ansible efk]# cat  
---
- name: Install filebeat
  hosts: elk
  tasks:
    - name: Get filebeat code
      get_url:
        url: /downloads/beats/filebeat/filebeat-7.17.
        dest: /root
    - name: Install filebeat
      shell:
        cmd: dpkg -i filebeat-7.17. | cat
    - name: Configure filebeat
      file:
        path: /etc/filebeat/config
        state: directory

[root@ansible efk]# cat  
---
- name: Set nginx
  hosts: elk1
  tasks:
    - name: Install nginx
      shell:
        cmd: apt install -y nginx | cat
    - name: config nginx
      copy:
        src: conf/
        dest: /etc/nginx/
    - name: start nginx
      systemd:
        name: nginx
        state: started
        enabled: yes
    - name: Configure hosts
      copy:
        content: 192.168.121.91 
        dest: /etc/hosts
- name: Set tomcat
  hosts: elk2,elk3
  tasks:
    - name: Get tomcat code
      get_url:
        url: /tomcat/tomcat-11/v11.0.5/bin/apache-tomcat-11.0.
        dest: /root/
    - name: unarchive tomcat code
      unarchive:
        src: /root/apache-tomcat-11.0.
        dest: /usr/local
        remote_src: yes
    - name: Configure jdk PATH
      copy:
        src: conf/
        dest: /etc//
    - name: reload profile 
      shell:
        cmd: source /etc// | cat
    - name: Configure tomcat
      copy:
        src: conf/
        dest: /usr/local/apache-tomcat-11.0.5/conf/
    - name: start tomcat
      shell:
        cmd:   start |cat
[root@ansible efk]# cat  
---
- name: configure filebeat
  hosts: elk1
  tasks:
    - name: enable nginx modules
      shell:
        cmd: filebeat modules enable nginx | cat
    - name: configure nginx modules
      copy:
        src: conf/
        dest: /etc/filebeat//
    - name: configure filebeat 
      copy:
        src: conf/
        dest: /etc/filebeat/config/

- name: configure filebeat
  hosts: elk2,elk3
  tasks:
    - name: enable tomcat modules
      shell:
        cmd: filebeat modules enable tomcat | cat
    - name: configure tomcat modules
      copy:
        src: conf/
        dest: /etc/filebeat//
    - name: configure filebeat
      template:
        src: conf/.j2
        dest: /etc/filebeat/config/

logstash

Install and configure logstash

1. Deploy logstash
 [root@elk3 ~]# wget /downloads/logstash/logstash-7.17.
 [root@elk3 ~]# dpkg -i logstash-7.17.

 2. Create a symbolic link and add the Logstash command to the PATH environment variable
 [root@elk3 ~]# ln -svf /usr/share/logstash/bin/logstash /usr/local/bin/
 '/usr/local/bin/logstash' -> '/usr/share/logstash/bin/logstash'

 3. Start the instance based on the command line and use the -e option to specify configuration information (not recommended)
 [root@elk3 ~]# logstash -e "input { stdin { type => stdin } } output { stdout { codec => rubydebug } }" --log.level warn
 ...
 The stdin plugin is now waiting for input:
 111111111111111111111111111111111111111
 {
     "@timestamp" => 2025-03-13T06:51:32.821Z,
           "type" => "stdin",
        "message" => "111111111111111111111111111111111111111",
           "host" => "elk93",
       "@version" => "1"
 }

 4. Start Logstash based on configuration file
 [root@elk3 ~]# vim /etc/logstash//
 [root@elk3 ~]# cat /etc/logstash//
 input {
   stdin {
     type => stdin
   }
 }


 output {
   stdout {
     codec => rubydebug
   }
 }

 [root@elk3 ~]# logstash -f /etc/logstash//
 ...
 333333333333333333333333333333
 {
           "type" => "stdin",
        "message" => "333333333333333333333333333333333333333333333333333333333,
           "host" => "elk93",
     "@timestamp" => 2025-03-13T06:54:20.223Z,
       "@version" => "1"
 }

 # /guide/en/logstash/7.17/
 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/tmp/"
   }
 }


 output {
   stdout {
     codec => rubydebug
   }
 }
 [WARN ] 2025-03-26 09:40:52.788 [[main]<file] plain - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
 {
           "path" => "/tmp/",
       "@version" => "1",
     "@timestamp" => 2025-03-26T01:40:52.879Z,
        "message" => "aaaddd",
           "host" => "elk3"
 }

Logstash collection of text log strategies

Logstash's collection strategy is similar to filebeat's
	 1. Lines are delimited by the newline character, so collection happens line by line
	 2. Logstash also keeps an offset (a sincedb file), similar to filebeat's registry

 [root@elk3 ~]# ll /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
 -rw-r--r-- 1 root root 53 Mar 26 09:57 /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
 [root@elk3 ~]# cat /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
 408794 0 64768 12 1742955373.9715059 /tmp/ # 12 is the offset
 [root@elk3 ~]# cat /tmp/
 ABC
 2025def
 [root@elk3 ~]# ll -i /tmp/
 408794 -rw-r--r-- 1 root root 12 Mar 26 09:45 /tmp/

 # You can edit the offset directly so that collection resumes from a specific position. Here we set the offset to 8 and check what gets collected
 {
     "@timestamp" => 2025-03-26T02:20:50.776Z,
           "host" => "elk3",
        "message" => "def",
           "path" => "/tmp/",
       "@version" => "1"
 }
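
 # For reference, the offset edit mentioned above can be made directly in the sincedb file (a sketch, assuming the single entry shown earlier; stop logstash before editing):
 [root@elk3 ~]# sed -i 's/ 12 / 8 /' /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd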

start_position

If we delete filebeat's registry JSON file, filebeat re-reads the file from the beginning on the next run. Logstash behaves differently.

 [root@elk3 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
 [root@elk3 ~]# logstash -f /etc/logstash//
 # Even after deleting the sincedb, logstash by default only collects newly appended data
 [root@elk3 ~]# echo 123 >> /tmp/
 [root@elk3 ~]# cat /tmp/
 ABC
 2025def
 123
 {
       "@version" => "1",
     "@timestamp" => 2025-03-26T02:26:17.008Z,
        "message" => "123",
           "host" => "elk3",
           "path" => "/tmp/"
 }

 // This is where the start_position parameter comes in
 start_position
 Value can be any of: beginning, end
 Default value is "end"

 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/tmp/"
     start_position => "beginning"
   }
 }


 output {
   stdout {
     codec => rubydebug
   }
 }


 [root@elk3 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
 [root@elk3 ~]# logstash -f /etc/logstash//
 {
       "@version" => "1",
           "host" => "elk3",
        "message" => "2025def",
           "path" => "/tmp/",
     "@timestamp" => 2025-03-26T02:31:50.020Z
 }
 {
       "@version" => "1",
           "host" => "elk3",
        "message" => "ABC",
           "path" => "/tmp/",
     "@timestamp" => 2025-03-26T02:31:49.813Z
 }
 {
       "@version" => "1",
           "host" => "elk3",
        "message" => "123",
           "path" => "/tmp/",
     "@timestamp" => 2025-03-26T02:31:50.037Z
 }

filter plugins

# logstash output has many fields. If there are some I don't want, you can use filter plugins for filtering
 # Remove @version field
 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/tmp/"
     start_position => "beginning"
   }
 }
 filter {
   mutate {
     remove_field => [ "@version" ]
   }
 }

 output {
   stdout {
     codec => rubydebug
   }
 }

 # Start logstash with the -r option to enable automatic config reloading
 [root@elk3 ~]# logstash -r -f /etc/logstash//
 {
     "@timestamp" => 2025-03-26T03:01:02.078Z,
           "host" => "elk3",
        "message" => "111",
           "path" => "/tmp/"
 }

logstash architecture

logstash multiple instances

Start Example 1:
 [root@elk93 ~]# logstash -f /etc/logstash//


 Start Example 2:
 [root@elk93 ~]# logstash -rf /etc/logstash// -- /tmp/logstash-multiple

logstash and pipeline relationship

- A Logstash instance can have multiple pipelines. If the pipeline id is not defined, the default is main pipeline.
	
	
	 - Each pipeline consists of three components, where the filter plugin is an optional component:
		 - input:
			 Where does the data come from?
			
		 - filter:
			 What plugins are used to process the data? This component is an optional component.
			
		 - output:
			 Where does the data go?
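
 A minimal pipeline therefore looks like this (generic sketch, not tied to any particular log source):

 input  { stdin {} }
 filter { mutate { add_field => { "note" => "the filter block is optional" } } }
 output { stdout { codec => rubydebug } }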

logstash collects nginx logs

1. Install nginx
 [root@elk3 ~]# apt install -y nginx


 Collect nginx
 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }
 filter {
   mutate {
     remove_field => [ "@version" ]
   }
 }

 output {
   stdout {
     codec => rubydebug
   }
 }
 [root@elk3 ~]# logstash -r -f /etc/logstash//
 {
           "host" => "elk3",
        "message" => "127.0.0.1 - - [26/Mar/2025:14:43:58 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
     "@timestamp" => 2025-03-26T06:45:24.375Z,
           "path" => "/var/log/nginx/"
 }
 {
           "host" => "elk3",
        "message" => "127.0.0.1 - - [26/Mar/2025:14:43:57 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
     "@timestamp" => 2025-03-26T06:45:24.293Z,
           "path" => "/var/log/nginx/"
 }
 {
           "host" => "elk3",
        "message" => "127.0.0.1 - - [26/Mar/2025:14:43:58 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
     "@timestamp" => 2025-03-26T06:45:24.373Z,
           "path" => "/var/log/nginx/"
 }

grok plugins

# Based on regular extraction
 Logstash ships with about 120 patterns by default. You can find them here: /logstash-plugins/logstash-patterns-core/tree/master/patterns.

 [root@elk3 ~]# cat /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.3.4/patterns/legacy/httpd
 HTTPDUSER %{EMAILADDRESS}|%{USER}
 HTTPDERROR_DATE %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}

 # Log formats
 HTTPD_COMMONLOG %{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes})
 HTTPD_COMBINEDLOG %{HTTPD_COMMONLOG} %{QS:referrer} %{QS:agent}

 # Error logs
 HTTPD20_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[%{LOGLEVEL:loglevel}\] (?:\[client %{IPORHOST:clientip}\] ){0,1}%{GREEDYDATA:message}
 HTTPD24_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[(?:%{WORD:module})?:%{LOGLEVEL:loglevel}\] \[pid %{POSINT:pid}(:tid %{NUMBER:tid})?\]( \(%{POSINT:proxy_errorcode}\)%{DATA:proxy_message}:)?( \[client %{IPORHOST:clientip}:%{POSINT:clientport}\])?( %{DATA:errorcode}:)?  %{GREEDYDATA:message}
 HTTPD_ERRORLOG %{HTTPD20_ERRORLOG}|%{HTTPD24_ERRORLOG}

 # Deprecated
 COMMONAPACHELOG %{HTTPD_COMMONLOG}
 COMBINEDAPACHELOG %{HTTPD_COMBINEDLOG}


 # Configure logstash
 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }
 filter {
   mutate {
     remove_field => [ "@version" ]
   }
   # Extract arbitrary text based on regularity and encapsulate it into a specific field.  Use the set template
   grok {
         match => { "message" => "%{HTTPD_COMMONLOG}" }
   }
 }

 output {
   stdout {
     codec => rubydebug
   }
 }

 [root@elk3 ~]# logstash -r -f /etc/logstash//
 {
         "message" => "192.168.121.1 - - [26/Mar/2025:14:52:06 +0800] \"GET / HTTP/1.1\" 200 396 \"-\" \"Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36\"",
            "path" => "/var/log/nginx/",
         "request" => "/",
        "clientip" => "192.168.121.1",
            "host" => "elk3",
       "timestamp" => "26/Mar/2025:14:52:06 +0800",
            "auth" => "-",
            "verb" => "GET",
        "response" => "200",
           "ident" => "-",
     "httpversion" => "1.1",
      "@timestamp" => 2025-03-26T06:52:07.342Z,
           "bytes" => "396"
 }

useragent plugins

Used to extract user's device information

 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }
 filter {
   mutate {
     remove_field => [ "@version" ]
   }
   grok {
         match => { "message" => "%{HTTPD_COMMONLOG}" }
   }
   useragent {
  	 # Specify which field to parse user device information
     source => 'message'
     # Store the parsed results in a specific field. If not specified, it will be placed in the top-level field by default.
     target => "xu-ua"
   }

 }

 output {
   stdout {
     codec => rubydebug
   }
 }

 [root@elk3 ~]# logstash -r -f /etc/logstash//

 {
         "message" => "192.168.121.1 - - [26/Mar/2025:16:45:10 +0800] \"GET / HTTP/1.1\" 200 396 \"-\" \"Mozilla/5.0 (Linux; Android 13; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36\"",
        "clientip" => "192.168.121.1",
       "timestamp" => "26/Mar/2025:16:45:10 +0800",
         "request" => "/",
           "bytes" => "396",
            "verb" => "GET",
     "httpversion" => "1.1",
      "@timestamp" => 2025-03-26T08:45:11.587Z,
            "host" => "elk3",
            "auth" => "-",
           "xu-ua" => {
               "name" => "Chrome Mobile",
            "version" => "134.0.0.0",
                 "os" => "Android",
            "os_name" => "Android",
         "os_version" => "13",
             "device" => "Samsung SM-G981B",
            "os_full" => "Android 13",
              "minor" => "0",
           "os_major" => "13",
              "patch" => "0",
              "major" => "134"
     },
           "ident" => "-",
            "path" => "/var/log/nginx/",
        "response" => "200"
 }

geoip plugins

Resolves public IP addresses to geographic information such as latitude and longitude coordinates

 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }
 filter {
   mutate {
     remove_field => [ "@version" ]
   }
   grok {
         match => { "message" => "%{HTTPD_COMMONLOG}" }
   }
   useragent {
     source => 'message'
     target => "xu-ua"
   }
   geoip {
	 source => "clientip"
   }

 }
 output {
   stdout {
     codec => rubydebug
   }
 }

 [root@elk3 ~]# logstash -r -f /etc/logstash//

 "geoip" => {
              "longitude" => -119.705,
          "country_code2" => "US",
            "region_name" => "Oregon",
               "timezone" => "America/Los_Angeles",
                     "ip" => "52.222.36.125",
         "continent_code" => "NA",
          "country_code3" => "US",
               "latitude" => 45.8401,
           "country_name" => "United States",
               "dma_code" => 810,
            "postal_code" => "97818",
            "region_code" => "OR",
               "location" => {
             "lat" => 45.8401,
             "lon" => -119.705
         },
              "city_name" => "Boardman"
     }

date plugins

[root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }
 filter {
   mutate {
     remove_field => [ "@version" ]
   }
   grok {
         match => { "message" => "%{HTTPD_COMMONLOG}" }
   }
   useragent {
     source => 'message'
     target => "xu-ua"
   }
   geoip {
	 source => "clientip"
   }
   date {
 	 # Parse the matched date string into a real date type so it is stored correctly in ES; the format string follows the official examples.
     # /guide/en/logstash/7.17/#plugins-filters-date-match
     match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
     # Write the parsed date into the specified field; if target is not defined, "@timestamp" is overwritten by default.
     target => "xu-timestamp"
   }

 }

 output {
   stdout {
     codec => rubydebug
   }
 }

 [root@elk3 ~]# logstash -r -f /etc/logstash//
 "xu-timestamp" => 2025-03-26T09:17:18.000Z,

mutate plugins

If we want to sum up the bandwidth we will find "bytes" => "396"
 It is a string type and cannot be summed, so the mutate plugin is used to convert the type

 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }
 filter {
   mutate {
     convert => {
       "bytes" => "integer"
     }
     remove_field => [ "@version" ]
   }
   grok {
         match => { "message" => "%{HTTPD_COMMONLOG}" }
   }
   useragent {
     source => 'message'
     target => "xu-ua"
   }
   geoip {
	 source => "clientip"
   }
   date {
     match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
     target => "xu-timestamp"
   }

 }

 output {
   stdout {
     codec => rubydebug
   }
 }

logstash collect log output to es

[root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }


 filter {
   grok {
     match => { "message" => "%{HTTPD_COMMONLOG}" }
   }

   useragent {
     source => "message"
     target => "xu_user_agent"
   }

   geoip {
     source => "clientip"
   }

   date {
     match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
     target => "xu-timestamp"
   }
 
   # Convert the specified fields
   mutate {
     # Convert the specified field to the type we need to convert
     convert => {
       "bytes" => "integer"
     }
    
     remove_field => [ "@version","host","message" ]
   }
 }

 output {
   stdout {
     codec => rubydebug
   }

   elasticsearch {
       # List of corresponding ES cluster hosts
       hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
       # The index name of the corresponding ES cluster
       index => "xu-elk-nginx"
   }
 }

 Existing problems:
	 Failed (timed out waiting for connection to open). Sleeping for 0.02

 Problem description:
	 With ElasticStack version 7.17.28, Logstash cannot write to ES.
	
 TODO:
	 Investigate whether upstream changed something so that writes fail and additional parameter configuration is required.

 Temporary solutions:
	 - Roll back to version 7.17.23.
	 - Comment out the geoip configuration

Solve the problem of the geoip plugin taking a long time when writing to es

By checking the official website, we can see that the geoip filter can point to a local database; specifying the database file solves this problem.

 1. Check out Logstash local default geoip plugin
 [root@elk3 ~]# tree /usr/share/logstash/data/plugins/filters/geoip/1742980310/
 /usr/share/logstash/data/plugins/filters/geoip/1742980310/
 ├──
 ├──
 ├──
 ├──
 ├──
 ├──
 ├──
 └──

 0 directories, 8 files

 2. Configure logstash
 [root@elk3 ~]# cat /etc/logstash//
 input {
   file {
     path => "/var/log/nginx/"
     start_position => "beginning"
   }
 }

 filter {
   mutate {
     convert => {
       "bytes" => "integer"
     }
     remove_field => [ "@version" ]
   }
   grok {
         match => { "message" => "%{HTTPD_COMMONLOG}" }
   }
   useragent {
     source => 'message'
     target => "xu-ua"
   }
   geoip {
	 source => "clientip"
	 database => "/usr/share/logstash/data/plugins/filters/geoip/1742980310/"
   }
   date {
     match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
     target => "xu-timestamp"
   }

 }

 output {
   stdout {
     codec => rubydebug
   }
   elasticsearch {
	 index => "xu-logstash"
	 hosts => ["http://192.168.121.91:9200","http://192.168.121.92:9200","http://192.168.121.93:9200"]
   }
 }
 [root@elk3 ~]#

Solve the problem of incorrect data type

At this point the latitude and longitude fields are mapped as float, so Kibana cannot plot them on a map (a geo_point mapping is needed).

Create index templates in kibana
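
 One way is to create an index template before the index is (re)created, so that geoip.location is mapped as geo_point instead of float. A sketch for the Kibana Dev Tools console, assuming the xu-logstash index name used above (delete and rebuild the index afterwards so the new mapping takes effect):

 PUT _template/xu-logstash
 {
   "index_patterns": ["xu-logstash*"],
   "mappings": {
     "properties": {
       "geoip": {
         "properties": {
           "location": { "type": "geo_point" }
         }
       }
     }
   }
 }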

ELFK architecture

json plugin case

Data flow: filebeat ---> logstash

 logstash receives the data collected by filebeat, so logstash must be started before filebeat.
 That is, the input plugin type on the logstash side is beats.

 # Configure logstash
 [root@elk3 ~]# grep -v "^#" /etc/logstash//
 input {
   beats {
       port => 5044
   }
 }

 filter {
   mutate {
     remove_field => [ "@version","host","agent","ecs","tags","input","log" ]
   }
   json {
      source => "message"
    }
 }

 output {
   stdout {
     codec => rubydebug
   }
 }


 # Configure filebeat
 [root@elk1 ~]# cat /etc/filebeat/config/
 :
 - type: filestream
   paths:
     - /tmp/
 :
   hosts: ["192.168.121.93:5044"]

 # Start logstash first, then start filebeat
 [root@elk3]# logstash -rf
 [root@elk3 ~]# netstat -tunlp | grep 5044
 tcp6 0 0 :::5044 :::* LISTEN 120181/java

 # Start filebeat
 [root@elk1 ~]# filebeat -e -c /etc/filebeat/config/


 // Prepare test data
 {
   "name":"aaa",
   "hobby":["Writing novels", "Singing"]
 }
 {
   "name":"bbb",
   "hobby":["Fitness", "Billiards", "Dou Dou"]
 }
 {
   "name":"ccc",
   "hobby":["Table tennis", "Swimming", "Game"]
 }
 {
    "name": "ddd",
    "hobby": ["playing games", "playing basketball"]
 }

 # Check the collection results. Since the collection rules of filebeat are collected by row, it has collected multiple pieces of data we prepared.
 "message" => " \"name\": \"dd\",",
 "message" => " \"hobby\": [\"play game\",\"play basketball\"]",
 ...

 # Multi-line merge of filebeat is required for processing
 [root@elk1 ~]# cat /etc/filebeat/config/
 :
 - type: filestream
   paths:
     - /tmp/
   parsers:
     - multiline:
         type: count
         count_lines: 4
 :
   hosts: ["192.168.121.93:5044"]

 # Check the data collection situation
 {
        "message" => "{\n \"name\":\"aaa\",\n \"hobby\":[\"Writing novels\",\"Singing\"]\n}",
           "name" => "aaa",
          "hobby" => [
         [0] "Writing a novel",
         [1] "Singing"
     ],
     "@timestamp" => 2025-03-27T07:46:14.390Z
 }

Write to es

[root@elk3 ~]# grep -v "^#" /etc/logstash//
input { 
  beats {
      port => 5044
  }
} 

filter {
  mutate {
    remove_field => [ "@version","host","agent","ecs","tags","input","log" ]
  }
  json {
     source => "message"
   }
}

output { 
  stdout { 
    codec => rubydebug 
  } 
  elasticsearch {
	hosts => ["http://192.168.121.91:9200"]
  }
}

ELFK architecture case: e-commerce indicator analysis project

1. Generate test data
 [root@elk1 ~]# cat
 #!/usr/bin/env python
 # -*- coding: UTF-8 -*-
 # @author : Jason Yin

 import datetime
 import random
 import logging
 import time
 import sys

 LOG_FORMAT = "%(levelname)s %(asctime)s [.%(module)s] - %(message)s "
 DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

 # Basic configuration of the root logger; the log file path is taken from the first command line argument
 logging.basicConfig(level=logging.INFO, format=LOG_FORMAT, datefmt=DATE_FORMAT, filename=sys.argv[1], filemode='a',)
 actions = ["Browse page", "Comment item", "Add to favorites", "Add to cart", "Submit order", "Use coupons", "Receive coupons",
  "Search", "View Order", "Pay", "Clear Shopping Cart"]

 while True:
     # Sleep 1-5 seconds between events
     time.sleep(random.randint(1, 5))
     user_id = random.randint(1, 10000)
     # Keep 2 decimal places for the generated floating point number.
     price = round(random.uniform(15000, 30000), 2)
     action = random.choice(actions)
     svip = random.choice([0, 1, 2])
     logging.info("DAU|{0}|{1}|{2}|{3}".format(user_id, action, svip, price))
 [root@elk1 ~]# python3 /tmp/

 2. View data content
 [root@elk1 ~]# tail -f /tmp/
 ...
 INFO 2025-03-27 17:03:10 [-log] - DAU|7973|Add to Cart|0|19300.65
 INFO 2025-03-27 17:03:13 [-log] - DAU|8617|Add to Cart|2|19720.57
 INFO 2025-03-27 17:03:14 [-log] - DAU|6879|Search|2|24774.85
 INFO 2025-03-27 17:03:19 [-log] - DAU|804| Reading|2|21352.22
 INFO 2025-03-27 17:03:22 [-log] - DAU|3014|Clear the shopping cart|0|19908.62
 ...

 # Start logstash instance
 [root@elk3]# cat 06-beats_apps
 input {
   beats {
       port => 9999
   }
 }
 filter {
 mutate {
     split => { "message" => "|" }

     add_field => {
       "other" => "%{[message][0]}"
       "userId" => "%{[message][1]}"
       "action" => "%{[message][2]}"
       "svip" => "%{[message][3]}"
       "price" => "%{[message][4]}"
     }
 }
 mutate{
	 split => { "other" => " " }

     add_field => {
        datetime => "%{[other][1]} %{[other][2]}"
     }
    
     convert => {
        "price" => "float"
      }
	 remove_field => [ "@version","host","agent","ecs","tags","input","log","message","other"]
   }
 }
 output {
 # stdout {
 # codec => rubydebug
 # }

   elasticsearch {
      index => "linux96-logstash-elfk-apps"
      hosts => ["http://192.168.121.91:9200","http://192.168.121.92:9200","http://192.168.121.93:9200"]
   }
 }
 # Start filebeat instance
 [root@elk1 ~]# cat /etc/filebeat/config/
 :
 - type: filestream
   paths:
     - /tmp/

 :
   hosts: ["192.168.121.93:9999"]

ELK architecture

logstash if statement

Logstash supports if statements. With multiple inputs, different filters and different outputs can be applied per input through if / else if / else.

 # Configure logstash if
 [root@elk3 ~]# cat /etc/logstash//
 input {
	 beats {
		 port => 9999
		 type => "xu-filebeat"
	 }
	 file {
		 path => "/var/log/nginx/"
		 start_position => "beginning"
    	 type => "xu-file"
	 }
	 tcp {
		 port => 8888
    	 type => "xu-tcp"
	 }
 }

 filter {
	 if [type] == "xu-tcp" {
	    mutate {
        add_field => {
           school => "school1"
           class => "one"
        }
        remove_field => [ "@version","port"]
      }
	 } else if [type] == "xu-filebeat" {
		 mutate {
         split => { "message" => "|" }

         add_field => {
           "other" => "%{[message][0]}"
           "userId" => "%{[message][1]}"
           "action" => "%{[message][2]}"
           "svip" => "%{[message][3]}"
           "price" => "%{[message][4]}"
           "address" => "1.1.1.1"
         }

       }

       mutate {

         split => { "other" => " " }

         add_field => {
            datetime => "%{[other][1]} %{[other][2]}"
         }
        
         convert => {
            "price" => "float"
          }

         remove_field => [ "@version","host","agent","ecs","tags","input","log","message","other"]
       }

       date {
         match => [ "datetime", "yyyy-MM-dd HH:mm:ss" ]
       }
	 } else {
		 grok {
         match => { "message" => "%{HTTPD_COMMONLOG}" }
       }
    
       useragent {
         source => "message"
         target => "xu_user_agent"
       }
    
       geoip {
         source => "clientip"
         database => "/usr/share/logstash/data/plugins/filters/geoip/CC/"
       }
    
       date {
         match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
         target => "xu-timestamp"
       }
     
       mutate {
         convert => {
           "bytes" => "integer"
         }
        
         add_field => {
            office => ""
         }

         remove_field => [ "@version","host","message" ]
       }
	 }
 }

 output {
	 if [type] == "xu-filebeat" {
       elasticsearch {
          index => "xu-logstash-if-filebeat"
          hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
       }
   } else if [type] == "xu-tcp" {
       elasticsearch {
          index => "xu-logstash-if-tcp"
          hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
       }

   }else {
       elasticsearch {
          index => "xu-logstash-if-file"
          hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
       }
   }
 }
 [root@elk3 ~]#

pipeline

 # pipelines.yml configuration file location
 [root@elk3 ~]# ll /etc/logstash/
 -rw-r--r-- 1 root root 285 Feb 18 18:52 /etc/logstash/

 # Modify the pipelines.yml configuration file
 [root@elk3 ~]# tail -4 /etc/logstash/
 - pipeline.id: xixi
   path.config: "/etc/logstash//"
 - pipeline.id: haha
   path.config: "/etc/logstash//"

 # Start logstash, you can start directly through logstash -r without specifying the configuration file
 [root@elk3 ~]# logstash -r
	 # There will be an error directly ERROR: Failed to read pipelines yaml file. Location: /usr/share/logstash/config/
	 # logstash will search for this file by default in /usr/share/logstash/config/. At this time, we can make a soft link.

 # Configure soft links
 [root@elk3 ~]# mkdir /usr/share/logstash/config/
 [root@elk3 ~]# ln -svf /etc/logstash/ /usr/share/logstash/config/
 '/usr/share/logstash/config/' -> '/etc/logstash/'
 [root@elk3 ~]# logstash -r
 ...
 [INFO ] 2025-03-29 10:16:50.372 [[xixi]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"xixi"}
 [INFO ] 2025-03-29 10:16:54.380 [[haha]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"haha"}
 ...

ES cluster security

Basic auth based encryption

es cluster encryption

# It can be accessed normally before configuring encryption
 [root@elk1 ~]# curl 127.1:9200/_cat/nodes
 192.168.121.92 6 94 0 0.05 0.03 0.00 cdfhilmrstw - elk2
 192.168.121.91 22 95 4 0.25 0.27 0.25 cdfhilmrstw * elk1
 192.168.121.93 29 94 6 0.02 0.25 0.48 cdfhilmrstw - elk3


 1 Generate certificate file
 [root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert -out /etc/elasticsearch/elastic-certificates.p12 -pass ""
 ...
 Certificates written to /etc/elasticsearch/elastic-certificates.p12

 This file should be properly secured as it contains the private key for
 your instance.

 This file is a self contained file and can be copied and used 'as is'
 For each Elastic product that you wish to configure, you should copy
 this '.p12' file to the relevant configuration directory
 and then follow the SSL configuration instructions in the product guide.


 2. Copy the certificate file to other nodes
 [root@elk1 ~]# chmod 640 /etc/elasticsearch/elastic-certificates.p12
 [root@elk1 ~]# scp -r /etc/elasticsearch/elastic-certificates.p12 192.168.121.92:/etc/elasticsearch/elastic-certificates.p12
 [root@elk1 ~]# scp -r /etc/elasticsearch/elastic-certificates.p12 192.168.121.93:/etc/elasticsearch/elastic-certificates.p12

 3. Modify the configuration file of the ES cluster and synchronize it to all nodes
 [root@elk1 ~]# tail -5 /etc/elasticsearch/
 xpack.security.enabled: true
 xpack.security.transport.ssl.enabled: true
 xpack.security.transport.ssl.verification_mode: certificate
 xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
 xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
 [root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.92:/etc/elasticsearch/
 [root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.93:/etc/elasticsearch/

 4. Restart es
 [root@elk1 ~]# systemctl restart

 # You can't access directly at this time
 [root@elk1 ~]# curl 127.1:9200
 {"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\"  charset=\"UTF-8\""}},"status":401}


 5. Generate random passwords
 [root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto

 Changed password for user apm_system
 PASSWORD apm_system = aBsQ3WI9ydUVTx2hk2JT

 Changed password for user kibana_system
 PASSWORD kibana_system = xoMBWbFyYmadDyrYcwyI

 Changed password for user kibana
 PASSWORD kibana = xoMBWbFyYmadDyrYcwyI

 Changed password for user logstash_system
 PASSWORD logstash_system = fWx19jXFHinpraglh8E

 Changed password for user beats_system
 PASSWORD beats_system = NgKipgH0LfnFGFAazun6

 Changed password for user remote_monitoring_user
 PASSWORD remote_monitoring_user = Af4hu6PrhPYvn2S5zcEj

 Changed password for user elastic
 PASSWORD elastic = 0Nj2dpMTSNYurPqQHInA


 [root@elk1 ~]# curl -u elastic:MSfRhWKA3lRhufYpxF9u 127.1:9200/_cat/nodes
 192.168.121.91 40 96 22 0.62 0.74 0.53 cdfhilmrstw - elk1
 192.168.121.92 17 96 20 0.44 0.67 0.36 cdfhilmrstw * elk2
 192.168.121.93 23 96 32 0.54 1.00 0.73 cdfhilmrstw - elk3


 6. Kibana connects to es
	 6.1 Modify the kibana configuration file
 [root@elk1 ~]# tail -2 /etc/kibana/
 : "kibana_system"
 : "47UD4ZOypuWO100QciH4"
	 6.2 Restart kibana
 [root@elk1 ~]# systemctl restart
	 6.3 Access kibana through the web

Reset es password

The es cluster has a superuser role similar to the root user. We can create a user with the superuser role and use it to change the elastic password.

 1. Create a Super Administrator Role
 [root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-users useradd xu -p 123456 -r superuser

 2. Modify password based on administrator
 [root@elk1 ~]# curl -s --user xu:123456 -XPUT "http://localhost:9200/_xpack/security/user/elastic/_password?pretty" -H 'Content-Type: application/json' -d'
      {
        "password" : "654321"
      }'
     
 [root@elk1 ~]# curl -uelastic:654321 127.1:9200/_cat/nodes
 192.168.121.91 35 96 7 0.38 0.43 0.52 cdfhilmrstw - elk1
 192.168.121.92 20 96 2 0.20 0.20 0.25 cdfhilmrstw * elk2
 192.168.121.93 27 97 5 0.10 0.18 0.38 cdfhilmrstw - elk3

filebeat docking es encryption

[root@elk1 ~]# cat /etc/filebeat/config/07-tcp-to-es_tls.yaml
 filebeat.inputs:
 - type: tcp
   host: "0.0.0.0:9000"


 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   # Specify the user name to connect to the ES cluster
   username: "elastic"
   # Specify the password to connect to the ES cluster
   password: "654321"
   index: xu-es-tls-filebeat

 setup.ilm.enabled: false
 setup.template.name: "xu-es-tls-filebeat"
 setup.template.pattern: "xu-es-tls-filebeat-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 3
   index.number_of_replicas: 0

logstash docking es encryption

[root@elk3 ~]# cat /etc/logstash//09-tcp-to-es_tls.conf
input { 
  tcp {
    port => 8888
  }
}  

output { 

  elasticsearch {
    hosts => ["192.168.121.91:9200","192.168.121.92:9200","192.168.121.93:9200"]
    index => "oldboyedu-logstash-tls-es"
    user => elastic
    password => "654321"
  }
}

api-key

Why enable api-key
	 For security: authenticating with a username and password exposes the user's credentials.
	 ElasticSearch also supports api-key authentication, which is safer: an api-key cannot be used to log in to kibana.
	 Permission control can also be implemented per api-key.

 By default, elasticsearch does not enable the api-key function; it has to be turned on in the configuration file.

Start the es api function

[root@elk1 ~]# tail /etc/elasticsearch/
 # Enable the api_key function
 xpack.security.authc.api_key.enabled: true
 # Specify the api_key hashing algorithm
 xpack.security.authc.api_key.hashing.algorithm: pbkdf2
 # How long api_keys are cached
 xpack.security.authc.api_key.cache.ttl: 1d
 # Upper limit on the number of cached api_keys
 xpack.security.authc.api_key.cache.max_keys: 10000
 # Hash algorithm for api_key credentials cached in memory
 xpack.security.authc.api_key.cache.hash_algo: ssha256

 [root@elk1 ~]# !scp
 scp /etc/elasticsearch/ 192.168.121.93:/etc/elasticsearch/
 [email protected]'s password:
                                                                                                                                           100% 4270 949.6KB/s 00:00
 [root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.92:/etc/elasticsearch/
 [email protected]'s password:


 [root@elk1 ~]# systemctl restart

Create API

# parse api
 [root@elk1 ~]# echo "TzBCTzY1VUJiWUdnVHlBNjZRTXc6eE9JWW9wT3dTT09Sam1UNE5RYnRjUQ==" | base64 -d ;echo
 O0BO65UBbYGgTyA66QMw:xOIYopOwSOORjmT4NQbtcQ
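
 # The encoded value can be passed directly in an Authorization header, which is a handy way to test a key before putting it into filebeat (a sketch, reusing the encoded key above):
 [root@elk1 ~]# curl -H "Authorization: ApiKey TzBCTzY1VUJiWUdnVHlBNjZRTXc6eE9JWW9wT3dTT09Sam1UNE5RYnRjUQ==" 127.1:9200/_cat/nodes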


 # Configure filebeat
 [root@elk1 ~]# cat /etc/filebeat/config/07-tcp-to-es_tls.yaml
 filebeat.inputs:
 - type: tcp
   host: "0.0.0.0:9000"


 output.elasticsearch:
   hosts:
   - 192.168.121.91:9200
   - 192.168.121.92:9200
   - 192.168.121.93:9200
   #username: "elastic"
   #password: "654321"
   api_key: zvWA4JUBqFmHNaf3P8bM:d-goeFONRPelMuRxSr2Bxg
   index: xu-es-tls-filebeat

 setup.ilm.enabled: false
 setup.template.name: "xu-es-tls-filebeat"
 setup.template.pattern: "xu-es-tls-filebeat-*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 3
   index.number_of_replicas: 0
 [root@elk1 ~]# filebeat -e -c /etc/filebeat/config/07-tcp-to-es_tls.yaml

Create api-key and implement permission management based on ES

Reference link:
/guide/en/beats/filebeat/7.17/
/guide/en/elasticsearch/reference/7.17/#privileges-list-cluster
/guide/en/elasticsearch/reference/7.17/#privileges-list-indices

1. Create api-key
 # Send a request
 POST /_security/api_key
 {
   "name": "jasonyin2020",
   "role_descriptors": {
     "filebeat_monitoring": {
       "cluster": ["all"],
       "index": [
         {
           "names": ["xu-es-apikey*"],
           "privileges": ["create_index", "create"]
         }
       ]
     }
   }
 }
 # Return data
 {
   "id" : "0vXs4ZUBqFmHNaf3s8Zn",
   "name" : "jasonyin2020",
   "api_key" : "y1Vi5fL6RfGy_B47YWBXcw",
   "encoded" : "MHZYczRaVUJxRm1ITmFmM3M4Wm46eTFWaTVmTDZSZkd5X0I0N1lXQlhjdw=="
 }


 #Analysis
 [root@elk1 ~]# echo MHZYczRaVUJxRm1ITmFmM3M4Wm46eTFWaTVmTDZSZkd5X0I0N1lXQlhjdw== | base64 -d ;echo
 0vXs4ZUBqFmHNaf3s8Zn:y1Vi5fL6RfGy_B47YWBXcw

https

es cluster configuration https

1. Build a self-built CA certificate
 [root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil ca --out /etc/elasticsearch/elastic-stack-ca.p12 --pass ""
 [root@elk1 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12
 -rw------- 1 root elasticsearch 2672 Mar 29 20:44 /etc/elasticsearch/elastic-stack-ca.p12


 2. Create ES certificate based on CA certificate
 [root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca /etc/elasticsearch/elastic-stack-ca.p12 --out /etc/elasticsearch/elastic-certificates-https.p12 --pass "" --days 3650 --ca-pass ""
 [root@elk1 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12
 -rw------- 1 root elasticsearch 2672 Mar 29 20:44 /etc/elasticsearch/elastic-stack-ca.p12
 [root@elk1 ~]# ll /etc/elasticsearch/elastic-certificates-https.p12
 -rw------- 1 root elasticsearch 3596 Mar 29 20:48 /etc/elasticsearch/elastic-certificates-https.p12


 3. Modify the configuration file
 [root@elk1 ~]# tail -2 /etc/elasticsearch/
 xpack.security.http.ssl.enabled: true
 xpack.security.http.ssl.keystore.path: elastic-certificates-https.p12

 [root@elk1 ~]# chmod 640 /etc/elasticsearch/elastic-certificates-https.p12

 [root@elk1 ~]# scp -rp /etc/elasticsearch/elastic{-certificates-https.p12,} 192.168.121.92:/etc/elasticsearch/
 [email protected]'s password:
 elastic-certificates-https.p12 100% 3596 1.6MB/s 00:00
                                                                                                                                           100% 4378 6.0MB/s 00:00
 [root@elk1 ~]# scp -rp /etc/elasticsearch/elastic{-certificates-https.p12,} 192.168.121.93:/etc/elasticsearch/
 [email protected]'s password:
 elastic-certificates-https.p12 100% 3596 894.2KB/s 00:00
  

 4. Restart the ES cluster
 [root@elk1 ~]# systemctl restart
 [root@elk2 ~]# systemctl restart
 [root@elk3 ~]# systemctl restart

 [root@elk1 ~]# curl https://127.1:9200/_cat/nodes -u elastic:654321 -k
 192.168.121.92 16 94 63 1.88 0.92 0.35 cdfhilmrstw - elk2
 192.168.121.91 14 96 30 0.79 0.90 0.55 cdfhilmrstw * elk1
 192.168.121.93 8 97 53 1.22 0.71 0.33 cdfhilmrstw - elk3

 5. Modify the configuration of kibana to skip self-built certificate verification
 [root@elk1 ~]# vim /etc/kibana/
 ...
 # The addresses pointing to the ES cluster now use the https protocol
 elasticsearch.hosts: ["https://192.168.121.91:9200","https://192.168.121.92:9200","https://192.168.121.93:9200"]
 # Skip certificate verification
 elasticsearch.ssl.verificationMode: none
 [root@elk1 ~]# systemctl restart

filebeat docking https encryption

# Write filebeat configuration file
 [root@elk92 filebeat]# cat
 filebeat.inputs:
 - type: tcp
   host: "0.0.0.0:9000"


 output.elasticsearch:
   hosts:
   - https://192.168.121.91:9200
   - https://192.168.121.92:9200
   - https://192.168.121.93:9200
   api_key: "m1wPlJUBrDbi_DeiIc-1:RcEw7Mk2QQKH_CGhMBnfbg"
   index: xu-es-apikey-tls-2025
   # Configure tls for es cluster, skip certificate verification here.  The default value is: full
   # Reference link:
   # /guide/en/beats/filebeat/7.17/#client-verification-mode
   ssl.verification_mode: none

 setup.ilm.enabled: false
 setup.template.name: "xu"
 setup.template.pattern: "xu*"
 setup.template.overwrite: true
 setup.template.settings:
   index.number_of_shards: 3
   index.number_of_replicas: 0

logstash docking https encryption

[root@elk93 logstash]# cat 13-tcp-to-es_api
 input {
   tcp {
     port => 8888
   }
 }

 output {

   elasticsearch {
     hosts => ["192.168.121.91:9200","192.168.121.92:9200","192.168.121.93:9200"]
     index => "xu-api-key"
     #user => elastic
     #password => "123456"xu
     # Specify the api-key authentication method
     api_key => "oFwZlJUBrDbi_DeiLc9O:HWBj0LC2RWiUNTudV-6CBw"
   
     # Use api-key to start ssl
     ssl => true

     # Skip SSL certificate verification
     ssl_certificate_verification => false
   }
 }
 [root@elk93 logstash]#
 [root@elk93 logstash]# logstash -rf 13-tcp-to-es_api

Implementing RBAC based on kibana

Reference link:
/guide/en/elasticsearch/reference/7.17/

Create a role

Create a user
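
 The same can be done without the Kibana UI through the security API. A sketch for the Kibana Dev Tools console; the role name, user name and password below are examples only:

 POST /_security/role/xu-reader
 {
   "indices": [
     { "names": [ "xu-*" ], "privileges": [ "read", "view_index_metadata" ] }
   ]
 }

 POST /_security/user/xu-dev
 {
   "password": "changeme-123456",
   "roles": [ "xu-reader" ],
   "full_name": "Example read-only user"
 }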

ES8 deployment

Single-point deployment of ES8 clusters

Environmental preparation:
	 192.168.121.191 elk191
	 192.168.121.192 elk192
	 192.168.121.193 elk193

 1. Get the installation package and install es8
 [root@elk191 ~]# wget /downloads/elasticsearch/elasticsearch-8.17.
 [root@elk191 ~]# dpkg -i elasticsearch-8.17.


 # es8 supports https by default
 --------------------------- Security autoconfiguration information ------------------------------

 Authentication and authorization are enabled.
 TLS for the transport and HTTP layers is enabled and configured.

 The generated password for the elastic built-in superuser is: P0-MRYuCOTFj*4*rGNZk # The built-in elastic superuser password

 If this node should join an existing cluster, you can reconfigure this with
 '/usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <token-here>'
 After creating an enrollment token on your existing cluster.

 You can complete the following actions at any time:

 Reset the password of the elastic built-in superuser with
 '/usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic'.

 Generate an enrollment token for Kibana instances with
  '/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana'.

 Generate an enrollment token for Elasticsearch nodes with
 '/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node'.

 -------------------------------------------------------------------------------------------------
 ### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
  sudo systemctl daemon-reload
  sudo systemctl enable
 ### You can start elasticsearch service by executing
  sudo systemctl start


 2. Start es8
 [root@elk191 ~]# systemctl enable --now
 Created symlink /etc/systemd/system// → /lib/systemd/system/.
 [root@elk191 ~]# netstat -tunlp | grep -E "9[2|3]00"
 tcp6 0 0 127.0.0.1:9300 :::* LISTEN 1669/java
 tcp6 0 0 ::1:9300 :::* LISTEN 1669/java
 tcp6 0 0 :::9200 :::* LISTEN 1669/java

 3. Test access
 [root@elk191 ~]# curl -u elastic:NVPLcMy0_n8aGL=UGAGc https://127.1:9200 -k
 {
   "name" : "elk191",
   "cluster_name" : "elasticsearch",
   "cluster_uuid" : "-cw1TGvZSau0J2x-ThOJsg",
   "version" : {
     "number" : "8.17.3",
     "build_flavor" : "default",
     "build_type" : "deb",
     "build_hash" : "a091390de485bd4b127884f7e565c0cad59b10d2",
     "build_date" : "2025-02-28T10:07:26.089129809Z",
     "build_snapshot" : false,
     "lucene_version" : "9.12.0",
     "minimum_wire_compatibility_version" : "7.17.0",
     "minimum_index_compatibility_version" : "7.0.0"
   },
   "tagline" : "You Know, for Search"
 }
 [root@elk191 ~]# curl -u elastic:NVPLcMy0_n8aGL=UGAGc https://127.1:9200/_cat/nodes -k
 127.0.0.1 9 97 13 0.35 0.59 0.31 cdfhilmrstw * elk191

Deploy kibana8

1. Get the installation package and install kibana
 [root@elk191 ~]# wget /downloads/kibana/kibana-8.17.
 [root@elk191 ~]# dpkg -i kibana-8.17.

 2. Configure kibana
 [root@elk191 ~]# grep -vE "^$|^#" /etc/kibana/
 server.port: 5601
 server.host: "0.0.0.0"
 logging:
   appenders:
     file:
       type: file
       fileName: /var/log/kibana/
       layout:
         type: json
   root:
     appenders:
       - default
       - file
 pid.file: /run/kibana/
 i18n.locale: "zh-CN"

 3. Start kibana
 [root@elk191 ~]# systemctl enable --now
 [root@elk191 ~]# ss -ntl | grep 5601
 LISTEN 0 511 0.0.0.0:5601 0.0.0.0:*

 4. Generate a kibana dedicated token
 [root@elk191 ~]# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana
 eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiZmNjMWI3MzJlNzIwMzMzMjI0ZDc5Zjk1YTUyZjIzZmUyNjMzMzYwZDIxY2Q0NzY3YjQ2ZjExZDhiOGYxZTFlZiIsImtleSI6IjdjNTk3SlVCeEI5S3NHd1ZPWVQ5OmYtN0FRWkhEUTVtMnlCZXdiMnJLbXcifQ==

 The server obtains the verification code
 [root@elk191 ~]# /usr/share/kibana/bin/kibana-verification-code
 Your verification code is: 414 756

es8 cluster deployment

1. Copy the configuration file to other nodes
 [root@elk191 ~]# scp elasticsearch-8.17. 10.0.0.192:~
 [root@elk191 ~]# scp elasticsearch-8.17. 10.0.0.193:~

 2. Install ES8 software packages from other nodes
 [root@elk192 ~]# dpkg -i elasticsearch-8.17.
 [root@elk193 ~]# dpkg -i elasticsearch-8.17.

 #Configuration es8
 [root@elk191 ~]# grep -Ev "^$|^#" /etc/elasticsearch/
 cluster.name: xu-application
 path.data: /var/lib/elasticsearch
 path.logs: /var/log/elasticsearch
 network.host: 0.0.0.0
 discovery.seed_hosts: ["192.168.121.191","192.168.121.192","192.168.121.193"]
 cluster.initial_master_nodes: ["192.168.121.191","192.168.121.192","192.168.121.193"]
 xpack.security.enabled: true
 xpack.security.enrollment.enabled: true
 xpack.security.http.ssl:
   enabled: true
   keystore.path: certs/http.p12
 xpack.security.transport.ssl:
   enabled: true
   verification_mode: certificate
   keystore.path: certs/transport.p12
   truststore.path: certs/transport.p12
 http.host: 0.0.0.0

 3. Generate token token file in any node of the existing cluster
 [root@elk191 ~]# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node

 4. Use token to reconfigure the configuration file of the new node when the node is to be joined
 [root@elk192 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token  eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiMzIwODY0YzMxNmEyMDQ4YmIwYzVjNDNhY2FlZjQ4MTg2OTM3MmVhNTg2NjdiYTAwMjBjN2Y2ZTczN2YzNWU0MCIsImtleSI6IkE3RTY4SlVCU1BhTWhMRFN0VWdlOmdaM0dIS0RNUndld3o3ZWM0Qk1ySEEifQ==
 [root@elk193 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token  eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiMzIwODY0YzMxNmEyMDQ4YmIwYzVjNDNhY2FlZjQ4MTg2OTM3MmVhNTg2NjdiYTAwMjBjN2Y2ZTczN2YzNWU0MCIsImtleSI6IkE3RTY4SlVCU1BhTWhMRFN0VWdlOmdaM0dIS0RNUndld3o3ZWM0Qk1ySEEifQ==

 5. Synchronize configuration files
 [root@elk191 ~]# scp /etc/elasticsearch/ 192.168.121.192:/etc/elasticsearch/
 [root@elk191 ~]# scp /etc/elasticsearch/ 192.168.121.193:/etc/elasticsearch/


 6. Start the client es
 [root@elk192 ~]# systemctl enable --now
 [root@elk193 ~]# systemctl enable --now

 7. Access Test
 [root@elk193 ~]# curl -u elastic:123456 -k https://192.168.121.191:9200/_cat/nodes
 192.168.121.191 17 97 10 0.61 0.55 0.72 cdfhilmrstw * elk191
 192.168.121.193 15 97 55 1.72 1.05 0.49 cdfhilmrstw - elk193
 192.168.121.192 13 97 4 0.25 0.45 0.52 cdfhilmrstw - elk192

Common Errors

1. Common error handling Q1:
 ERROR: Aborting enrolling to cluster. Unable to remove existing secure settings. Error was: Aborting enrolling to cluster. Unable to remove existing security configuration, did not contain expected setting [autoconfiguration.password_hash]., with exit code 74


 Problem analysis:
 It means that the local security configuration has been set.  Delete the previous configuration "".


 Solution:
 rm -f /etc/elasticsearch/




 Common error handling Q2:
 ERROR: Skipping security auto configuration because this node is configured to bootstrap or to join a multi-node cluster, which is not supported., with exit code 80


 Solution:
 export IS_UPGRADE=false


 Common error handling Q3:
 ERROR: Aborting enrolling to cluster. This node doesn't appear to be auto-configured for security. Expected configuration is missing from ., with exit code 64


 Error analysis:
 Checking the configuration file shows that the security-related configuration is missing, probably because synchronization failed.

 Solution:
 Modify "/etc/elasticsearch/" and add the security configuration; you can copy the relevant configuration from the elk192 node manually.

 If that still does not solve it, compare the configuration of the elk191 node with that of elk192 and copy over whatever differs. In my test, the certs directory was the missing piece.


 [root@elk191 ~]# scp -rp /etc/elasticsearch/certs/ 10.0.0.192:/etc/elasticsearch/
 [root@elk191 ~]# scp /etc/elasticsearch/ 10.0.0.192:/etc/elasticsearch/
 [root@elk191 ~]# scp /etc/elasticsearch/ 10.0.0.192:/etc/elasticsearch/
 [root@elk191 ~]# scp -rp /etc/elasticsearch/ 10.0.0.192:/etc/elasticsearch/


 ERROR: Aborting enrolling to cluster. Unable to remove existing secure settings. Error was: Aborting enrolling to cluster. Unable to remove existing security configuration, did not contain expected setting [.secure_password]., with exit code 74

The difference between es8 and es7

- Comparison of ES8 and ES7 deployment
	 1. ES8 enables https by default and supports authentication and related features out of the box;
	 2. ES8 adds the 'elasticsearch-reset-password' script, which makes it easier to reset the elastic user's password;
	 3. ES8 adds the 'elasticsearch-create-enrollment-token' script, which creates enrollment tokens for components such as kibana;
	 4. ES8 adds the 'kibana-verification-code' script on the kibana side to generate a verification code;
	 5. Supports more locales: English (default) "en", Chinese "zh-CN", Japanese "ja-JP", French "fr-FR";
	 6. The web UI is richer, supporting an AI assistant, manual index creation and other features;
	 7. When deploying an ES8 cluster, the 'elasticsearch-reconfigure-node' script is used to join nodes to the existing cluster; the default configuration is a single master node.

ES7 JVM Tuning

1. By default, the JVM heap consumes half of the physical machine's memory
 [root@elk91 ~]# ps -ef | grep java | grep Xms
 elastic+ 10045 1 2 Mar14 ? 00:56:32 /usr/share/elasticsearch/jdk/bin/java ... -Xms1937m -Xmx1937m ...

 2. Tuning principles for the ES cluster JVM
		 - The JVM heap should be half of the physical memory, but no more than 32GB;
		 - For example, on a 32GB machine the default of half (16GB) is fine, but a 128GB machine would also default to half (64GB), so in that case it must be set manually to 32GB;

 3. Set ES memory to 256Mb
 [root@elk1 ~]# vim /etc/elasticsearch/
 [root@elk1 ~]# egrep "^-Xm[s|x]" /etc/elasticsearch/
 -Xms256m
 -Xmx256m

 4. Copy the configuration file and scroll to restart the ES7 cluster
 [root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.92:/etc/elasticsearch/
                                                                                                                                                 100% 3474 2.7MB/s 00:00
 [root@elk1 ~]# scp /etc/elasticsearch/ 192.168.121.93:/etc/elasticsearch/


 [root@elk1 ~]# systemctl restart
 [root@elk2 ~]# systemctl restart
 [root@elk3 ~]# systemctl restart

 5. Test verification
 [root@elk1 ~]# free -h
                total used free shared buff/cache available
 Mem: 3.8Gi 1.1Gi 1.9Gi 1.0Mi 800Mi 2.4Gi
 Swap: 3.8Gi 26Mi 3.8Gi
 [root@elk1 ~]# ps -ef | grep java | grep Xms
 -Xms256m -Xmx256m


 curl -k -u elastic:123456 https://127.1:9200/_cat/nodes
 192.168.121.92 68 67 94 4.01 2.12 0.96 cdfhilmrstw * elk2
 192.168.121.91 59 56 42 1.72 0.87 0.43 cdfhilmrstw - elk1
 192.168.121.93 63 61 92 3.30 2.26 1.14 cdfhilmrstw - elk3
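
 # The effective heap limits can also be read back from the cluster itself (a sketch, reusing the credentials above):
 [root@elk1 ~]# curl -k -u elastic:123456 'https://127.1:9200/_cat/nodes?v&h=name,heap.max,heap.percent'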