Background of the incident
It started when a developer reported that the program was not working properly after a release and kept restarting.
At first I suspected a problem with the program itself, but the service logs showed no errors from the program.
The monitoring system then showed that CPU utilization on the server node had reached 100%, so it was no wonder the program could no longer run. Moreover, this was not limited to one node: several servers across the environment were at 100% CPU utilization.
I. Viewing the process
Running the top command shows that CPU utilization is maxed out, yet no abnormal process appears in the process list. A few business programs use a fair amount of CPU, but shutting them down does not improve the situation.
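When top shows a saturated CPU but no matching process, one common cross-check is to compare the PIDs visible under /proc with the PIDs that ps reports. The sketch below is only an illustration of that idea: the function name pids_missing_from and the temporary file paths are hypothetical, and this article does not establish how the virus actually hid its process. If the hiding is done at a lower level, /proc may be filtered as well and this check will find nothing.

```shell
# Print every PID that appears in the first file but not in the second.
# Both arguments are files with one PID per line.
# (pids_missing_from is a hypothetical helper name, not a standard tool.)
pids_missing_from() {
    # -v invert match, -x whole line, -F fixed strings, -f patterns from file.
    # grep exits non-zero when it prints nothing, so "|| true" keeps the
    # function's exit status clean for callers using "set -e".
    grep -vxFf "$2" "$1" || true
}

# Example of collecting the two lists on a live machine (not run here):
#   ls /proc | grep -E '^[0-9]+$' > /tmp/proc_pids
#   ps -e -o pid= | tr -d ' '     > /tmp/ps_pids
#   pids_missing_from /tmp/proc_pids /tmp/ps_pids
```

Short-lived processes can start or exit between the two snapshots, so a PID reported once is not proof of anything. A PID that keeps showing up across repeated runs is worth a closer look, for example with ls -l /proc/PID/exe.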
II. Viewing network access
At this point I suspected that the machines had been compromised, so I used the following command to check their network connections.
netstat -an |grep ESTABLISHED
After checking several machines, I found that every problematic machine had an external network connection like the one shown below.
tcp 0 0 10.12.15.7:39410 86.107.101.103:7643 ESTABLISHED
Although each machine was connected to a different external IP address, the remote port was always 7643, and an address lookup showed that all of the IPs were located abroad.
Since the servers in question have no overseas business, it was clear that they had been infected by a virus.
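Because the remote port was the one constant across machines, the check can be narrowed to that port instead of reading the full connection list by eye. The following is a minimal sketch, assuming netstat -an output in the usual six-column layout; the function name suspicious_connections is hypothetical.

```shell
# List the remote endpoints of ESTABLISHED TCP connections whose remote
# port matches the given port (default 7643, the port seen in this incident).
# Reads "netstat -an" style output on stdin.
# (suspicious_connections is a hypothetical helper name.)
suspicious_connections() {
    port="${1:-7643}"
    awk -v port="$port" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" {
            # Field 5 is the foreign address; the port is the last
            # colon-separated part.
            n = split($5, parts, ":")
            if (parts[n] == port) print $5
        }
    ' | sort -u
}

# Example on a live machine (not run here):
#   netstat -an | suspicious_connections 7643
```

Running this on each server in turn gives a quick list of which machines still hold a connection to the suspicious port.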
III. Viewing startup items
Use the following command to list the services that are enabled at startup.
systemctl list-unit-files |grep enabled
Among the startup items, I found a suspicious service (marked with an arrow below) that appeared to be the virus. (Note: the service name is random and differs from machine to machine, but it is recognizable as a meaningless string of mixed-case letters or digits.)
enabled
[email protected] enabled
enabled
enabled
enabled <-------
enabled
enabled
enabled
enabled
enabled
enabled
enabled
enabled
enabled
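Since the service name is described as a random string with mixed case or digits, a rough filter can shortlist candidates from a long unit list. This is only a heuristic sketch: the function name flag_random_services is hypothetical, and the rule will also flag legitimate services whose names happen to mix case or contain digits, so every hit still needs manual review.

```shell
# Shortlist enabled .service units whose base name is purely alphanumeric,
# contains a lowercase letter, and also contains an uppercase letter or digit.
# Reads "systemctl list-unit-files" style output on stdin.
# (flag_random_services is a hypothetical helper name.)
flag_random_services() {
    awk '
        $2 == "enabled" && $1 ~ /\.service$/ {
            name = $1
            sub(/\.service$/, "", name)   # drop the suffix before testing
            if (name ~ /^[A-Za-z0-9]+$/ && name ~ /[a-z]/ &&
                (name ~ /[A-Z]/ || name ~ /[0-9]/))
                print $1
        }
    '
}

# Example on a live machine (not run here):
#   systemctl list-unit-files | flag_random_services
```

Names containing hyphens, underscores, or @ are skipped on purpose, which removes most stock units from the output but also means a differently named variant could slip through.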
Use the following command to check the service's status and find where its startup file is stored.
systemctl status
Next, locate the startup file and view its contents.
$ cat /usr/lib/systemd/system/
[Unit]
Description=service
After=
[Service]
Type=simple
ExecStart=/bin/eWqAVtbn
RemainAfterExit=yes
Restart=always
RestartSec=60s
As you can see, the service runs the file /bin/eWqAVtbn at startup, which should be the virus executable.
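When several machines need checking, the executable path can be pulled out of the unit file rather than read by hand. A minimal sketch, assuming a simple single-line ExecStart= entry like the one above; the function name unit_exec_path is hypothetical.

```shell
# Print the executable path from each ExecStart= line of a unit file.
# (unit_exec_path is a hypothetical helper name.)
unit_exec_path() {
    # Strip the key and any systemd prefix characters (-, @, !, +, :),
    # then keep only the first word, which is the executable itself.
    sed -n 's/^ExecStart=[-@!+:]*//p' "$1" | awk '{print $1}'
}

# Example (not run here), with UNIT_FILE set to the unit file path:
#   unit_exec_path "$UNIT_FILE"
```

It can be worth recording details of the file before deleting it, for example with ls -l and sha256sum, so that the same binary can be recognized on other machines.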
IV. Removal of viruses
After finding the virus file, we can now start to remove the virus.
Stop the virus service
systemctl stop
systemctl disable
Delete the related virus files
rm /bin/eWqAVtbn # Delete executable files
rm /usr/lib/systemd/system/ # Delete startup files
When the deletion is complete, restart the server.
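Because the service and file names differ on every machine, it can help to generate the removal commands from the names found on that machine and review them before running anything. The sketch below only prints commands; it executes nothing. The function name print_cleanup_plan is hypothetical, the unit directory is assumed to be /usr/lib/systemd/system/ as on the machine above, and the systemctl daemon-reload step is an addition that is not part of the original procedure.

```shell
# Print (do not run) the removal commands for one infected machine.
#   $1: service name found in the startup items
#   $2: path of the executable named in ExecStart
# (print_cleanup_plan is a hypothetical helper name.)
print_cleanup_plan() {
    service="$1"
    binary="$2"
    # Assumption: the unit file lives in the same directory as in this article.
    unit="/usr/lib/systemd/system/$service"
    printf 'systemctl stop %s\n' "$service"
    printf 'systemctl disable %s\n' "$service"
    printf 'rm %s\n' "$binary"
    printf 'rm %s\n' "$unit"
    # Addition, not in the original steps: make systemd forget the unit.
    printf 'systemctl daemon-reload\n'
}

# Example (not run here), with SERVICE and BINARY set per machine:
#   print_cleanup_plan "$SERVICE" "$BINARY"
```

Once the printed commands look right for that machine, they can be run by hand, followed by the server restart described above.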
After completing the above steps, I checked the network connections again and the suspicious connection had disappeared. Server CPU utilization also returned to normal, confirming that the virus had been removed.
Summary
The virus is probably a mining virus, which uses the machine's resources for its own computation and therefore causes CPU usage to skyrocket. It is also fairly stealthy and has the following characteristics:
1. It hides its process so that it cannot be discovered with the top command.
2. It adds a startup item so that it still takes effect after the server is restarted.
3. Its file names are random and differ from machine to machine, which makes troubleshooting harder.
So far, the virus can be removed effectively with the method described in this document. No reinfection has been observed on machines that were cleaned this way.