
HiveServer2 File Descriptor Leak

2024-09-20 18:39:45

Symptom

A user reported that the number of file descriptors opened by hs2 keeps growing, even though the number of current hs2 connections is only in the single digits.

 

Investigation process

The first step is to find out which file descriptors the hs2 process is holding. Running the lsof command, lsof -p $pid, shows that the hs2 process does have a large number of descriptors open under the /data/emr/hive/tmp/operation_logs/ directory.
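The same check can be scripted when the lsof output is too long to read by eye. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/fd is readable by the current user; the function name count_fds_by_dir and its proc_root parameter are made up for illustration and are not part of any Hive tooling:

```python
import os
from collections import Counter


def count_fds_by_dir(pid, proc_root="/proc"):
    """Count a process's open file descriptors, grouped by parent directory.

    Hypothetical helper: proc_root exists only so the function can be
    pointed at a fake directory tree when testing.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            # Each entry in /proc/<pid>/fd is a symlink to the open target.
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("/"):
            # Regular file: group by the directory that contains it.
            counts[os.path.dirname(target)] += 1
        else:
            # Sockets, pipes and similar look like "socket:[12345]".
            counts[target.split(":", 1)[0]] += 1
    return counts
```

Calling count_fds_by_dir(pid).most_common(10) for the hs2 pid lists the ten directories holding the most descriptors, which should put the operation_logs subdirectories at the top if the leak is the one described here.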

I found a similar issue in JIRA: [HIVE-10970] Investigate HIVE-10453: HS2 leaking open file descriptors when using UDFs - ASF JIRA

But that scenario is an fd leak caused by a UDF, and the leaked descriptors there are under a different path, not the operation_logs directory. It does not seem to be the same problem.

Examining the source code, I found that the operation log has cleanup logic:
#cleanupOperationLog

My guess was one of two things: either the client session ended abnormally, so this method was never called, or the cleanup logic itself is flawed.

First, I went through the session-close logic, using a flame graph of the beeline client to find where the close starts:
#closeClientOperation

Here the client makes a Thrift RPC call; the corresponding server-side handler in the hs2 Thrift service is #CloseOperation.
Tracing that method eventually leads to #close, which is where the cleanupOperationLog method is called.

So it is possible that the operation logs were not cleaned up because the client session exited abnormally and this close path was never reached.
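The suspected mechanism can be illustrated with a toy model. This is not Hive's code: the Operation class, run_clients, and the file names below are hypothetical stand-ins, written in Python rather than Java, and they only assume what the trace above suggests, namely that the log file is released solely on the close path:

```python
import os


class Operation:
    """Toy stand-in for a server-side operation that owns a log file."""

    def __init__(self, log_dir, op_id):
        self.log_path = os.path.join(log_dir, op_id)
        # The log file is opened when the operation starts ...
        self.log_file = open(self.log_path, "w")

    def close(self):
        # ... and is only released if the close path is reached.
        self.cleanup_operation_log()

    def cleanup_operation_log(self):
        if not self.log_file.closed:
            self.log_file.close()
        if os.path.exists(self.log_path):
            os.remove(self.log_path)


def run_clients(log_dir, n_clean, n_abnormal):
    """Return the operations whose log file is still open afterwards."""
    ops = []
    for i in range(n_clean):
        op = Operation(log_dir, f"clean-{i}")
        op.close()  # the client sent its close request
        ops.append(op)
    for i in range(n_abnormal):
        # The client vanished: no close request, so close() never runs.
        ops.append(Operation(log_dir, f"abnormal-{i}"))
    return [op for op in ops if not op.log_file.closed]
```

In this model every operation from an abnormally terminated client keeps one descriptor and one file in the log directory, so the descriptor count grows with the number of abandoned operations rather than with the number of live connections, which matches the symptom.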

Next I checked the cleanupOperationLog logic itself for a bug. Using the git branch comparison feature in IntelliJ IDEA, I found that version 3.1 contains a committed fix for it:


[HIVE-18820] Operation doesn't always clean up log4j for operation log - ASF JIRA

 

Conclusion

  • When the client session exits abnormally, the operation logs are not cleaned up, similar to the scenario where the scratch dir is not cleaned up.
  • HIVE-18820 is a known community bug; consider merging its patch.