An In-Depth Look at RDS for MySQL Audit Logging Features and Principles

This article was shared from Huawei Cloud CommunityIntroduction to RDS for MySQL Audit Log Functions., authored by GaussDB Database.

1. Background

In production environments, when database failures or problems occur, operations and maintenance personnel need to quickly locate abnormal or high-risk SQL statements. At this point, audit logs can provide detailed records to help track who performed each database operation, when it was performed, and the affected data objects, thus greatly accelerating the troubleshooting and recovery process.

MySQL Enterprise Edition provides an Audit Log plug-in that allows fine-grained auditing of database operations. The plugin supports logging important operations such as user login, query execution, data modification, and so on. However, in MySQL Community Edition, only the plug-in interface definition and functional description related to audit logging are provided, and the native audit logging functionality is not supported.

To make up for this lack of functionality, Huawei Cloud RDS for MySQL implements the MySQL audit log feature by integrating Percona's open source audit log plug-in. This feature has been opened in RDS for MySQL version 5.7 and RDS for MySQL version 8.0, which meets users' needs for database security auditing while enhancing database compliance and availability.

In this paper, we will take RDS for MySQL as the research object, and introduce the function and analyze the principle of audit log.

2. Description of functional parameters

When the audit logging feature is enabled on RDS for MySQL, users can view the variable names and parameter values related to the audit logging feature by using the SHOW variables LIKE 'audit%'; statement.

mysql> SHOW variables LIKE 'audit%';
+-------------------------------------+---------------+
| Variable_name                       | Value         |
+-------------------------------------+---------------+
| audit_log_anonymized_ip             |               |
| audit_log_buffer_size               | 1048576       |
| audit_log_csv2_escape               | OFF           |
| audit_log_csv2_old_separated_format | OFF           |
| audit_log_csv2_truncation           | ON            |
| audit_log_exclude_accounts          |               |
| audit_log_exclude_commands          |               |
| audit_log_exclude_databases         |               |
| audit_log_file                      |      |
| audit_log_flush                     | OFF           |
| audit_log_force_rotate              | OFF           |
| audit_log_format                    | CSV2          |
| audit_log_handler                   | FILE          |
| audit_log_include_accounts          |               |
| audit_log_include_commands          |               |
| audit_log_include_databases         |               |
| audit_log_policy                    | ALL           |
| audit_log_rotate_on_size            | 104857600     |
| audit_log_rotations                 | 50            |
| audit_log_strategy                  | ASYNCHRONOUS  |
| audit_log_syslog_facility           | LOG_USER      |
| audit_log_syslog_ident              | percona-audit |
| audit_log_syslog_priority           | LOG_INFO      |
+-------------------------------------+---------------+
23 rows in set (0.01 sec)

These parameters control the overall functionality of the Audit Log Plug-in and allow the user to flexibly adjust all aspects of the audit log. By setting and adjusting these parameters appropriately, the user can precisely determine the logging scope, logging level, storage method, etc. For example, you can determine the logging level of the audit log by audit_log_policy, and also adjust the audit log refresh strategy by changing audit_log_strategy.

In RDS for MySQL, some variables are not open for modification due to security compliance considerations. The following table describes the roles and default values of the variables in the audit log function.

Table 1 Description of audit log variables

3. Log content parsing

On RDS for MySQL, the default value of audit_log_policy is ALL. at this level, the audit log records database activities including DML (Data Manipulation Language), DDL (Data Definition Language), and DCL (Data Control Language) operations, as well as connecting or disconnecting. It is important to note that different types of activities contain different log fields. The following is a content analysis of audit logs generated from common DDL, DML, DCL operations and database connections and disconnections.

DDL, DML and DCL

For DDL, DML and DCL operations, the format of the generated audit logs is the same. For the specific meanings of the log fields are detailed below.

RECORD ID: Audit log unique ID, used to identify each audit log.
STATUS: Status code, non-zero means ERROR.
NAME: operation command classification, QUERY, EXECUTE, QUIT, CONNECT and so on.
TIMESTAMP: The timestamp at which the logging occurred.
COMMAND_CLASS: Records the type of DDL, DML, and DCL operations, SELECT, INSERT, DELETE, and so on.
SQLTEXT: The content of the executed SQL statement.
USER: The user name to connect to the database.
HOST: The name of the host to which the database is connected.
IP: IP address of the client connecting to the database.
DATABASE: The name of the database specified for the connection.

Connect and Disconnect

Connection or disconnection events are logged when a user logs in successfully or fails to log in. Unlike the audit logs generated by DDL and DDL operations, the audit logs generated by connection events have additional fields such as PRIV_USER, OS_LOGIN, and so on, and the following is an analysis of the audit logs generated by connecting or disconnecting.

RECORD ID: Audit log unique ID, used to identify each audit log.
STATUS: Status code, non-zero means ERROR.
NAME: operation command classification, QUERY, EXECUTE, QUIT, CONNECT and so on.
TIMESTAMP: the timestamp at which the logging occurred
USER: The user name to connect to the database.
PRIV_USER: Authenticated user name.
OS_LOGIN: external user name.
PROXY_USER: The proxy user name.
HOST: The name of the host to which the database is connected.
IP: IP address of the client connecting to the database.
DATABASE: The name of the database specified for the connection.

By parsing the contents of the audit logs, users can not only quickly view the activity status of the database at any time period, but also accurately understand the details of each SQL statement, including the user who executed it, the timestamp, the type of query, and other key information. Such detailed records provide strong support for security review, problem troubleshooting and performance optimization.

for MySQL Audit Logging Principles

The core of RDS for MySQL auditing function is to complete the logging of corresponding types of logs through different types of event-driven audit logging plug-ins. A total of two types of events are supported in RDS for MySQL, namely general events and connection events. General events can be understood as DDL, DML and DCL statements executed by users. Connection events, on the other hand, are connecting to the database (Connect) and disconnecting from the database (Disconnect). The relevant code for the Audit Log Plugin to support event definitions is as follows.

static int is_event_class_allowed_by_policy(mysql_event_class_t event_class,num audit_log_policy_t policy) {
  static unsigned int class_mask[] = {
      /* ALL */
      (1 << MYSQL_AUDIT_GENERAL_CLASS) | (1 << MYSQL_AUDIT_CONNECTION_CLASS),
      0,                                   /* NONE */
      (1 << MYSQL_AUDIT_CONNECTION_CLASS), /* LOGINS */
      (1 << MYSQL_AUDIT_GENERAL_CLASS),    /* QUERIES */
  };

  return (class_mask[policy] & (1 << event_class)) != 0;
}

When an auditable event occurs, the server invokes the relevant auditing interface in order to pass information about the event to the registered audit logging plug-in, ensuring that the auditing plug-in is able to receive and process the event if necessary.

The entry function for the audit logging functionality in the RDS for MySQL kernel is mysql_audit_notify. drives the audit logging plugin through the corresponding events. The main workflow call stack is shown below.

  do_command
    ->dispatch_commnad
        ->mysql_audit_notify
            ->event_class_dispatch
                  // Check if the current plugin needs to handle this event
                ->plugin_dispatch
                    // Dispatch audit tasks by event class
                    ->audit_log_notify

When the database kernel receives a request for SQL execution, it first processes the connection through the do_command function. After processing, the dispatch_command function distributes commands based on different SQL types. After that, the audit entry function mysql_audit_notify completes the preparation and validation of the audit log before recording. If the verification passes, the subsequent work will be done by other functions in the registered audit log plugin.

As can be seen through the function call stack, a high degree of decoupling between the audit logging functionality and the database kernel has been realized. The two are interfaced through pre-registered function interfaces, and this design improves the flexibility of future functionality extensions.

Audit log plug-in validation and resource preparation work by mysql_audit_acquire_plugins function to complete, when the function completes the validation, that is, the audit log plug-in has completed the registration and the relevant resources have been completed binding, the next will be set by the audit_log_notify function in accordance with the type of event and related parameters to complete the task of distribution. audit_log_notify will call audit_log_write function to complete the audit log write, audit_log_write will be based on the value of the audit_log_handler variable to determine whether to write the audit log file or the system log.

If you are writing to a log file, you will also determine the log writing strategy at this time; if audit_log_strategy is PERFORMANCE or ASYNCHRONOUS, you will call the audit_handle_file_write_buf function to write the log contents to the audit log plug-in's buffer, otherwise you will call the audit_handle_file_write_nobuf function to write the log contents directly to the operating system file cache.

If the value of audit_log_handler is SYSLOG, it means that the audit log is written directly to the syslog. Then the writing of the log to the system log will be accomplished by calling audit_handler_syslog_write. The internal function call flow of audit log is shown in the figure.

Figure 1 Audit log plug-in workflow diagram

As you can see from the figure, there are two main ways to drop the audit log, namely, completing the log drop through the file system and using the audit log swipe thread to continuously write the logs from the buffer to the disk. The buffer of the audit log is completed when the audit log plug-in is initialized to allocate the relevant resources, and its structure is as follows:

struct audit_log_buffer {
  // Buffer contents
  char *buf; // buffer size
  // Size of the buffer
  size_t size; // write log location
  // Write log position
  size_t write_pos; // write log position
  // write log position; // write_pos; // write_position
  size_t flush_pos;
  // Flush worker thread
  pthread_t flush_worker_thread; // Whether the buffer is paused or not.
  // Whether the buffer is paused.
  int stop; // if the buffer is paused; // if the buffer is full or not.
  // if the buffer is full; // if the log is dropped
  int drop_if_full;
  void *write_func_data; void *write_func_data.
  audit_log_write_func write_func;
  int drop_if_full; void *write_func_data
  mysql_cond_t flushed_cond; mysql_cond_t flushed_cond; mysql_cond_t flushed_cond
  mysql_cond_t flushed_cond; mysql_cond_t written_cond; mysql_record_state_t
  log_record_state_t state; log_record_state_t state; log_record_state_t state
};

As you can see from the audit log buffer structure, the log buffer accomplishes its core functions mainly through the log write function and the log drop thread.

When the log needs to be written to a buffer, the length of the log and the size of the buffer are first compared. If the length of the audit log is greater than the size of the log buffer, and the buffer is full and you choose not to discard the log, the audit log drop thread suspends its work and bypasses the log's buffer and writes directly to the file buffer. When the length of the audit log is smaller than the size of the log buffer, the contents of the log are copied to the file buffer, and the buffer write_pos and other relevant parameters are updated, waiting for the log drop thread to work. If the current write position is more than half the size of the entire buffer, it will immediately notify the drop thread to complete the drop of the audit log.

Audit log dumping is mainly done by the log dumping worker thread. If the log buffer is not closed and there are still logs in the buffer that have not yet been dropped, the log drop function will be called cyclically to write the log.

static void *audit_log_flush_worker(void *arg) {
  audit_log_buffer_t *log = (audit_log_buffer_t *)arg;
  // Thread initialization
  my_thread_init();
  // If the log buffer is not closed and there are currently logs left on the disk
  while (! (log->stop && log->flush_pos == log->write_pos)) {
    // Do the log flushing
    audit_log_flush(log);
  }
  // Close the thread
  my_thread_end();

  return nullptr.
}

For the log drop function, will be through the loop to determine write_pos and flush_pos is equal, if the two are equal and the log buffer does not stop working, then it will wait for 1 second to enter the loop; if the two are not equal, it means that there are new logs in the buffer need to drop disk.

At this point, if write_pos is larger than the buffer size, the log plug-in will drop all the logs after flush_pos and set the status of the current log to LOG_RECORD_INCOMPLETE. if write_pos is within the buffer size, it will write the complete log to the audit log file and set the status of the log to LOG_RECORD_COMPLETE. LOG_RECORD_COMPLETE. the values of the buffer-related variables are synchronously updated after the log drop operation is completed, and the log is prepared for the next log drop.

5. Instructions for use

1) Turn on audit logging for database instances

Log in to the Management Console, in the "Instance Management" page of Cloud RDS for MySQL, click the target instance name to enter the basic information page. In the left navigation bar, click "SQL Audit", click the "Enable" button on the right side of "SQL Audit", and then click "OK" in the pop-up box to turn on the audit log switch.

Figure 2 Setting the Audit Log Function

2) Download of audit logs

With audit logging turned on, database related activities are recorded in OBS in the form of logs, and users can download audit logs from the console interface.

Figure 3 Console interface to download audit logs

6. Summary

The audit logging feature of RDS for MySQL plays an important role in user activity monitoring, privilege change tracking and performance optimization. It not only helps organizations improve overall database security and meet increasingly stringent compliance requirements, but also provides valuable information during troubleshooting. This feature accurately records user database operations, helps identify potential security threats, and provides detailed data for performance bottleneck analysis and optimization.

Huawei Developer Space, which gathers development resources and tools from various root technologies such as Hongmeng, Rise, Kunpeng, GaussDB, and Euler, is committed to providing each developer with a cloud host, a set of development tools, and storage space on the cloud, so that developers can innovate based on Huawei's root ecosystem.

Click on the link to get your exclusive cloud hosting for free~!