transaction_timeout: terminate the session when the transaction timeout is reached.

Background description of functional realization

We already have two parameters to control long transactions:statement_timeout cap (a poem)idle_in_transaction_session_timeout. However, if the transaction executes a command that is short enough and not more thanstatement_timeoutand the pause between commands is appropriateidle_in_transaction_session_timeoutIn other words, the transaction may continue indefinitely.

In this case.transaction_timeout Ensures that the duration of the transaction does not exceed the specified timeout. If it is exceeded, the transaction and the session in which it was executed are terminated. As follows:

postgres=# select version();
                                                  version                                                   
------------------------------------------------------------------------------------------------------------
 PostgreSQL 18devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-21), 64-bit
(1 row)

postgres=# show statement_timeout ;
 statement_timeout 
-------------------
 0
(1 row)

postgres=# show transaction_timeout ;
 transaction_timeout 
---------------------
 0
(1 row)

postgres=# set transaction_timeout = '10s';
SET
postgres=# begin ;
BEGIN
postgres=*# select pg_sleep(2);
 pg_sleep 
----------
 
(1 row)

postgres=*# 2024-09-28 19:28:30.891 PDT [45558] FATAL:  terminating connection due to transaction timeout
postgres=*#

Then look at the process correlation, as follows:

在这里插入图片描述

As above.45875The process died and the session was disconnected. Then as I continued executing on psql, a new session was created. I'll explain this in more detail later (you can also take a look at the image below to think about it first):

在这里插入图片描述

I was a little confused when I first saw this for the following reasons:

在这里插入图片描述

I was thinking it was likestatement_timeoutThis way, there is no need to disconnect when the transaction times out, and the transaction fails (rollback is sufficient).

Then with that in mind, went and looked at the mailing list as follows:

Transaction timeout, click to go to

在这里插入图片描述

Note 1: Interested parties can check the mailing list themselves
Note 2: Next we look together at thetransaction_timeoutand why it can't be implemented internally likestatement_timeoutThis is how it's going to happen.

Function realization source code analysis

First look at the official documentation for the following explanation:

Terminates any session in a transaction that lasts longer than the specified time. This restriction applies both to explicit transactions (started with BEGIN) and to implicitly started transactions corresponding to a single statement.
If this value is specified with no units, it is in milliseconds. A value of zero (the default) will disable the timeout.

If transaction_timeout is shorter than or equal to idle_in_transaction_session_timeout or statement_timeout, the longer timeout is ignored.
It is not recommended to set transaction_timeout in as it affects all sessions.

This GUC parameter is defined, as follows:

// src/backend/utils/misc/guc_tables.c

	{
		{"transaction_timeout", PGC_USERSET, CLIENT_CONN_STATEMENT,
			gettext_noop("Sets the maximum allowed duration of any transaction within a session (not a prepared transaction)."),
			gettext_noop("A value of 0 turns off the timeout."),
			GUC_UNIT_MS
		},
		&TransactionTimeout,
		0, 0, INT_MAX,
		NULL, assign_transaction_timeout, NULL
	},

with idle_in_transaction_session_timeout

// src/backend/tcop/

		...
		/*
		 * (1) If we've reached idle state, tell the frontend we're ready for
		 * a new query.
		 *
		 * Note: this includes fflush()'ing the last of the prior output.
		 *
		 * This is also a good time to flush out collected statistics to the
		 * cumulative stats system, and to update the PS stats display.  We
		 * avoid doing those every time through the message loop because it'd
		 * slow down processing of batched messages, and because we don't want
		 * to report uncommitted updates (that confuses autovacuum).  The
		 * notification processor wants a call too, if we are not in a
		 * transaction block.
		 *
		 * Also, if an idle timeout is enabled, start the timer for that.
		 */
		if (send_ready_for_query)
		{
			if (IsAbortedTransactionBlockState())
			{
				set_ps_display("idle in transaction (aborted)");
				pgstat_report_activity(STATE_IDLEINTRANSACTION_ABORTED, NULL);

				/* Start the idle-in-transaction timer */
				if (IdleInTransactionSessionTimeout > 0
					&& (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
				{
					idle_in_transaction_timeout_enabled = true;
					enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
										 IdleInTransactionSessionTimeout);
				}
			}
			else if (IsTransactionOrTransactionBlock())
			{
				set_ps_display("idle in transaction");
				pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);

				/* Start the idle-in-transaction timer */
				if (IdleInTransactionSessionTimeout > 0
					&& (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
				{
					idle_in_transaction_timeout_enabled = true;
					enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
										 IdleInTransactionSessionTimeout);
				}
			}
			...

with statement_timeout

/*
 * Start statement timeout timer, if enabled.
 *
 * If there's already a timeout running, don't restart the timer.  That
 * enables compromises between accuracy of timeouts and cost of starting a
 * timeout.
 */
static void
enable_statement_timeout(void)
{
	/* must be within an xact */
	Assert(xact_started);

	if (StatementTimeout > 0
		&& (StatementTimeout < TransactionTimeout || TransactionTimeout == 0))
	{
		if (!get_timeout_active(STATEMENT_TIMEOUT))
			enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);
	}
	else
	{
		if (get_timeout_active(STATEMENT_TIMEOUT))
			disable_timeout(STATEMENT_TIMEOUT, false);
	}
}

As above, whentransaction_timeout Less than or equal toidle_in_transaction_session_timeout maybestatement_timeout, then the longer timeout is ignored.

transaction_timeout

// src/backend/utils/init/

void
InitPostgres(const char *in_dbname, Oid dboid,
			 const char *username, Oid useroid,
			 bits32 flags,
			 char *out_dbname)
{
	...
	if (!bootstrap)
	{
		RegisterTimeout(DEADLOCK_TIMEOUT, CheckDeadLockAlert);
		RegisterTimeout(STATEMENT_TIMEOUT, StatementTimeoutHandler);
		RegisterTimeout(LOCK_TIMEOUT, LockTimeoutHandler);
		RegisterTimeout(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
						IdleInTransactionSessionTimeoutHandler);
		RegisterTimeout(TRANSACTION_TIMEOUT, TransactionTimeoutHandler); // here
		RegisterTimeout(IDLE_SESSION_TIMEOUT, IdleSessionTimeoutHandler);
		RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
		RegisterTimeout(IDLE_STATS_UPDATE_TIMEOUT,
						IdleStatsUpdateTimeoutHandler);
	}
	...
}

static void
TransactionTimeoutHandler(void)
{
	TransactionTimeoutPending = true;
	InterruptPending = true;
	SetLatch(MyLatch);
}

Next, here's how to modify the source code to useShowTransactionStatefunction to print, as follows:

[postgres@localhost:~/test/bin]$ ./psql 
INFO:  CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
psql (18devel)
Type "help" for help.

postgres=# set transaction_timeout = '10s';
INFO:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
INFO:  CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
SET
postgres=# begin;
INFO:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
BEGIN
postgres=*# commit;
INFO:  CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
COMMIT
postgres=# begin;
INFO:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
BEGIN
postgres=*# select pg_sleep(20);
2024-09-28 20:42:58.376 PDT [62092] FATAL:  terminating connection due to transaction timeout
2024-09-28 20:42:58.376 PDT [62092] STATEMENT:  select pg_sleep(20);
FATAL:  terminating connection due to transaction timeout
server closed the connection unexpectedly1
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: INFO:  CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
Succeeded.
postgres=#

If any of you are interested in parent-child transaction finite state machines, you can check out my previous blog as follows:

PostgreSQL Learning and Knowledge Summary (145) | Deeper Understanding of PostgreSQL Databases - Using ShowTransactionState and Parent-Child Transaction Finite State Machines, click to go to

transaction_timeoutThe timeout is enabled/disabled as follows:

// src/backend/access/transam/

/*
 *	StartTransaction
 */
static void
StartTransaction(void)
{
	...
	/* Schedule transaction timeout */
	if (TransactionTimeout > 0)
		enable_timeout_after(TRANSACTION_TIMEOUT, TransactionTimeout);
	...
}

static void
CommitTransaction(void)
{
	...
	/* Disable transaction timeout */
	if (TransactionTimeout > 0)
		disable_timeout(TRANSACTION_TIMEOUT, false);
	...
}

static void
PrepareTransaction(void)
{
	...
	/* Disable transaction timeout */
	if (TransactionTimeout > 0)
		disable_timeout(TRANSACTION_TIMEOUT, false);
	...
}

static void
AbortTransaction(void)
{
	...
	/* Disable transaction timeout */
	if (TransactionTimeout > 0)
		disable_timeout(TRANSACTION_TIMEOUT, false);
	...
}

Let's debug it.transaction_timeoutThe first thing to look at in relation to theenable_timeout_afterThe settings are as follows:

在这里插入图片描述

Note the two time values, and the core settings below:

在这里插入图片描述

where the first parameter:ITIMER_REAL: In system-real time, it sends theSIGALRMSignal. If you are interested in this function, you can read this old man's blog, we will not repeat it here:

linux c setitimer usage description, click to go to the

Continue:

在这里插入图片描述

The function stack, at this point, is as follows:

TransactionTimeoutHandler()
handle_sig_alarm(int postgres_signal_arg)
wrapper_handler(int postgres_signal_arg)
.0!<signal handler called> (unknown origin:0)
.6!epoll_wait (unknown origin:0)
WaitEventSetWaitBlock(WaitEventSet * set, int cur_timeout, WaitEvent * occurred_events, int nevents)
WaitEventSetWait(WaitEventSet * set, long timeout, WaitEvent * occurred_events, int nevents, uint32 wait_event_info)
secure_read(Port * port, void * ptr, size_t len)
pq_recvbuf()
pq_getbyte()
SocketBackend(StringInfo inBuf)
ReadCommand(StringInfo inBuf)
PostgresMain(const char * dbname, const char * username)
BackendMain(char * startup_data, size_t startup_data_len)
postmaster_child_launch(BackendType child_type, char * startup_data, size_t startup_data_len, ClientSocket * client_sock)
BackendStartup(ClientSocket * client_sock)
ServerLoop()
PostmasterMain(int argc, char ** argv)
main(int argc, char ** argv)

as abovehandle_sig_alarmis 14, which is what the above signalSIGALRM

The next error is reported as follows:

在这里插入图片描述

Because the error level reported here isfatal error - abort process, the process exits as follows:

在这里插入图片描述

Signal processing during commissioning

Because the signal above isSIGALRMIf the timeout is sent as a signalSIGINT for exampleStatementTimeoutHandler、LockTimeoutHandleretc:

// src/backend/utils/init/

/*
 * STATEMENT_TIMEOUT handler: trigger a query-cancel interrupt.
 */
static void
StatementTimeoutHandler(void)
{
	int			sig = SIGINT;

	/*
	 * During authentication the timeout is used to deal with
	 * authentication_timeout - we want to quit in response to such timeouts.
	 */
	if (ClientAuthInProgress)
		sig = SIGTERM;

#ifdef HAVE_SETSID
	/* try to signal whole process group */
	kill(-MyProcPid, sig);
#endif
	kill(MyProcPid, sig);
}

/*
 * LOCK_TIMEOUT handler: trigger a query-cancel interrupt.
 */
static void
LockTimeoutHandler(void)
{
#ifdef HAVE_SETSID
	/* try to signal whole process group */
	kill(-MyProcPid, SIGINT);
#endif
	kill(MyProcPid, SIGINT);
}

Debugging is interrupted by these signals as follows:

在这里插入图片描述

These signals can be handled as follows, and will no longer affect gdb debugging, as follows:

在这里插入图片描述

If you are using vscode for debugging, you can set it up as follows:

在这里插入图片描述

Note: For more information on the handling and use of signals during debugging, you can look at Jianping's documentation as follows:

/2022/11/postgresql-debug-checkpointer/

Summary analysis of legacy issues

With the above laid out, let's look at the first question: why can't that GUC parameter be implemented like thestatement_timeoutThat way, as follows:

postgres=# set statement_timeout = '30s';
SET
postgres=# select pg_sleep(40);
2024-09-28 22:11:51.127 PDT [67675] ERROR:  canceling statement due to statement timeout
2024-09-28 22:11:51.127 PDT [67675] STATEMENT:  select pg_sleep(40);
ERROR:  canceling statement due to statement timeout
postgres=# 
postgres=# reset statement_timeout;
RESET
postgres=# show statement_timeout;
 statement_timeout 
-------------------
 0
(1 row)

postgres=# select pg_sleep(40);
^C2024-09-28 22:12:11.129 PDT [67675] ERROR:  canceling statement due to user request
2024-09-28 22:12:11.129 PDT [67675] STATEMENT:  select pg_sleep(40);
Cancel request sent
ERROR:  canceling statement due to user request
postgres=#

statement_timeoutTimeout, sendSIGINT Like the following manual executionCtrl + CAnd... Andtransaction_timeoutof the following:

postgres=# \d
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | t1   | table | postgres
(1 row)

postgres=# table t1;
 id 
----
(0 rows)

postgres=# set transaction_timeout = '30s';
SET
postgres=# begin ;
BEGIN
postgres=*# ^C
postgres=*# ^C
postgres=*# insert into t1 values (1);
INSERT 0 1
postgres=*# commit ;
COMMIT
postgres=# table t1 ;
 id 
----
  1
(1 row)

postgres=#

It's like the analytics inside the mailing list:secure_read() could not handle SIGINT signals, and there was no way to end the transaction by sending a SIGINT signal. The original author has since changed the implementation, so if you're interested, check out patch v4 and beyond!

Second question: when interacting with psql the old session breaks due to a transaction timeout, then how do I just create a new one? Below:

在这里插入图片描述

As above the psql process is still alive, the following is the RESTORE logic:

// src/bin/psql/

/* CheckConnection
 *
 * Verify that we still have a good connection to the backend, and if not,
 * see if it can be restored.
 *
 * Returns true if either the connection was still there, or it could be
 * restored successfully; false otherwise.  If, however, there was no
 * connection and the session is non-interactive, this will exit the program
 * with a code of EXIT_BADCONN.
 */
static bool
CheckConnection(void);