Background description of functional realization
We already have two parameters to control long transactions:statement_timeout
cap (a poem)idle_in_transaction_session_timeout
. However, if the transaction executes a command that is short enough and not more thanstatement_timeout
and the pause between commands is appropriateidle_in_transaction_session_timeout
In other words, the transaction may continue indefinitely.
In this case.transaction_timeout
Ensures that the duration of the transaction does not exceed the specified timeout. If it is exceeded, the transaction and the session in which it was executed are terminated. As follows:
postgres=# select version();
version
------------------------------------------------------------------------------------------------------------
PostgreSQL 18devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-21), 64-bit
(1 row)
postgres=# show statement_timeout ;
statement_timeout
-------------------
0
(1 row)
postgres=# show transaction_timeout ;
transaction_timeout
---------------------
0
(1 row)
postgres=# set transaction_timeout = '10s';
SET
postgres=# begin ;
BEGIN
postgres=*# select pg_sleep(2);
pg_sleep
----------
(1 row)
postgres=*# 2024-09-28 19:28:30.891 PDT [45558] FATAL: terminating connection due to transaction timeout
postgres=*#
Then look at the process correlation, as follows:
As above.45875
The process died and the session was disconnected. Then as I continued executing on psql, a new session was created. I'll explain this in more detail later (you can also take a look at the image below to think about it first):
I was a little confused when I first saw this for the following reasons:
I was thinking it was likestatement_timeout
This way, there is no need to disconnect when the transaction times out, and the transaction fails (rollback is sufficient).
Then with that in mind, went and looked at the mailing list as follows:
- Transaction timeout, click to go to
Note 1: Interested parties can check the mailing list themselves
Note 2: Next we look together at thetransaction_timeout
and why it can't be implemented internally likestatement_timeout
This is how it's going to happen.
Function realization source code analysis
First look at the official documentation for the following explanation:
Terminates any session in a transaction that lasts longer than the specified time. This restriction applies both to explicit transactions (started with BEGIN) and to implicitly started transactions corresponding to a single statement.
If this value is specified with no units, it is in milliseconds. A value of zero (the default) will disable the timeout.
If transaction_timeout is shorter than or equal to idle_in_transaction_session_timeout or statement_timeout, the longer timeout is ignored.
It is not recommended to set transaction_timeout in as it affects all sessions.
This GUC parameter is defined, as follows:
// src/backend/utils/misc/guc_tables.c
{
{"transaction_timeout", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the maximum allowed duration of any transaction within a session (not a prepared transaction)."),
gettext_noop("A value of 0 turns off the timeout."),
GUC_UNIT_MS
},
&TransactionTimeout,
0, 0, INT_MAX,
NULL, assign_transaction_timeout, NULL
},
with idle_in_transaction_session_timeout
// src/backend/tcop/
...
/*
* (1) If we've reached idle state, tell the frontend we're ready for
* a new query.
*
* Note: this includes fflush()'ing the last of the prior output.
*
* This is also a good time to flush out collected statistics to the
* cumulative stats system, and to update the PS stats display. We
* avoid doing those every time through the message loop because it'd
* slow down processing of batched messages, and because we don't want
* to report uncommitted updates (that confuses autovacuum). The
* notification processor wants a call too, if we are not in a
* transaction block.
*
* Also, if an idle timeout is enabled, start the timer for that.
*/
if (send_ready_for_query)
{
if (IsAbortedTransactionBlockState())
{
set_ps_display("idle in transaction (aborted)");
pgstat_report_activity(STATE_IDLEINTRANSACTION_ABORTED, NULL);
/* Start the idle-in-transaction timer */
if (IdleInTransactionSessionTimeout > 0
&& (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
{
idle_in_transaction_timeout_enabled = true;
enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
IdleInTransactionSessionTimeout);
}
}
else if (IsTransactionOrTransactionBlock())
{
set_ps_display("idle in transaction");
pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);
/* Start the idle-in-transaction timer */
if (IdleInTransactionSessionTimeout > 0
&& (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
{
idle_in_transaction_timeout_enabled = true;
enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
IdleInTransactionSessionTimeout);
}
}
...
with statement_timeout
/*
* Start statement timeout timer, if enabled.
*
* If there's already a timeout running, don't restart the timer. That
* enables compromises between accuracy of timeouts and cost of starting a
* timeout.
*/
static void
enable_statement_timeout(void)
{
/* must be within an xact */
Assert(xact_started);
if (StatementTimeout > 0
&& (StatementTimeout < TransactionTimeout || TransactionTimeout == 0))
{
if (!get_timeout_active(STATEMENT_TIMEOUT))
enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);
}
else
{
if (get_timeout_active(STATEMENT_TIMEOUT))
disable_timeout(STATEMENT_TIMEOUT, false);
}
}
As above, whentransaction_timeout
Less than or equal toidle_in_transaction_session_timeout
maybestatement_timeout
, then the longer timeout is ignored.
transaction_timeout
// src/backend/utils/init/
void
InitPostgres(const char *in_dbname, Oid dboid,
const char *username, Oid useroid,
bits32 flags,
char *out_dbname)
{
...
if (!bootstrap)
{
RegisterTimeout(DEADLOCK_TIMEOUT, CheckDeadLockAlert);
RegisterTimeout(STATEMENT_TIMEOUT, StatementTimeoutHandler);
RegisterTimeout(LOCK_TIMEOUT, LockTimeoutHandler);
RegisterTimeout(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
IdleInTransactionSessionTimeoutHandler);
RegisterTimeout(TRANSACTION_TIMEOUT, TransactionTimeoutHandler); // here
RegisterTimeout(IDLE_SESSION_TIMEOUT, IdleSessionTimeoutHandler);
RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
RegisterTimeout(IDLE_STATS_UPDATE_TIMEOUT,
IdleStatsUpdateTimeoutHandler);
}
...
}
static void
TransactionTimeoutHandler(void)
{
TransactionTimeoutPending = true;
InterruptPending = true;
SetLatch(MyLatch);
}
Next, here's how to modify the source code to useShowTransactionState
function to print, as follows:
[postgres@localhost:~/test/bin]$ ./psql
INFO: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
psql (18devel)
Type "help" for help.
postgres=# set transaction_timeout = '10s';
INFO: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
INFO: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
SET
postgres=# begin;
INFO: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
BEGIN
postgres=*# commit;
INFO: CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
COMMIT
postgres=# begin;
INFO: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
BEGIN
postgres=*# select pg_sleep(20);
2024-09-28 20:42:58.376 PDT [62092] FATAL: terminating connection due to transaction timeout
2024-09-28 20:42:58.376 PDT [62092] STATEMENT: select pg_sleep(20);
FATAL: terminating connection due to transaction timeout
server closed the connection unexpectedly1
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: INFO: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
Succeeded.
postgres=#
If any of you are interested in parent-child transaction finite state machines, you can check out my previous blog as follows:
- PostgreSQL Learning and Knowledge Summary (145) | Deeper Understanding of PostgreSQL Databases - Using ShowTransactionState and Parent-Child Transaction Finite State Machines, click to go to
transaction_timeout
The timeout is enabled/disabled as follows:
// src/backend/access/transam/
/*
* StartTransaction
*/
static void
StartTransaction(void)
{
...
/* Schedule transaction timeout */
if (TransactionTimeout > 0)
enable_timeout_after(TRANSACTION_TIMEOUT, TransactionTimeout);
...
}
static void
CommitTransaction(void)
{
...
/* Disable transaction timeout */
if (TransactionTimeout > 0)
disable_timeout(TRANSACTION_TIMEOUT, false);
...
}
static void
PrepareTransaction(void)
{
...
/* Disable transaction timeout */
if (TransactionTimeout > 0)
disable_timeout(TRANSACTION_TIMEOUT, false);
...
}
static void
AbortTransaction(void)
{
...
/* Disable transaction timeout */
if (TransactionTimeout > 0)
disable_timeout(TRANSACTION_TIMEOUT, false);
...
}
Let's debug it.transaction_timeout
The first thing to look at in relation to theenable_timeout_after
The settings are as follows:
Note the two time values, and the core settings below:
where the first parameter:ITIMER_REAL
: In system-real time, it sends theSIGALRM
Signal. If you are interested in this function, you can read this old man's blog, we will not repeat it here:
- linux c setitimer usage description, click to go to the
Continue:
The function stack, at this point, is as follows:
TransactionTimeoutHandler()
handle_sig_alarm(int postgres_signal_arg)
wrapper_handler(int postgres_signal_arg)
.0!<signal handler called> (unknown origin:0)
.6!epoll_wait (unknown origin:0)
WaitEventSetWaitBlock(WaitEventSet * set, int cur_timeout, WaitEvent * occurred_events, int nevents)
WaitEventSetWait(WaitEventSet * set, long timeout, WaitEvent * occurred_events, int nevents, uint32 wait_event_info)
secure_read(Port * port, void * ptr, size_t len)
pq_recvbuf()
pq_getbyte()
SocketBackend(StringInfo inBuf)
ReadCommand(StringInfo inBuf)
PostgresMain(const char * dbname, const char * username)
BackendMain(char * startup_data, size_t startup_data_len)
postmaster_child_launch(BackendType child_type, char * startup_data, size_t startup_data_len, ClientSocket * client_sock)
BackendStartup(ClientSocket * client_sock)
ServerLoop()
PostmasterMain(int argc, char ** argv)
main(int argc, char ** argv)
as abovehandle_sig_alarm
is 14, which is what the above signalSIGALRM
The next error is reported as follows:
Because the error level reported here isfatal error - abort process
, the process exits as follows:
Signal processing during commissioning
Because the signal above isSIGALRM
If the timeout is sent as a signalSIGINT
for exampleStatementTimeoutHandler、LockTimeoutHandler
etc:
// src/backend/utils/init/
/*
* STATEMENT_TIMEOUT handler: trigger a query-cancel interrupt.
*/
static void
StatementTimeoutHandler(void)
{
int sig = SIGINT;
/*
* During authentication the timeout is used to deal with
* authentication_timeout - we want to quit in response to such timeouts.
*/
if (ClientAuthInProgress)
sig = SIGTERM;
#ifdef HAVE_SETSID
/* try to signal whole process group */
kill(-MyProcPid, sig);
#endif
kill(MyProcPid, sig);
}
/*
* LOCK_TIMEOUT handler: trigger a query-cancel interrupt.
*/
static void
LockTimeoutHandler(void)
{
#ifdef HAVE_SETSID
/* try to signal whole process group */
kill(-MyProcPid, SIGINT);
#endif
kill(MyProcPid, SIGINT);
}
Debugging is interrupted by these signals as follows:
These signals can be handled as follows, and will no longer affect gdb debugging, as follows:
If you are using vscode for debugging, you can set it up as follows:
Note: For more information on the handling and use of signals during debugging, you can look at Jianping's documentation as follows:
- /2022/11/postgresql-debug-checkpointer/
Summary analysis of legacy issues
With the above laid out, let's look at the first question: why can't that GUC parameter be implemented like thestatement_timeout
That way, as follows:
postgres=# set statement_timeout = '30s';
SET
postgres=# select pg_sleep(40);
2024-09-28 22:11:51.127 PDT [67675] ERROR: canceling statement due to statement timeout
2024-09-28 22:11:51.127 PDT [67675] STATEMENT: select pg_sleep(40);
ERROR: canceling statement due to statement timeout
postgres=#
postgres=# reset statement_timeout;
RESET
postgres=# show statement_timeout;
statement_timeout
-------------------
0
(1 row)
postgres=# select pg_sleep(40);
^C2024-09-28 22:12:11.129 PDT [67675] ERROR: canceling statement due to user request
2024-09-28 22:12:11.129 PDT [67675] STATEMENT: select pg_sleep(40);
Cancel request sent
ERROR: canceling statement due to user request
postgres=#
statement_timeout
Timeout, sendSIGINT
Like the following manual executionCtrl + C
And... Andtransaction_timeout
of the following:
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | t1 | table | postgres
(1 row)
postgres=# table t1;
id
----
(0 rows)
postgres=# set transaction_timeout = '30s';
SET
postgres=# begin ;
BEGIN
postgres=*# ^C
postgres=*# ^C
postgres=*# insert into t1 values (1);
INSERT 0 1
postgres=*# commit ;
COMMIT
postgres=# table t1 ;
id
----
1
(1 row)
postgres=#
It's like the analytics inside the mailing list:secure_read
() could not handle SIGINT signals, and there was no way to end the transaction by sending a SIGINT signal. The original author has since changed the implementation, so if you're interested, check out patch v4 and beyond!
Second question: when interacting with psql the old session breaks due to a transaction timeout, then how do I just create a new one? Below:
As above the psql process is still alive, the following is the RESTORE logic:
// src/bin/psql/
/* CheckConnection
*
* Verify that we still have a good connection to the backend, and if not,
* see if it can be restored.
*
* Returns true if either the connection was still there, or it could be
* restored successfully; false otherwise. If, however, there was no
* connection and the session is non-interactive, this will exit the program
* with a code of EXIT_BADCONN.
*/
static bool
CheckConnection(void);