Redshift has a limit of 500 connections. If we use Redshift as a data source of a user-facing application, we can easily exceed that limit. In this case, we end up with a terrible user experience and hard to debug problems (I bet that exceeded connection limit is not going to be the first possible cause we will look at while debugging).
Table of Contents
In this article, I show how to find idle Redshift sessions and how to terminate them. First, I will show you the full query that returns only the idle sessions. In the second part of the article, I will explain it step-by-step.
How to get idle user sessions
SELECT
sessions.process AS process_id,
trim(sessions.user_name) AS user_name,
sessions.starttime AS session_start_time,
datediff(s, sessions.starttime, sysdate) AS session_duration_is_seconds,
queries.last_end AS last_query_end_time,
datediff(s, CASE WHEN queries.last_end IS NOT NULL
THEN queries.last_end
ELSE sessions.starttime END
, sysdate) AS idle_duration_in_seconds
FROM
(SELECT starttime, process, u.usesysid, user_name
FROM stv_sessions s, pg_user u
WHERE s.user_name = u.usename
AND u.usesysid>1
AND process NOT IN (
SELECT pid FROM stv_inflight WHERE userid>1
UNION SELECT pid FROM stv_recents WHERE status != 'Done' AND userid>1
)) sessions
LEFT OUTER JOIN (
SELECT userid, pid, max(endtime) AS last_end FROM svl_statementtext
WHERE userid>1 AND sequence=0
GROUP BY 1,2) queries
ON sessions.usesysid = queries.userid AND sessions.process = queries.pid
WHERE (queries.last_end > sessions.starttime OR queries.last_end IS NULL)
ORDER BY idle_duration_in_seconds DESC;
How to terminate a session
To terminate an active user session, we need the process_id that identifies the session. When we have the process id, we can use the pg_terminate_backend( process_id )
function to disconnect the user from Redshift.
Want to build AI systems that actually work?
Download my expert-crafted GenAI Transformation Guide for Data Teams and discover how to properly measure AI performance, set up guardrails, and continuously improve your AI solutions like the pros.
How does it work?
Find active sessions
We start by retrieving all active user sessions. For that, we can use the stv_sessions
Redshift table:
SELECT starttime, process, u.usesysid, user_name
FROM stv_sessions s, pg_user u
WHERE s.user_name = u.usename
AND u.usesysid > 1
Find active queries
We are not interested in sessions of users who are currently executing a query, so we filter out in-flight queries and recently finished queries. Hence, the process NOT IN
statement that removes the sessions found by this query:
SELECT pid FROM stv_inflight WHERE userid>1
UNION
SELECT pid FROM stv_recents WHERE status != 'Done' AND userid>1
Which session is idle?
To figure out which sessions are idle, we need the end time of the last statement executed using those sessions. We can get this information from the slv_statementtext
table:
SELECT userid, pid, max(endtime) AS last_end
FROM svl_statementtext
WHERE userid>1 AND sequence=0
GROUP BY 1,2