The difference between "it got slow" and "it was an incident" is when you find out.
If you wait for the user to complain, you're already late. The good news: in an Odoo → PgBouncer → PostgreSQL stack, saturation leaves clear traces minutes in advance.
This post gives you a simple system: 4 early signals + practical thresholds + what to do based on the pattern.
1) Early signal #1: queue appears in PgBouncer
What to look for
In the PgBouncer admin console:
SHOW POOLS;
The two indicators that signal the problem are:
cl_waiting > 0 sustained
maxwait rising (how long the oldest client has been waiting)
Interpretation: your app is trying to start transactions, but PgBouncer cannot assign them a "server" connection quickly enough. That is saturation in real time, even if there are no visible errors yet.
Useful threshold
Warning: cl_waiting > 0 for 60–120s
Critical: maxwait > 1s sustained (already affects UX); > 5s is an incident
Immediate action
If there is a queue: don't guess. Jump to section 5 and classify the cause.
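To see who is actually queuing, you can drill down in the same admin console. A minimal sketch (column sets and client states vary a bit by PgBouncer version): SHOW POOLS tells you which pools are queuing, SHOW CLIENTS lists the client connections (including the ones stuck waiting, per user/database), and SHOW SERVERS shows what the assigned server connections are busy with.
SHOW POOLS;
SHOW CLIENTS;
SHOW SERVERS;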
2) Early signal #2: the percentage of long transactions is increasing
What to look for (PostgreSQL)
Quick query to see long sessions and transactions:
SELECT pid, usename, state, xact_start, query_start,
       wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE datname = current_database()
ORDER BY xact_start NULLS LAST;
Warning signs
old xact_start (transactions > 60–120s during peak hours)
many sessions with wait_event_type = Lock
Interpretation: even if the CPU is "ok", long transactions hijack concurrency (and with PgBouncer, they hijack server connections).
Useful threshold
Warning: 1–3 transactions > 2 min during load hours
Critical: transactions > 5–10 min (almost always blocking/monster cron)
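When the query above shows wait_event_type = Lock, the next question is who is blocking whom. A minimal sketch using pg_blocking_pids() (available since PostgreSQL 9.6):
-- blocked sessions, who blocks them, and how long the transaction has been open
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       now() - xact_start AS xact_age,
       wait_event_type,
       wait_event,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0
ORDER BY xact_age DESC;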
3) Early signal #3: throughput drops but demand does not (the "slow death")
What to look for
Requests/second (or jobs/second) vs latency
PgBouncer SHOW STATS; (if you are collecting it)
Odoo: p95/p99 latency by endpoint (login, listing, write, confirmations, reports)
Classic pattern before the incident
p95 rises slowly
p99 spikes first
throughput does not rise (or falls) even with normal traffic
Interpretation: you are no longer "scaling" with load. You are in contention.
Useful threshold
Warning: p99 > 2–3x your baseline
Critical: timeout errors or massive retries
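If you are not collecting app-level latency yet, PgBouncer gives you a rough DB-side proxy. In recent PgBouncer versions, SHOW STATS reports per-database averages such as avg_query_time and avg_wait_time (in microseconds); avg_wait_time moving from ~0 to a steadily rising value is the same queue signal as maxwait, visible before users complain.
SHOW STATS;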
4) Early signal #4: crons start to overlap (and no one is watching)
This is brutal in Odoo.
What to look for
duration of heavy crons
actual execution time vs expected
whether they overlap (especially if you have max_cron_threads > 1)
Pattern before the incident
cron A takes longer → cron B starts anyway → both compete for locks and the DB
PgBouncer starts queuing
users notice slowness "in waves"
Useful threshold
Warning: cron that goes from X min to 2X min repeatedly
Critical: backlog (crons do not finish before their next run)
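A quick way to spot backlog from the database side is Odoo's ir_cron table: active jobs whose next scheduled run is already in the past are jobs that are not keeping up. A minimal sketch (nextcall and lastcall have been stable field names across Odoo versions, but check yours; Odoo stores datetimes as naive UTC, and the 10-minute margin is just an example):
-- active crons that are overdue by more than 10 minutes
SELECT id, nextcall, lastcall
FROM ir_cron
WHERE active
  AND nextcall < (now() AT TIME ZONE 'UTC') - interval '10 minutes'
ORDER BY nextcall;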
5) The key classification: 3 types of saturation (and what to do)
When you detect early saturation, classify it into one of these 3.
This avoids the typical mistake of "just increase pool_size and that's it".
Type A — Saturation by pool (config/concurrency)
Symptoms
cl_waiting rises
sv_idle ~ 0
Postgres is NOT at 100%
there are no major locks, just "a lot of movement"
Actions
carefully increase default_pool_size
add reserve_pool_size for spikes
check if max_client_conn or max_db_connections are limiting you
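Before touching anything, confirm what the current limits actually are, straight from the admin console: SHOW CONFIG lists the effective settings (default_pool_size, max_client_conn, max_db_connections, reserve_pool_size), SHOW DATABASES shows per-database pool sizes, and after editing pgbouncer.ini a RELOAD applies the change. A minimal sketch:
SHOW CONFIG;
SHOW DATABASES;
RELOAD;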
Type B — Saturation by locks / long transactions
Symptoms
cl_waiting rises
maxwait keeps rising
in Postgres you see wait_event_type = Lock or very old xact_start
CPU not necessarily high (it's contention)
Actions
identify the long transaction (job/cron/user action)
shorten its duration: batching, per-batch commits, avoid external I/O inside the transaction
add lock_timeout, statement_timeout, and idle_in_transaction_session_timeout (according to policy)
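As a concrete sketch of that last point, the timeouts can be scoped to the application role instead of the whole cluster, so a psql session or a migration script is not hit by the same limits. The role name odoo and the values below are placeholders, not a recommendation; in particular, be careful with statement_timeout in Odoo, since long reports and heavy crons can legitimately exceed it:
-- placeholders: adjust role name and values to your own policy
ALTER ROLE odoo SET idle_in_transaction_session_timeout = '5min';
ALTER ROLE odoo SET lock_timeout = '10s';
ALTER ROLE odoo SET statement_timeout = '60s';
-- new sessions pick these up; existing sessions keep their old settings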
Type C — Saturation by resources (CPU/RAM/I/O)
Symptoms
CPU at 100% or high I/O wait
latency rises everywhere
PgBouncer may show a queue, but the root problem is the host/DB
Actions
optimize queries/indexes
reduce concurrency (workers/crons) to decrease contention
improve disk/IOPS
check for bloat/autovacuum if performance drops over time
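For the bloat/autovacuum check, pg_stat_user_tables is usually enough to spot the usual suspects. A minimal sketch (dead-tuple counters are an approximation, not an exact bloat measure):
-- tables with the most dead tuples and when autovacuum last touched them
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / greatest(n_live_tup, 1), 1) AS dead_pct,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;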
6) A minimum set of "fire-fighting" alerts
If you could only create 6 alerts, they would be these:
PgBouncer
cl_waiting > 0 for 2 min
maxwait > 1s for 2 min
PostgreSQL
transactions > 2 min (count > N)
sessions waiting for locks > N
Odoo / app
p99 latency > 2–3x baseline
error rate (timeouts/5xx) > baseline
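The two PostgreSQL alerts fit in a single query that any monitoring agent can run on a schedule; a sketch, with the thresholds from above hard-coded as examples:
-- how many long transactions and how many sessions waiting on locks, right now
SELECT count(*) FILTER (WHERE xact_start < now() - interval '2 minutes') AS long_transactions,
       count(*) FILTER (WHERE wait_event_type = 'Lock') AS waiting_on_locks
FROM pg_stat_activity
WHERE datname = current_database()
  AND state <> 'idle';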
7) The trick that buys you time: "alert on trend", not on drop
Many monitor "CPU > 90%". That comes too late.
What buys you time is alerting on behavior change:
p99 rises 30–50% compared to the baseline
maxwait goes from 0 to 0.5s and keeps rising
crons start taking 2x as long
This happens before the user feels the pain.
Closing
If you want to detect saturation before the user does:
measure queue in PgBouncer,
measure long transactions and locks in Postgres,
measure p95/p99 in Odoo,
and monitor crons as if they were users (because they are, but more dangerous).