
How to detect saturation before the user notices it (Odoo + PgBouncer + PostgreSQL)

December 1, 2025, by John Wolf

The difference between "it got slow" and "it was an incident" is when you find out.

If you wait for the user to complain, you're already late. The good news: in an Odoo → PgBouncer → PostgreSQL stack, saturation leaves clear traces minutes beforehand.

This post gives you a simple system: 4 early signals + practical thresholds + what to do based on the pattern.


1) Early signal #1: queue appears in PgBouncer

What to look for

In the PgBouncer admin console:

SHOW POOLS;

The two indicators that signal the problem are:

  • cl_waiting > 0 sustained

  • maxwait rising (how long the oldest client in the queue has been waiting)

Interpretation: your app is trying to start transactions, but PgBouncer cannot assign them a "server" connection quickly enough. That is "saturation" in real time, even if there are still no visible errors.

Useful threshold

  • Warning: cl_waiting > 0 for 60–120s

  • Critical: maxwait > 1s sustained (already affects UX); > 5s is an incident

Immediate action

  • If there is a queue: don't guess. Go to section 5 and classify the cause.
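
If you want to poll this from a script instead of by eye, the same admin console covers the whole queue view with two commands (run them one at a time; the comment lines are annotations, not part of the commands):

-- per-pool state: watch cl_waiting (clients queued) and maxwait / maxwait_us
SHOW POOLS;

-- per-client detail: which clients are in waiting state
SHOW CLIENTS;

maxwait is whole seconds; maxwait_us is the microsecond part, which is what you need for sub-second thresholds.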


2) Early signal #2: the percentage of long transactions is increasing

What to look for (PostgreSQL)

Quick query to see long sessions and transactions:

SELECT pid, usename, state, xact_start, query_start, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE datname = current_database()
ORDER BY xact_start NULLS LAST;

Warning signs

  • old xact_start (transactions > 60–120s during peak hours)

  • many sessions with wait_event_type = Lock

Interpretation: even if the CPU is "ok", long transactions hijack concurrency (and with PgBouncer, they hijack server connections).
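
When you see Lock waits, the next question is who is holding them. A minimal sketch (PostgreSQL 9.6 or later, where pg_blocking_pids() exists):

-- Sessions that are blocked right now, and the PIDs blocking them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       now() - xact_start    AS xact_age,
       state,
       left(query, 80)       AS query
FROM pg_stat_activity
WHERE datname = current_database()
  AND cardinality(pg_blocking_pids(pid)) > 0;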

Useful threshold

  • Warning: 1–3 transactions > 2 min during peak load hours

  • Critical: transactions > 5–10 min (almost always a blocker or a monster cron)


3) Early signal #3: throughput drops but demand does not (the "slow death")

What to look for

  • Requests/second (or jobs/second) vs latency

  • PgBouncer SHOW STATS; if you are collecting it (see the annotated columns after this list)

  • Odoo: p95/p99 latency by endpoint (login, listing, write, confirmations, reports)
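
On the PgBouncer side, SHOW STATS is what gives you throughput and wait averages per database; the columns worth collecting (names from recent PgBouncer versions, older releases expose fewer of them) are annotated below:

-- avg_query_count: queries per second over the last stats period (throughput)
-- avg_query_time:  average query duration, in microseconds
-- avg_wait_time:   time clients spent waiting for a server connection, in microseconds
SHOW STATS;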

Classic pattern before the incident

  • p95 rises slowly

  • p99 spikes first

  • throughput does not rise (or falls) even with normal traffic

Interpretation: you are no longer "scaling" with load. You are in contention.

Useful threshold

  • Warning: p99 > 2–3x your baseline

  • Critical: timeout errors or massive retries


4) Early signal #4: crons start to overlap (and no one is watching)

This is brutal in Odoo.

What to look for

  • duration of heavy crons

  • actual execution time vs expected

  • if they are overlapping (especially if you have max_cron_threads > 1)

Pattern before the incident

  • cron A takes longer → cron B starts anyway → both compete for locks and the DB

  • PgBouncer starts queuing

  • users notice slowness "in waves"

Useful threshold

  • Warning: a cron that repeatedly goes from X min to 2X min

  • Critical: backlog (crons do not finish before their next scheduled run)
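
A quick way to see backlog directly in the database is to list active crons whose scheduled time is already in the past. A minimal sketch, assuming the standard ir_cron table of recent Odoo versions (column names such as cron_name vary across versions; Odoo stores these timestamps in UTC):

-- Active crons that should already have run; a growing list here means backlog
SELECT id, cron_name, nextcall, interval_number, interval_type
FROM ir_cron
WHERE active
  AND nextcall < (now() AT TIME ZONE 'UTC') - interval '10 minutes'
ORDER BY nextcall;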


5) The key classification: 3 types of saturation (and what to do)

When you detect early saturation, classify it into one of these 3.

This avoids the typical mistake of "just increase pool_size and that's it".

Type A — Saturation by pool (config/concurrency)

Symptoms

  • cl_waiting rises

  • sv_idle ~ 0

  • Postgres is NOT at 100%

  • there are no major locks, just "a lot of movement"

Actions

  • carefully increase default_pool_size

  • add reserve_pool_size for spikes

  • check if max_client_conn or max_db_connections are limiting you
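
Before raising any of these values, confirm what you are actually running with; the PgBouncer admin console tells you (run the commands one at a time; the comment lines are annotations):

-- current values of default_pool_size, reserve_pool_size, max_client_conn, max_db_connections
SHOW CONFIG;

-- effective pool_size per database entry, plus current connection counts
SHOW DATABASES;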

Type B — Saturation by locks / long transactions

Symptoms

  • cl_waiting rises

  • maxwait rises sustained

  • in Postgres you see wait_event_type = Lock or very old xact_start

  • CPU not necessarily high (it's contention)

Actions

  • identify the long transaction (job/cron/user action)

  • shorten its duration: process in batches, commit per batch, avoid external I/O inside the transaction

  • add lock_timeout, statement_timeout, and idle_in_transaction_session_timeout (according to policy)
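
A minimal sketch of those guard-rails set at the role level; the role name odoo and the values are assumptions, so align them with your own policy (statement_timeout in particular will break long reports if set too low):

-- Abort statements that sit waiting on a lock for too long
ALTER ROLE odoo SET lock_timeout = '10s';
-- Kill sessions that hold a transaction open while doing nothing
ALTER ROLE odoo SET idle_in_transaction_session_timeout = '5min';
-- Optional hard cap on statement duration; size it for your slowest legitimate query
ALTER ROLE odoo SET statement_timeout = '120s';

These defaults only apply to new sessions, so PgBouncer's pooled server connections keep the old values until they are recycled.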

Type C — Saturation by resources (CPU/RAM/I/O)

Symptoms

  • CPU at 100% or high I/O wait

  • latency rises everywhere

  • PgBouncer may show a queue, but the root problem is the host/DB

Actions

  • optimize queries/indexes

  • reduce concurrency (workers/crons) to decrease contention

  • improve disk/IOPS

  • check for bloat/autovacuum if performance drops over time
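
For the bloat/autovacuum check, a quick first look tells you whether it is worth digging deeper; a minimal sketch:

-- Tables with many dead tuples and a stale (or missing) last autovacuum are bloat suspects
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;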


6) A minimum set of "fire-fighting" alerts

If you could only create 6 alerts, they would be these:

PgBouncer

  1. cl_waiting > 0 for 2 min

  2. maxwait > 1s for 2 min

PostgreSQL

  1. transactions > 2 min (count > N)

  2. sessions waiting for locks > N
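
Both PostgreSQL checks translate almost directly into queries that any alerting tool can run on a schedule; a minimal sketch (the 2-minute threshold is the one from above, N is yours to pick):

-- count of transactions open for more than 2 minutes
SELECT count(*) AS long_transactions
FROM pg_stat_activity
WHERE datname = current_database()
  AND pid <> pg_backend_pid()
  AND xact_start < now() - interval '2 minutes';

-- count of sessions currently waiting on a lock
SELECT count(*) AS lock_waiters
FROM pg_stat_activity
WHERE datname = current_database()
  AND wait_event_type = 'Lock';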

Odoo / app

  1. p99 latency > 2–3x baseline

  2. error rate (timeouts/5xx) > baseline


7) The trick that buys you time: "alert on trend", not on drop

Many monitor "CPU > 90%". That comes too late.

What buys you time is alerting on behavior change:

  • p99 rises 30–50% compared to the baseline

  • maxwait goes from 0 to 0.5s and keeps rising

  • crons start taking 2x as long

This happens before the user feels the pain.

Closing

If you want to detect saturation before the user does:

  • measure queue in PgBouncer,

  • measure long transactions and locks in Postgres,

  • measure p95/p99 in Odoo,

  • and monitor crons as if they were users (because they are, but more dangerous).

