PostgreSQL 18 VACUUM and Asynchronous I/O (AIO): Reducing Bloat in Large Tables with Autovacuum Tuning

If you're running a production PostgreSQL database and suddenly queries are slowing down, disk usage keeps climbing, but the actual data hasn't grown nearly as much — it's almost certainly a bloat problem. I spent a long time puzzling over "why is this table so huge?" myself, but it's a natural consequence of PostgreSQL's MVCC architecture, and it can be kept under control with the right configuration.

PostgreSQL 18, officially released in September 2025, introduces a new asynchronous I/O (AIO) subsystem that delivers 2–3× throughput improvements on cold-cache sequential scans in cloud block storage environments. Since VACUUM benefits directly from AIO, dead tuples can be reclaimed far more quickly before bloat has a chance to accumulate. This article covers everything from how PG 18 AIO accelerates VACUUM, to autovacuum tuning settings and bloat removal strategies you can apply in production today — in order: diagnose → tune → monitor → remove.

Core Concepts

Why PostgreSQL Creates Bloat — The Structural Nature of MVCC

PostgreSQL does not physically delete existing rows immediately when an UPDATE or DELETE occurs. To guarantee transaction isolation, it simply marks the row as "no longer valid" (a dead tuple) while the space remains occupied. When these dead tuples accumulate, the table file grows larger than necessary — this is called bloat.

MVCC (Multi-Version Concurrency Control): A concurrency control mechanism that maintains multiple versions of data simultaneously so that reads and writes do not block each other. Adopted by most modern RDBMSes including PostgreSQL, MySQL InnoDB, and Oracle.

VACUUM reclaims these dead tuples so that subsequent INSERTs can reuse the freed space. However, a regular VACUUM does not return space to the OS — it only marks it as reusable internally — so the file size itself remains unchanged. Tools like VACUUM FULL or pg_repack are needed to return space to the OS.

The practical impact of bloat is more serious than you might expect:

Sequential scans must read all the useless pages too
Indexes also grow inefficiently by retaining pointers to dead tuples
shared_buffers cache efficiency degrades

PostgreSQL 18 Asynchronous I/O (AIO) Architecture

Prior to PostgreSQL 18 (version 17 and below), I/O was handled synchronously: issue a read request → wait for completion → issue the next request. The higher the storage latency, the more inefficient this became — a problem that was especially pronounced with cloud network storage (EBS, Azure Managed Disk, GCP PD).

PG 18 supports three I/O modes via the io_method parameter:

io_method	Behavior	Notes
`sync`	Legacy synchronous mode (same as PG 17 and below)	For backward compatibility
`worker`	Background I/O worker process queue-based	Default
`io_uring`	Direct use of Linux kernel ring buffer	Linux 5.1+, best performance

worker mode works by having backend processes register read requests in a shared memory queue, which separate I/O worker processes handle asynchronously, returning results to a completion queue. Request submission and completion waiting overlap, increasing overall throughput.

io_uring mode goes a step further — there are no worker processes at all; PostgreSQL and the Linux kernel share a ring buffer (a memory region shared between kernel and user space) directly. This architecture significantly reduces the number of kernel-mode ↔ user-mode transitions (context switches), yielding better performance than worker mode on NVMe SSDs and cloud storage.

io_uring: A high-performance asynchronous I/O interface introduced in Linux 5.1. By sharing a ring buffer between the kernel and user space, it reduces the number of system calls themselves. However, seccomp (security filter based on an allowed list of system calls) or AppArmor (access control framework) policies in container environments may block io_uring syscalls, so prior verification is required in containerized environments.

In PG 18, the default value of effective_io_concurrency has also been raised from 1 to 16. This value controls the number of parallel read requests that can be queued simultaneously in worker/io_uring mode, and affects all asynchronous I/O operations including VACUUM.

How AIO Works Within VACUUM

With the legacy synchronous approach, VACUUM scanned an entire table by reading one heap page → waiting for completion → reading the next page → waiting for completion.

In PG 18, based on the maintenance_io_concurrency setting (the number of concurrent I/O operations for maintenance tasks, default 16), multiple page read requests are queued simultaneously, and the next requests proceed concurrently while already-fetched pages are being processed. This "pipelining" effect is the key driver of VACUUM's speed improvement. io_combine_limit is the parameter that caps the maximum size when merging multiple I/O requests into a single larger request during this process (default 128kB).

Practical Application

Example 1: Assessing Current Bloat — Which Tables to Address First

Honestly, my first instinct was "can't I just crank up the global autovacuum settings?" — but since update patterns vary so much per table, per-table configuration turned out to be far more effective. Before tuning, the first step is identifying which tables are the problem.

sql

-- List tables with a high dead tuple ratio
SELECT
    schemaname,
    tablename,
    n_dead_tup,
    n_live_tup,
    ROUND(
        n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100,
        2
    ) AS dead_ratio,
    last_autovacuum,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY dead_ratio DESC;

If you see tables with a dead_ratio above 20%, it's time to tune autovacuum. Above 30–40%, it's worth considering pg_repack.

For a more precise analysis, you can use the pgstattuple extension:

sql

-- Install pgstattuple (contrib extension)
CREATE EXTENSION IF NOT EXISTS pgstattuple;
 
-- Detailed bloat analysis for the orders table
SELECT * FROM pgstattuple('orders');

Looking at dead_tuple_percent and tuple_percent (the live tuple ratio) together in the output lets you quantify the severity of bloat.

Caution: pgstattuple performs a full sequential scan of the table. Running it on a table with hundreds of millions of rows during peak hours can cause load spikes — it is recommended to run it once a week during off-peak hours.

Example 2: Autovacuum Tuning for Large Tables

The PG default (autovacuum_vacuum_scale_factor = 0.2) means that on a table with 1 million rows, as many as 200,000 dead tuples must accumulate before vacuum is triggered. When autovacuum fires is determined by the following formula:

trigger condition = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × n_live_tup

With the default threshold (50) + 0.2 × 1,000,000 = 200,050 dead tuples required to trigger vacuum, bloat accumulates significantly. This is excessively conservative for modern NVMe SSD environments.

Global settings can be changed in postgresql.conf:

ini

# postgresql.conf — recommended settings for SSD environments
autovacuum_vacuum_cost_delay = 2        # default 2ms; 0–2ms on SSD
autovacuum_vacuum_cost_limit = 3000     # default 200 → 3000 recommended for SSD
autovacuum_max_workers = 5              # default 3 → increase for more parallelism
autovacuum_vacuum_scale_factor = 0.05   # default 0.2 → lower to 5%

However, a far more effective approach than global settings is applying per-table configuration to frequently updated tables. These take effect immediately without a server restart:

sql

-- Apply per-table settings to a frequently updated table (e.g., orders)
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,   -- trigger vacuum immediately at 1% dead tuples
    autovacuum_vacuum_threshold = 1000,       -- only when at least 1000 (added to scale_factor)
    autovacuum_vacuum_cost_delay = 0,         -- critical table: no I/O throttling delay
    autovacuum_analyze_scale_factor = 0.005   -- also refresh statistics more frequently
);
 
-- Verify the settings
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'orders';

autovacuum_vacuum_threshold = 1000 fills the threshold slot in the formula above. The structure is "run when 1000 + 0.01 × live_tup or more dead tuples exist," so vacuum will not trigger unless there are at least 1,000 dead tuples regardless of row count. This prevents excessive vacuuming on small tables while making large tables react much faster.

Example 3: PG 18 AIO Configuration and VACUUM I/O Monitoring

You can start by checking the current io_method in your environment:

sql

-- Check current io_method
SHOW io_method;
 
-- Check effective_io_concurrency (default 16)
SHOW effective_io_concurrency;

The configuration to enable io_uring in postgresql.conf looks like this:

ini

# postgresql.conf
# Enable io_uring on Linux 5.1+ environments
io_method = io_uring
 
# Number of concurrent I/O operations for maintenance tasks like VACUUM (controls AIO pipelining)
maintenance_io_concurrency = 16
 
# Number of prefetch read requests (only meaningful in worker/io_uring mode)
effective_io_concurrency = 16
 
# Maximum size limit for combined I/O requests
io_combine_limit = 128kB

The new pg_aios view added in PG 18 lets you query the asynchronous I/O pipeline status in real time:

sql

-- Real-time AIO pipeline query (new view in PG 18)
SELECT * FROM pg_aios;

pg_aios shows a list of asynchronous I/O requests currently being processed, including I/O worker ID, request type (read/write), processing mode, and target file information. If you query this from another session while VACUUM is running on a table with hundreds of millions of rows, you'll see output like this:

 worker_id | handle_type | mode | pending | ...
-----------+-------------+------+---------+----
         1 | read        | aio  | t       | ...
         2 | read        | aio  | t       | ...
         3 | read        | aio  | t       | ...

On small tables, VACUUM may finish so quickly that pg_aios always appears empty. This simply means AIO is processing that efficiently — you need a truly large table with hundreds of millions of rows to observe the pipelining state in action.

Parameter	Default	Role
`io_method`	`worker`	I/O mode selection (sync/worker/io_uring)
`effective_io_concurrency`	`16`	Number of concurrent parallel read requests
`maintenance_io_concurrency`	`16`	Number of concurrent I/O operations for maintenance tasks like VACUUM
`io_combine_limit`	`128kB`	Maximum size limit for combined I/O requests

Example 4: Online Bloat Removal with pg_repack

Even after autovacuum reclaims dead tuples, the file size does not shrink. Returning disk space to the OS requires rewriting the table from scratch, but VACUUM FULL requires an exclusive lock, making it impractical for 24/7 production environments. This is where pg_repack comes to the rescue.

bash

# Install pg_repack (Ubuntu/Debian example)
# Install the version matching your running PostgreSQL major version
apt-get install postgresql-18-repack
 
# Online table reorganization without an exclusive lock
pg_repack -h localhost -U postgres -d mydb -t orders
 
# Reorganize only indexes (leave the table as-is)
pg_repack -h localhost -U postgres -d mydb --index orders_pkey
 
# Reorganize the entire database (may take a long time)
pg_repack -h localhost -U postgres -d mydb

Understanding how pg_repack works makes it easier to use with confidence:

Start building a new copy while keeping the original table in service
Record any changes that occur during the copy into a separate log table
Apply the log to the new table once copying is complete
Swap old and new tables with a brief lock of under one second
Rebuild existing indexes concurrently as well

The Most Common Mistakes in Practice

Lowering only the global scale_factor without per-table settings: Even if you lower the global value to 0.05, small tables may get vacuumed far too often, saturating workers. Applying per-table settings via ALTER TABLE to your critical tables is far more precise.
Raising cost_limit too aggressively in one shot: I once jumped cost_limit from 200 to 3000 in a single change and watched overnight batch jobs slow down by 30 seconds each. It's recommended to increase it gradually — 500 → 1000 → 2000 → 3000 — while monitoring query latency at each step.
Tuning without monitoring bloat: Changing autovacuum settings is meaningless if you never verify whether they're having an effect. Periodically checking the last_autovacuum timestamp and the trend in n_dead_tup will make the before-and-after difference immediately visible.

Pros and Cons

Advantages

The most direct effect of introducing AIO is faster VACUUM. According to PostgreSQL HTX benchmarks, 2–3× throughput improvements on cold-cache sequential scans have been reported in cloud block storage environments. Even on local NVMe, improvements of 20–35% have been observed. The reason the effect is more pronounced on network storage like EBS, Azure Managed Disk, and GCP PD is that higher latency means greater gains from asynchronous pipelining.

Faster VACUUM naturally suppresses dead tuple accumulation, and the pg_aios view enables real-time visibility into the I/O pipeline — improving observability into VACUUM internals that were previously a black box. Combined with pg_repack, disk space can be returned to the OS with under one second of locking, and per-table fine-grained control via ALTER TABLE ... SET (autovacuum_*) takes effect immediately without a server restart.

Disadvantages and Caveats

Item	Details	Mitigation
io_uring environment requirements	Linux 5.1+ required; Windows/macOS not supported	Use worker mode on those platforms
Container security policies	io_uring syscalls may be blocked by seccomp/AppArmor	Fall back to worker mode or modify policy
Limited gains on local NVMe	~20–35% improvement vs. cloud (lower latency means smaller relative gain)	Still an improvement; application is recommended
Increasing vacuum cost_limit	Higher I/O usage → possible resource contention with production queries	Raise gradually with concurrent monitoring
Increasing max_workers	Proportional increase in shared_buffers and memory usage	Verify available server memory before applying
pgstattuple load	Full sequential table scan → unsuitable for real-time monitoring on large tables	Run once per week during off-peak hours

autovacuum_vacuum_cost_limit: The cap on "cost tokens" that autovacuum is allowed to spend on I/O. The default of 200 is a legacy of the HDD era; in SSD/NVMe environments, raising it to the 1000–3000 range has little impact on production query performance.

Comparison of Bloat Removal Methods

Method	Lock Level	Space Returned	Service Interruption	Recommended When
VACUUM	None	No (internal reuse only)	None	Routine maintenance
VACUUM FULL	Exclusive lock	Yes (returned to OS)	Yes	When a maintenance window is available
pg_repack	~1 second lock	Yes (returned to OS)	Minimal	24/7 production environments
pg_squeeze	~1 second lock	Yes (returned to OS)	Minimal	Alternative to pg_repack

Closing Thoughts

The real payoff of applying this combination in practice is the moment you see last_autovacuum timestamps appearing far more frequently than before. The combination of PostgreSQL 18 AIO and autovacuum tuning is a proven approach that structurally suppresses bloat in large tables through nothing more than configuration changes.

Three steps you can start right now:

Start by assessing your current bloat situation — run the pg_stat_user_tables query introduced earlier to pull a list of tables with high dead_ratio. This can be run immediately regardless of PG version.
Apply per-table autovacuum settings to your most frequently updated tables — something like ALTER TABLE [tablename] SET (autovacuum_vacuum_scale_factor = 0.01, autovacuum_vacuum_cost_delay = 0); can be applied instantly without downtime. After applying, compare the last_autovacuum timestamps and the n_dead_tup trend after 1–2 days. Users on PG 17 and below can see immediate results from steps 1 and 2 alone.
If you're planning to upgrade to PostgreSQL 18, start planning your io_method configuration — io_uring for Linux 5.1+ environments, and the default worker mode is sufficient for everything else. Right after upgrading, use the pg_aios view to verify that the AIO pipeline is actually running.

References

#PostgreSQL18#VACUUM#autovacuum#비동기IO#io_uring#MVCC#bloat#pg_repack#pgstattuple#dead-tuple

PostgreSQL 18 VACUUM and Asynchronous I/O (AIO): Reducing Bloat in Large Tables with Autovacuum Tuning | DEV BAK - 기술블로그

Database

PostgreSQL 18 VACUUM and Asynchronous I/O (AIO): Reducing Bloat in Large Tables with Autovacuum Tuning

Core Concepts

Why PostgreSQL Creates Bloat — The Structural Nature of MVCC

MVCC (Multi-Version Concurrency Control): A concurrency control mechanism that maintains multiple versions of data simultaneously so that reads and writes do not block each other. Adopted by most modern RDBMSes including PostgreSQL, MySQL InnoDB, and Oracle.

The practical impact of bloat is more serious than you might expect:

Sequential scans must read all the useless pages too
Indexes also grow inefficiently by retaining pointers to dead tuples
shared_buffers cache efficiency degrades

PostgreSQL 18 Asynchronous I/O (AIO) Architecture

PG 18 supports three I/O modes via the io_method parameter:

io_method	Behavior	Notes
`sync`	Legacy synchronous mode (same as PG 17 and below)	For backward compatibility
`worker`	Background I/O worker process queue-based	Default
`io_uring`	Direct use of Linux kernel ring buffer	Linux 5.1+, best performance

io_uring: A high-performance asynchronous I/O interface introduced in Linux 5.1. By sharing a ring buffer between the kernel and user space, it reduces the number of system calls themselves. However, seccomp (security filter based on an allowed list of system calls) or AppArmor (access control framework) policies in container environments may block io_uring syscalls, so prior verification is required in containerized environments.

How AIO Works Within VACUUM

With the legacy synchronous approach, VACUUM scanned an entire table by reading one heap page → waiting for completion → reading the next page → waiting for completion.

Practical Application

Example 1: Assessing Current Bloat — Which Tables to Address First

sql

-- List tables with a high dead tuple ratio
SELECT
    schemaname,
    tablename,
    n_dead_tup,
    n_live_tup,
    ROUND(
        n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100,
        2
    ) AS dead_ratio,
    last_autovacuum,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY dead_ratio DESC;

If you see tables with a dead_ratio above 20%, it's time to tune autovacuum. Above 30–40%, it's worth considering pg_repack.

For a more precise analysis, you can use the pgstattuple extension:

sql

-- Install pgstattuple (contrib extension)
CREATE EXTENSION IF NOT EXISTS pgstattuple;
 
-- Detailed bloat analysis for the orders table
SELECT * FROM pgstattuple('orders');

Looking at dead_tuple_percent and tuple_percent (the live tuple ratio) together in the output lets you quantify the severity of bloat.

Caution: pgstattuple performs a full sequential scan of the table. Running it on a table with hundreds of millions of rows during peak hours can cause load spikes — it is recommended to run it once a week during off-peak hours.

Example 2: Autovacuum Tuning for Large Tables

trigger condition = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × n_live_tup

Global settings can be changed in postgresql.conf:

ini

# postgresql.conf — recommended settings for SSD environments
autovacuum_vacuum_cost_delay = 2        # default 2ms; 0–2ms on SSD
autovacuum_vacuum_cost_limit = 3000     # default 200 → 3000 recommended for SSD
autovacuum_max_workers = 5              # default 3 → increase for more parallelism
autovacuum_vacuum_scale_factor = 0.05   # default 0.2 → lower to 5%

However, a far more effective approach than global settings is applying per-table configuration to frequently updated tables. These take effect immediately without a server restart:

sql

-- Apply per-table settings to a frequently updated table (e.g., orders)
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,   -- trigger vacuum immediately at 1% dead tuples
    autovacuum_vacuum_threshold = 1000,       -- only when at least 1000 (added to scale_factor)
    autovacuum_vacuum_cost_delay = 0,         -- critical table: no I/O throttling delay
    autovacuum_analyze_scale_factor = 0.005   -- also refresh statistics more frequently
);
 
-- Verify the settings
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'orders';

Example 3: PG 18 AIO Configuration and VACUUM I/O Monitoring

You can start by checking the current io_method in your environment:

sql

-- Check current io_method
SHOW io_method;
 
-- Check effective_io_concurrency (default 16)
SHOW effective_io_concurrency;

The configuration to enable io_uring in postgresql.conf looks like this:

ini

# postgresql.conf
# Enable io_uring on Linux 5.1+ environments
io_method = io_uring
 
# Number of concurrent I/O operations for maintenance tasks like VACUUM (controls AIO pipelining)
maintenance_io_concurrency = 16
 
# Number of prefetch read requests (only meaningful in worker/io_uring mode)
effective_io_concurrency = 16
 
# Maximum size limit for combined I/O requests
io_combine_limit = 128kB

The new pg_aios view added in PG 18 lets you query the asynchronous I/O pipeline status in real time:

sql

-- Real-time AIO pipeline query (new view in PG 18)
SELECT * FROM pg_aios;

 worker_id | handle_type | mode | pending | ...
-----------+-------------+------+---------+----
         1 | read        | aio  | t       | ...
         2 | read        | aio  | t       | ...
         3 | read        | aio  | t       | ...

Parameter	Default	Role
`io_method`	`worker`	I/O mode selection (sync/worker/io_uring)
`effective_io_concurrency`	`16`	Number of concurrent parallel read requests
`maintenance_io_concurrency`	`16`	Number of concurrent I/O operations for maintenance tasks like VACUUM
`io_combine_limit`	`128kB`	Maximum size limit for combined I/O requests

Example 4: Online Bloat Removal with pg_repack

bash

# Install pg_repack (Ubuntu/Debian example)
# Install the version matching your running PostgreSQL major version
apt-get install postgresql-18-repack
 
# Online table reorganization without an exclusive lock
pg_repack -h localhost -U postgres -d mydb -t orders
 
# Reorganize only indexes (leave the table as-is)
pg_repack -h localhost -U postgres -d mydb --index orders_pkey
 
# Reorganize the entire database (may take a long time)
pg_repack -h localhost -U postgres -d mydb

Understanding how pg_repack works makes it easier to use with confidence:

Start building a new copy while keeping the original table in service
Record any changes that occur during the copy into a separate log table
Apply the log to the new table once copying is complete
Swap old and new tables with a brief lock of under one second
Rebuild existing indexes concurrently as well

The Most Common Mistakes in Practice

Lowering only the global scale_factor without per-table settings: Even if you lower the global value to 0.05, small tables may get vacuumed far too often, saturating workers. Applying per-table settings via ALTER TABLE to your critical tables is far more precise.
Raising cost_limit too aggressively in one shot: I once jumped cost_limit from 200 to 3000 in a single change and watched overnight batch jobs slow down by 30 seconds each. It's recommended to increase it gradually — 500 → 1000 → 2000 → 3000 — while monitoring query latency at each step.
Tuning without monitoring bloat: Changing autovacuum settings is meaningless if you never verify whether they're having an effect. Periodically checking the last_autovacuum timestamp and the trend in n_dead_tup will make the before-and-after difference immediately visible.

Pros and Cons

Advantages

Disadvantages and Caveats

Item	Details	Mitigation
io_uring environment requirements	Linux 5.1+ required; Windows/macOS not supported	Use worker mode on those platforms
Container security policies	io_uring syscalls may be blocked by seccomp/AppArmor	Fall back to worker mode or modify policy
Limited gains on local NVMe	~20–35% improvement vs. cloud (lower latency means smaller relative gain)	Still an improvement; application is recommended
Increasing vacuum cost_limit	Higher I/O usage → possible resource contention with production queries	Raise gradually with concurrent monitoring
Increasing max_workers	Proportional increase in shared_buffers and memory usage	Verify available server memory before applying
pgstattuple load	Full sequential table scan → unsuitable for real-time monitoring on large tables	Run once per week during off-peak hours

autovacuum_vacuum_cost_limit: The cap on "cost tokens" that autovacuum is allowed to spend on I/O. The default of 200 is a legacy of the HDD era; in SSD/NVMe environments, raising it to the 1000–3000 range has little impact on production query performance.

Comparison of Bloat Removal Methods

Method	Lock Level	Space Returned	Service Interruption	Recommended When
VACUUM	None	No (internal reuse only)	None	Routine maintenance
VACUUM FULL	Exclusive lock	Yes (returned to OS)	Yes	When a maintenance window is available
pg_repack	~1 second lock	Yes (returned to OS)	Minimal	24/7 production environments
pg_squeeze	~1 second lock	Yes (returned to OS)	Minimal	Alternative to pg_repack

Closing Thoughts

Three steps you can start right now:

Start by assessing your current bloat situation — run the pg_stat_user_tables query introduced earlier to pull a list of tables with high dead_ratio. This can be run immediately regardless of PG version.
Apply per-table autovacuum settings to your most frequently updated tables — something like ALTER TABLE [tablename] SET (autovacuum_vacuum_scale_factor = 0.01, autovacuum_vacuum_cost_delay = 0); can be applied instantly without downtime. After applying, compare the last_autovacuum timestamps and the n_dead_tup trend after 1–2 days. Users on PG 17 and below can see immediate results from steps 1 and 2 alone.
If you're planning to upgrade to PostgreSQL 18, start planning your io_method configuration — io_uring for Linux 5.1+ environments, and the default worker mode is sufficient for everything else. Right after upgrading, use the pg_aios view to verify that the AIO pipeline is actually running.

References

#PostgreSQL18#VACUUM#autovacuum#비동기IO#io_uring#MVCC#bloat#pg_repack#pgstattuple#dead-tuple

Core Concepts

Why PostgreSQL Creates Bloat — The Structural Nature of MVCC

PostgreSQL 18 Asynchronous I/O (AIO) Architecture

How AIO Works Within VACUUM

Practical Application

Example 1: Assessing Current Bloat — Which Tables to Address First

Example 2: Autovacuum Tuning for Large Tables

Example 3: PG 18 AIO Configuration and VACUUM I/O Monitoring

Example 4: Online Bloat Removal with pg_repack

The Most Common Mistakes in Practice

Pros and Cons

Advantages

Disadvantages and Caveats

Comparison of Bloat Removal Methods

Closing Thoughts

References

Core Concepts

Why PostgreSQL Creates Bloat — The Structural Nature of MVCC

PostgreSQL 18 Asynchronous I/O (AIO) Architecture

How AIO Works Within VACUUM

Practical Application

Example 1: Assessing Current Bloat — Which Tables to Address First

Example 2: Autovacuum Tuning for Large Tables

Example 3: PG 18 AIO Configuration and VACUUM I/O Monitoring

Example 4: Online Bloat Removal with pg_repack

The Most Common Mistakes in Practice

Pros and Cons

Advantages

Disadvantages and Caveats

Comparison of Bloat Removal Methods

Closing Thoughts

References

Recommended Posts

Switching to UUIDv7 Cut PostgreSQL Index I/O by 312x — How It Reduces B-tree Page Splits by 500x

ULID vs UUIDv7: The Standard, Encoding, and DB Compatibility Differences That Determine Your Primary Key Choice

How PostgreSQL 18's `uuidv7()` and `uuid_extract_timestamp()` Speed Up Time-Series Queries by 44% Without `created_at`

PostgreSQL 18 Asynchronous I/O (AIO): Why Cold Cache Sequential Scans Are 3× Faster and Benchmark Results