Table of Contents
Issue : PostgreSQL – Application team reported response slowness.
Environment:
PostgreSQL 9.6 cluster with primary-standby configuration.
Any recent change in environment: A new agent was on-boarded.
Investigation:
Sever I/O and CPU utilization : iostat -x 1 1000
Note: CPU utilization was 99.4% and it’s consumed by users activity.
Memory: free -h
28Gb free memory was available on server.
Wait Events:
select datname, usename, application_name ,query_start,query,
wait_event_type,wait_event,state from pg_stat_activity where wait_event_type is not null order by
query_start desc,wait_event_type,wait_event ;
2 major wait events were reported WALWriteLock and buffer_mapping.
Solution:
wait event buffer_mapping shows pressure on database shared_buffers. It was perfect scenario to fix the shared_buffers contention.Increased shared_buffers to 22GB from current value of 16GB to fix the issue. Note: changing shared_buffers is not online activity.
Root Cause: On-boarding of new agent put extra pressure on PostgreSQL database. Server was not capacitated properly.
Useful command –
To monitor disk io, cpu and memory
iostat -x 1 100
top
pg_top
free -h
To monitor database activity –
- pg_stat_activity
select datname, usename, application_name ,query_start,query, wait_event_type,wait_event,state from pg_stat_activity where wait_event_type is not null order by query_start desc,wait_event_type,wait_event ;
- pg_stat_statements
select query, total_time, calls,blk_read_time,blk_write_time from
pg_stat_statements order by blk_read_time desc, blk_write_time desc ;
Checkout more PostgreSQL posts