Some queries in our production environment intermittently run extremely slowly. These are JSONB intersection queries that normally return in milliseconds, but occasionally take 30-90 seconds.
We have looked at server conditions at the time of the slow queries, such as RAM, CPU and query load, but nothing obvious stands out. This affects a very small minority of queries - probably less than 1%. It does not appear to be a query optimization issue, as the affected queries are varied and in some cases very simple.
We have reproduced the environment as closely as possible on a staging server and put it under heavy load, but the issue does not occur there.
Can anyone suggest possible steps to investigate what is occurring in Postgres when this happens, or anything else we should consider? We have been working on this for over a week and are running out of ideas.