Say I have a users table with a thousand rows and a user_actions table with 50 million rows. A few users have more than a million actions, but most have thousands.

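-- Column types are assumed for illustration; the original schema omitted them.
CREATE TABLE users (id bigint PRIMARY KEY, name text);
CREATE TABLE user_actions (id bigint PRIMARY KEY, user_id bigint REFERENCES users (id), created_at timestamptz);
CREATE INDEX index_user_actions_on_user_id ON user_actions (user_id);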

Querying user_actions by user_id is fast, using the index.

SELECT * 
FROM user_actions 
WHERE user_id = ? 
LIMIT 1

But I'd like to know the last action by a user.

SELECT * 
FROM user_actions 
WHERE user_id = ? 
ORDER BY created_at DESC 
LIMIT 1

This query ignores the index and scans the table backwards until it finds an action for that user. That is not a problem for users who have been active recently, but it is too slow for users who haven't.

Is there a way to tune this index so PostgreSQL keeps track of the last action by each user? (For bonus points, the last N actions!)

Or are there alternative strategies you would suggest? I suppose a materialized view over a window function would do the trick.
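For reference, the materialized-view idea might look roughly like the sketch below. The view name last_user_actions and the cutoff of 1 are placeholders, and the view is only as fresh as its last refresh, so treat this as one possible shape rather than a recommendation.

```sql
-- Hypothetical view name; keeps the most recent action per user.
CREATE MATERIALIZED VIEW last_user_actions AS
SELECT id, user_id, created_at
FROM (
    SELECT id,
           user_id,
           created_at,
           -- Number each user's actions from newest to oldest.
           row_number() OVER (
               PARTITION BY user_id
               ORDER BY created_at DESC
           ) AS rn
    FROM user_actions
) AS ranked
WHERE rn <= 1;  -- raise the cutoff to N to keep the last N actions per user

-- The view does not update itself; it has to be refreshed explicitly.
REFRESH MATERIALIZED VIEW last_user_actions;
```

Each refresh recomputes the window function over all of user_actions, so this trades query latency for refresh cost and staleness.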

1 Answer

Create an index on (user_id, created_at).

This will allow PostgreSQL to do an index scan to locate the first record.

This is one of the cases where multi-column indexes make a big difference.

Note that we put user_id first because that lets us efficiently select the sub-portion of the index we are interested in. From there it is just a quick traversal to the most recent created_at value, provided there are not a lot of dead rows in that part of the index.
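As a minimal sketch, the index described above could be created and checked like this. The index name is made up to match the question's naming style, and 42 is a placeholder user id.

```sql
-- Hypothetical index name; the column order (user_id first) is what matters.
CREATE INDEX index_user_actions_on_user_id_and_created_at
    ON user_actions (user_id, created_at);

-- Inspect the plan for the "last action" query; 42 stands in for a real user id.
EXPLAIN
SELECT *
FROM user_actions
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 1;

-- The same index should also serve the "last N actions" case, here with N = 5.
SELECT *
FROM user_actions
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 5;
```

If the planner picks up the new index, the plan should show an index scan on it rather than a scan over the whole table. Because this index also covers lookups on user_id alone, the original single-column index is probably redundant, though it is worth confirming that against your own workload before dropping it.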

2 Comments

You might want to order that by DESC as well, depending on how the SQL is written.
Maybe, but you can scan an index forward or backward, so I'm not sure this query would care that much.
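For comparison, the descending variant the first comment seems to have in mind would look something like the sketch below (the index name is hypothetical). For the single-user query in the question it should make little difference, since a b-tree index can be read in either direction. An explicit DESC mainly helps when a query mixes sort directions across columns.

```sql
-- Hypothetical index name; created_at is stored newest-first within each user_id.
CREATE INDEX index_user_actions_on_user_id_and_created_at_desc
    ON user_actions (user_id, created_at DESC);

-- A mixed-direction sort like this one is where the DESC declaration matters:
-- it matches the index order exactly, which a plain (user_id, created_at)
-- index cannot do by scanning forward or backward.
SELECT *
FROM user_actions
ORDER BY user_id ASC, created_at DESC
LIMIT 100;
```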
