I have a MySQL database with the following tables:

PageviewEvents:
pageviewId | eventId | eventValue | eventTime

SessionPageviews:
id | sessionId | page 

PageviewEvents.pageviewId is a foreign key that references SessionPageviews.id.

When I need to select some data by eventId, I use this query:

SELECT 
    sp.page as Page, count(*)
from PageviewEvents pe
left join SessionPageviews sp on sp.id = pe.pageviewId
where pe.eventId = 1
GROUP by sp.page
order BY 2 DESC

And get a table like this:

page | count_of_event_1

But now I need to select more data:

page | count_of_event_1 | count_of_event_2 ... | count_of_event_N

I started with 2 events and tried to write something like this:

SELECT 
    sp.page as Page, 
    (SELECT count(*) from PageviewEvents pe1 left join SessionPageviews sp1 on sp1.id = pe1.pageviewId where pe1.eventId = 1 and sp1.page = sp.page) as count_of_event_1,
    (SELECT count(*) from PageviewEvents pe1 left join SessionPageviews sp1 on sp1.id = pe1.pageviewId where pe1.eventId = 2 and sp1.page = sp.page) as count_of_event_2 
from PageviewEvents pe
left join SessionPageviews sp on sp.id = pe.pageviewId
where pe.eventId = 1 OR pe.eventId = 2
GROUP by sp.page
order BY 2 DESC

When I run this query on the remote server, it freezes.

Are there any errors in my query? How can I optimize it?

2 Answers

You could try using conditional aggregation:

SELECT
    sp.page AS Page,
    COUNT(CASE WHEN pe.eventId = 1 THEN 1 END) AS count_of_event_1,
    COUNT(CASE WHEN pe.eventId = 2 THEN 1 END) AS count_of_event_2
FROM PageviewEvents pe
LEFT JOIN SessionPageviews sp
    ON sp.id = pe.pageviewId
WHERE
    pe.eventId IN (1, 2)
GROUP BY
    sp.page
ORDER BY
    2 DESC;
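
To sanity-check the conditional aggregation, and to extend it to N events, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here, the sample rows are made up, and `build_counts_query` is a hypothetical helper name, not something from the question.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL for this check.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SessionPageviews (id INTEGER PRIMARY KEY, sessionId INTEGER, page TEXT);
CREATE TABLE PageviewEvents (pageviewId INTEGER, eventId INTEGER, eventValue TEXT, eventTime TEXT);
INSERT INTO SessionPageviews VALUES (1, 10, '/home'), (2, 10, '/about'), (3, 11, '/home');
INSERT INTO PageviewEvents VALUES
    (1, 1, NULL, NULL), (1, 2, NULL, NULL), (2, 1, NULL, NULL),
    (3, 1, NULL, NULL), (3, 3, NULL, NULL);
""")


def build_counts_query(event_ids):
    """Build one COUNT(CASE ...) column per event id (hypothetical helper)."""
    # Coerce to int so only numeric ids are ever interpolated into the SQL text.
    ids = [int(e) for e in event_ids]
    columns = ",\n    ".join(
        f"COUNT(CASE WHEN pe.eventId = {i} THEN 1 END) AS count_of_event_{i}"
        for i in ids
    )
    id_list = ", ".join(str(i) for i in ids)
    return f"""
SELECT
    sp.page AS Page,
    {columns}
FROM PageviewEvents pe
LEFT JOIN SessionPageviews sp
    ON sp.id = pe.pageviewId
WHERE pe.eventId IN ({id_list})
GROUP BY sp.page
ORDER BY 2 DESC
"""


rows = conn.execute(build_counts_query([1, 2])).fetchall()
print(rows)  # [('/home', 2, 1), ('/about', 1, 0)]
```

Each table is scanned once no matter how many event columns you add, whereas the correlated subqueries in the question are re-evaluated per row.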

Beyond the above, you may consider adding the following index to your table:

CREATE INDEX idx ON PageviewEvents (pageviewId, eventId);

This might help speed up the join between the two tables.


First, your query is suspicious. You are using a LEFT JOIN, but you are aggregating by a column in the second table. I doubt you really want a row with a NULL first column.
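
To illustrate where the NULL row comes from, here is a small sketch with Python's sqlite3 module (SQLite standing in for MySQL). It assumes an event whose pageviewId has no matching SessionPageviews row, which an enforced foreign key would normally prevent; the tables are trimmed and the data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SessionPageviews (id INTEGER PRIMARY KEY, sessionId INTEGER, page TEXT);
CREATE TABLE PageviewEvents (pageviewId INTEGER, eventId INTEGER);
INSERT INTO SessionPageviews VALUES (1, 10, '/home');
-- pageviewId 99 has no matching SessionPageviews row (hypothetical orphan).
INSERT INTO PageviewEvents VALUES (1, 1), (99, 1);
""")

# Same query shape as in the question; only the join type varies.
QUERY = """
SELECT sp.page, COUNT(*)
FROM PageviewEvents pe
{join} SessionPageviews sp ON sp.id = pe.pageviewId
WHERE pe.eventId = 1
GROUP BY sp.page
ORDER BY sp.page
"""

left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()

print(left_rows)   # the orphan event is grouped under a NULL page
print(inner_rows)  # the orphan event is dropped
```

If the foreign key is enforced and pageviewId is never NULL, both joins return the same rows, so the inner join states the intent more clearly.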

You can write the query using conditional aggregation (as Tim points out). I would express this as:

select sp.page as Page, 
       sum( pe.eventid = 1 ) as count_of_event_1,
       sum( pe.eventid = 2 ) as count_of_event_2
from SessionPageviews sp join
     PageviewEvents pe
     on sp.id = pe.pageviewId
where pe.eventId in (1, 2)
group by sp.page
order by 2 desc;

Then for this query, there are two indexing strategies. If you have many types of events (or if 1 and 2 are relatively rare), then:

  • SessionPageviews(id, page)
  • PageviewEvents(eventId, pageviewId)

Otherwise:

  • SessionPageviews(page, id)
  • PageviewEvents(pageviewId, eventId)
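
Spelled out as DDL, the two strategies could look like the sketch below. The index names are hypothetical, and the statements are run against an in-memory SQLite database only to show that they are valid; in MySQL you would run just the `CREATE INDEX` pair for the strategy you choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SessionPageviews (id INTEGER PRIMARY KEY, sessionId INTEGER, page TEXT);
CREATE TABLE PageviewEvents (pageviewId INTEGER, eventId INTEGER, eventValue TEXT, eventTime TEXT);

-- Strategy 1: many event types, or events 1 and 2 are relatively rare.
CREATE INDEX idx_sp_id_page ON SessionPageviews (id, page);
CREATE INDEX idx_pe_event_pageview ON PageviewEvents (eventId, pageviewId);

-- Strategy 2: otherwise.
CREATE INDEX idx_sp_page_id ON SessionPageviews (page, id);
CREATE INDEX idx_pe_pageview_event ON PageviewEvents (pageviewId, eventId);
""")

# List the indexes that now exist (index names above are illustrative only).
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)
```

Both pairs are created here only for illustration. Check the plan with `EXPLAIN` on your real data before keeping either one.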
