
I need to select some values, group them and order by multiple fields. Here is my fiddle: http://sqlfiddle.com/#!2/a80eb/3

What I need to achieve is to select, for a given value of client_mac and for each distinct drone_id, one row from the table packet_data. This row should contain the client_mac, the drone_id and the most frequent value of the antenna_signal column for that combination of drone_id and client_mac.

The column drone_id is a foreign key into the table drones, which has a column map_id. I only need to consider those rows from the packet_data table whose drone has a certain map_id in the drones table.

My desired result should be this:

CLIENT_MAC          DRONE_ID    ANTENNA_SIGNAL
3c:77:e6:17:9d:1b   1           -37
3c:77:e6:17:9d:1b   2           -57

My current SQL query is:

SELECT `packet_data`.`client_mac`,
       `packet_data`.`drone_id`,
       `packet_data`.`antenna_signal`,
       count(*) AS `count`
FROM `packet_data`
JOIN `drones` ON `packet_data`.`drone_id`=`drones`.`custom_id`
WHERE `drones`.`map_id` = 11
  AND `client_mac`="3c:77:e6:17:9d:1b"
GROUP BY drone_id,
         `packet_data`.`antenna_signal`
ORDER BY `packet_data`.`drone_id`,
         count(*) DESC

And my current result:

CLIENT_MAC          DRONE_ID    ANTENNA_SIGNAL
3c:77:e6:17:9d:1b   1           -37
3c:77:e6:17:9d:1b   1           -36
3c:77:e6:17:9d:1b   2           -57
3c:77:e6:17:9d:1b   2           -56
  • I like the result, but not the way you achieved it :) I need the value of "antenna_signal" that repeats the most times, not the one with the smallest value. Commented Jun 17, 2014 at 20:19
  • Yeah, that's why I deleted it. Commented Jun 17, 2014 at 20:20
  • Can you please go deeper? An example maybe? I have no experience with ties yet. Commented Jun 17, 2014 at 20:39
  • He means when the count values are equal... Commented Jun 17, 2014 at 20:40
  • Aaaah .. my insufficient English :) In that case AVG() would be great, but one random value from the ones with the same count would work too. Commented Jun 17, 2014 at 20:46

1 Answer


You can get your desired result with a not very nice correlated subquery (on top of a subquery). I don't know how well it will scale with a huge amount of data:

SELECT
    -- the desired columns
    client_mac,
    drone_id,
    antenna_signal,
    amount               -- I added this so I could easily check the result
FROM (
    -- give me the count of every value of the antenna_signal column 
    -- for each combination of client_mac, drone_id and antenna_signal
    SELECT 
        client_mac,
        antenna_signal,
        drone_id,
        COUNT(antenna_signal) AS amount
    FROM
        packet_data
    WHERE
        client_mac = '3c:77:e6:17:9d:1b'
    GROUP BY
        client_mac,
        drone_id,
        antenna_signal
) as1
WHERE
    -- but I want only those rows with the highest count of equal antenna_signal
    -- values per client_mac and drone_id
    amount = (
        SELECT 
            MAX(as2.amount)
        FROM (
            SELECT 
                pd2.client_mac,
                pd2.antenna_signal,
                pd2.drone_id,
                COUNT(pd2.antenna_signal) AS amount
            FROM
                packet_data pd2
            WHERE
                client_mac = '3c:77:e6:17:9d:1b'
            GROUP BY
                client_mac,
                drone_id,
                antenna_signal
            ) as2
        WHERE 
            as1.client_mac = as2.client_mac AND as1.drone_id = as2.drone_id
);

It shouldn't be too difficult to join other tables if desired. Note that this will return both rows if there are two antenna_signal values with an equal count for the same client_mac and drone_id. See it in the updated fiddle: http://sqlfiddle.com/#!2/a80eb/80
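For completeness, here is a hedged sketch of how the drones join and a one-row-per-group restriction could look on MySQL 8.0+ (window functions were not available when this answer was written). The table and column names are taken from the question; ties are resolved arbitrarily, which the comments above say is acceptable:

SELECT client_mac, drone_id, antenna_signal, amount
FROM (
    SELECT
        pd.client_mac,
        pd.drone_id,
        pd.antenna_signal,
        COUNT(*) AS amount,
        -- pick exactly one row per (client_mac, drone_id),
        -- preferring the most frequent antenna_signal; ties broken arbitrarily
        ROW_NUMBER() OVER (
            PARTITION BY pd.client_mac, pd.drone_id
            ORDER BY COUNT(*) DESC
        ) AS rn
    FROM packet_data pd
    JOIN drones d ON pd.drone_id = d.custom_id
    WHERE d.map_id = 11
      AND pd.client_mac = '3c:77:e6:17:9d:1b'
    GROUP BY pd.client_mac, pd.drone_id, pd.antenna_signal
) ranked
WHERE rn = 1;

ROW_NUMBER() guarantees exactly one row per (client_mac, drone_id); if you wanted to keep all tied values instead, RANK() would return every antenna_signal that shares the highest count.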


4 Comments

Thanks a lot! I expect to have several million rows :) Do you think MySQL will handle it?
@filo891 It's a dependent subquery, so I don't believe it will scale well; I was worried about your amount of data too. You could try selecting the innermost subquery into a temporary table and querying that instead. Even then it's a correlated subquery and may well take a long time to execute. How often will you run this?
So let me explain: my DB contains data collected in real time from several WiFi sniffers, and I have an application which does real-time localization of WiFi clients. I need to run this query for all currently present MAC addresses (an interval of 30 seconds of collected data, possibly as many as 200 different addresses) at least once every 10-20 seconds to keep the localization as close to real time as possible. I will have to reduce the amount of collected/stored data for sure, but I need to keep some history for at least a few days, so millions of rows is a realistic amount.
@filo891 That sounds reasonable. For the real-time localization you only need recent data, so I would either dump all data to a history table and clear the recent data, or do it the other way around and copy only the recent data into a temporary table for the real-time analysis. I think I would prefer the second option. Test it with real data, then optimize.
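A minimal sketch of the temporary-table approach described in the last comment. It assumes packet_data has a timestamp column, here called captured_at, which does not appear in the question and is purely hypothetical:

-- Copy only the last 30 seconds of data into a temporary table
-- (captured_at is a hypothetical timestamp column, not shown in the question).
CREATE TEMPORARY TABLE recent_packets AS
SELECT client_mac, drone_id, antenna_signal
FROM packet_data
WHERE captured_at >= NOW() - INTERVAL 30 SECOND;

-- Then run the aggregation (or the full mode-selection query from the answer)
-- against recent_packets instead of packet_data.
SELECT client_mac, drone_id, antenna_signal, COUNT(*) AS amount
FROM recent_packets
GROUP BY client_mac, drone_id, antenna_signal
ORDER BY client_mac, drone_id, amount DESC;

The temporary table is dropped automatically when the session ends, so the periodic real-time job can simply recreate it on each run.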
