
I have the following SQL table called readings.

date        |  today  | yesterday | tomorrow | creationtime               | source
2021-01-01      110       0.5         0        2021-01-01 12:42:17....       x1
2021-01-01      110       0.5         0        2021-01-01 12:42:17....       x2
2021-01-01      150       0.9         1        2021-01-01 12:55:17....       x3
....
2021-02-15      110       0.3         1        2021-02-15 12:42:17....       x1
2021-02-15      110       0.1         1        2021-02-15 12:42:17....       x2
2021-02-15      150       0.9         1        2021-02-15 12:55:17....       x3
...
2021-02-15      110       0.5         0        2021-02-16 16:06:04.008673    x17
2021-02-15      110       0.5         0        2021-02-15 15:59:46.383677    x17
....
2021-02-15      700       0.7         1        2021-02-16 16:04:02.267478    x20
2021-02-15      110       0.7         1        2021-02-15 15:59:48.060236    x20
....
2021-02-22      110       0.5         1        2021-02-15 16:01:16.826577    x55
2021-02-22      110       0.5         1        2021-02-16 16:09:17.524436    x55

There are 65 readings every day, one from each source: x1, x2, x3, ... up to x65.

I found duplicate readings on certain days: the same source appears more than once for the same date.

Sometimes the duplicated readings differ, so I want to keep the newer reading for that day, even if it was only recorded the following day.

I want to drop the duplicates and keep the row with the newer creation time, so that my table ends up looking like this.

date        |  today  | yesterday | tomorrow | creationtime               | source
2021-01-01      110       0.5         0        2021-01-01 12:42:17....       x1
2021-01-01      110       0.5         0        2021-01-01 12:42:17....       x2
2021-01-01      150       0.9         1        2021-01-01 12:55:17....       x3
....
2021-02-15      110       0.3         1        2021-02-15 12:42:17....       x1
2021-02-15      110       0.1         1        2021-02-15 12:42:17....       x2
2021-02-15      150       0.9         1        2021-02-15 12:55:17....       x3
...
2021-02-15      110       0.5         0        2021-02-16 16:06:04.008673    x17
....
2021-02-15      700       0.7         1        2021-02-16 16:04:02.267478    x20
....
2021-02-22      110       0.5         1        2021-02-16 16:09:17.524436    x55

I tried to do

create table new_readings as select distinct c.* from readings c;

But it just creates a copy of the table and only drops rows that are completely identical.
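
For reference, a query along these lines lists the (date, source) pairs that occur more than once. It also shows why `select distinct` leaves them in place: the rows differ in creationtime, so they are not identical. This is only a sketch, assuming PostgreSQL and the column names shown in the table above.

```sql
-- List every (date, source) pair that has more than one row,
-- with the number of copies and the newest creation time.
select "date", source, count(*) as copies, max(creationtime) as newest
from readings
group by "date", source
having count(*) > 1
order by "date", source;
```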

3 Answers

2

It seems to be simply

select distinct on ("date", source) *
from readings
order by "date", source, creationtime desc;

which reads "pick only one (the latest) reading per source per day".
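
To get the deduplicated table the question was aiming for, the same query can be wrapped in `create table ... as`. A minimal sketch, assuming PostgreSQL (`distinct on` is a PostgreSQL extension) and reusing the `new_readings` name from the question:

```sql
-- Build a deduplicated copy: one row per ("date", source).
-- Within each group the rows are sorted newest first, so the latest one is kept.
create table new_readings as
select distinct on ("date", source) *
from readings
order by "date", source, creationtime desc;
```

The original `readings` table is left untouched, so the result can be checked before anything is dropped or renamed.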


6 Comments

OMG IT WORKED. Why did you write "date"? I've never seen the double-quote syntax.
Because date is a reserved word.
What do you mean? So when I use "date", it's the date column from my table and not the reserved word date?
Well, yes, you can use non-conformant names if you enclose them in double quotes. Quoting "date" keeps the column name from being confused with the data type of the same name. However, IMHO it's better not to use reserved words or non-conformant names at all.
Also, what's the difference between distinct and distinct on? And why do you use an asterisk outside the parentheses?
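
On the last question above: plain `distinct` compares whole output rows, while `distinct on (...)` compares only the listed expressions and keeps the first row of each group as determined by `order by`. The `*` is the ordinary select list, meaning "return all columns of the row that was kept". A short sketch, assuming PostgreSQL and the `readings` table from the question:

```sql
-- distinct: a row is dropped only if every selected column matches another row.
select distinct "date", source
from readings;

-- distinct on: one row per ("date", source); * returns all of its columns,
-- and order by decides which row in each group comes first and is kept.
select distinct on ("date", source) *
from readings
order by "date", source, creationtime desc;
```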
0

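The code below deletes the older duplicate rows for each "date" and source, keeping the row with the latest creationtime:

-- Delete a row when a newer row exists for the same date and source.
delete from readings r1
    where exists(
        select * from readings r2
        where r2.creationtime > r1.creationtime
        and r2.source = r1.source
        and r2."date" = r1."date"
    );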
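
Before running a delete on this table, it can help to preview which rows count as older duplicates. The following is a sketch, assuming PostgreSQL and assuming the goal stated in the question (keep the newest creationtime per "date" and source). It only selects rows and changes nothing:

```sql
-- Preview the rows that have a newer sibling for the same date and source.
select r1.*
from readings r1
where exists (
    select 1
    from readings r2
    where r2."date" = r1."date"
      and r2.source = r1.source
      and r2.creationtime > r1.creationtime
)
order by r1."date", r1.source, r1.creationtime;
```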

2 Comments

I updated my question; I think it wasn't clear enough. There are 65 sources every day.
What is your DBMS: Oracle, PostgreSQL, MySQL ...?
0

You can use distinct on:

select distinct on (date, today, yesterday, tomorrow ) r.*
from readings r
order by date, today, yesterday, tomorrow, creationtime desc;

5 Comments

It should be distinct on date and source, and then select only one row based on creation time.
@anarchy add * after distinct on (...)
Will it work, though? Did you see that on 2021-02-15 there are two different readings under today?
It doesn't work, I lose a lot of data @eshirvana
I updated my question; I think it wasn't clear enough. There are 65 sources every day.
