I want to see how much time each client spends connected to our website per day. My source table is created and populated as shown below.
CREATE TABLE source_ (
"nbr" numeric (10),
"begdate" timestamp,
"enddate" timestamp,
"str" varchar(35))
;
INSERT INTO source_
("nbr", "begdate", "enddate", "str")
VALUES
(111, '2019-11-25 07:00:00', '2019-11-25 08:00:00', 'TMP123'),
(222, '2019-03-01 12:04:02', '2019-03-01 12:05:02', 'SOC'),
(111, '2019-11-25 19:00:00', '2019-11-25 19:30:00', 'TMP12'),
(444, '2020-02-11 22:00:00', '2020-02-12 02:00:00', 'MARATEN'),
(444, '2020-02-11 23:00:00', '2020-02-12 01:00:00', 'MARA12'),
(444, '2020-02-12 13:00:00', '2020-02-12 14:00:00', 'MARA12'),
(444, '2020-02-12 07:00:00', '2020-02-12 08:00:00', 'MARA1222')
;
create table target_ (nbr numeric (10), date_ timestamp, state varchar(30), terms interval);
I made an attempt below, but as you can see I associated date_ (the day of the event) with begdate, which is not always correct: see the 4th row, where the event spans two days.
INSERT INTO target_
(nbr, date_, state, terms)
select
nbr,
DATE_TRUNC('day', begdate) as date_,
state,
sum(term) as terms
from (
select
nbr, begdate,
(case
when trim(str) ~ '^TMP' then 'TMP'
when trim(str) ~ '^MARA' then 'MARATEN'
else 'SOC'
end) as state,
(enddate - begdate) as term
from source_ ) X
group by nbr, date_, state;
Expected output:
111 2019-11-25 00:00:00+00 TMP 90
222 2019-03-01 00:00:00+00 SOC 60
444 2020-02-11 00:00:00+00 MARATEN 180
444 2020-02-12 00:00:00+00 MARATEN 300
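One possible direction, as a rough sketch rather than a tested solution: split every session into one slice per calendar day it touches, clip each slice to that day's boundaries, and only then aggregate. This assumes PostgreSQL (for generate_series over timestamps and LATERAL); the alias names s, d and day_start are just illustrative and not part of the tables above.

```sql
-- Sketch: one row per (session, calendar day), clipped to that day.
SELECT
    s.nbr,
    d.day_start AS date_,
    CASE
        WHEN trim(s.str) ~ '^TMP'  THEN 'TMP'
        WHEN trim(s.str) ~ '^MARA' THEN 'MARATEN'
        ELSE 'SOC'
    END AS state,
    -- Time spent inside this day only: from the later of (session start,
    -- day start) to the earlier of (session end, next midnight).
    SUM(LEAST(s.enddate, d.day_start + interval '1 day')
        - GREATEST(s.begdate, d.day_start)) AS terms
FROM source_ AS s
-- Every midnight-aligned day the session overlaps.
CROSS JOIN LATERAL generate_series(
    date_trunc('day', s.begdate),
    date_trunc('day', s.enddate),
    interval '1 day'
) AS d(day_start)
-- Drop empty slices, e.g. a session that ends exactly at midnight.
WHERE LEAST(s.enddate, d.day_start + interval '1 day')
    > GREATEST(s.begdate, d.day_start)
GROUP BY s.nbr, d.day_start, state
ORDER BY s.nbr, d.day_start;
```

With the sample rows, the two sessions of nbr 444 that cross midnight should contribute 3 hours to 2020-02-11 and, together with the two daytime sessions, 5 hours to 2020-02-12. Here terms is an interval; if minutes are needed, wrapping the sum as EXTRACT(EPOCH FROM ...) / 60 should convert it.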