6

I'm trying to remove the weekend gaps from this time series plot. The x-axis is a datetime stamp. I've tried the code on this site, but can't get it to work. A sample file is included.

The data looks like this

+-----------------------+---------------------+-------------+-------------+
|          asof         |    INSERTED_TIME    | DATA_SOURCE |    PRICE    |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:00:15 | DB          | 170.4261757 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:06:10 | DB          | 168.9348656 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:06:29 | DB          | 168.8412129 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:07:27 | DB          | 169.878796  |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:10:28 | DB          | 169.3685879 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:12:14 | DB          | 169.0787045 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:12:33 | DB          | 169.7561092 |
+-----------------------+---------------------+-------------+-------------+

Plot including weekend breaks

Using the line function I'm getting the plot below, with straight lines going from Friday end of day to Monday morning. Using px.scatter, I don't get the line, but I still get the gap.

import plotly.express as px
import pandas as pd

sampledf = pd.read_excel('sample.xlsx')

fig_sample = px.line(sampledf, x = 'INSERTED_TIME', y= 'PRICE', color = 'DATA_SOURCE')
fig_sample.show()

Attempt with no weekend breaks

fig_sample = px.line(sampledf, x = 'INSERTED_TIME', y= 'PRICE', color = 'DATA_SOURCE')
fig_sample.update_xaxes(
    rangebreaks=[
        dict(bounds=["sat", "mon"]) #hide weekends
    ]
)
fig_sample.show()

Using rangebreaks results in a blank plot.

Any help is appreciated. Thanks

  • Does this answer your question? How to remove weekend datetime gaps from x-axis of a financial chart? Commented Jul 9, 2020 at 22:09
  • Thanks, but it doesn't solve my problem. I know I must not be using rangebreaks right? Commented Jul 11, 2020 at 22:56
  • Can't I just remove the weekend from the original data and create a graph? Commented Jul 12, 2020 at 3:15
  • @r-beginners There are no weekends in that data; you can check with df["INSERTED_TIME"].dt.weekday.unique() Commented Jul 12, 2020 at 3:17
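
The check mentioned in the last comment can be sketched as follows. This is a minimal example that assumes INSERTED_TIME has a datetime dtype. The small inline frame is a hypothetical stand-in for sample.xlsx, with made-up values, and it deliberately includes one Saturday row so the filter has something to find.

```python
import pandas as pd

# Hypothetical stand-in for sample.xlsx (values are made up for illustration)
df = pd.DataFrame({
    "INSERTED_TIME": pd.to_datetime([
        "2020-06-19 12:00:15",  # Friday
        "2020-06-20 12:06:10",  # Saturday
        "2020-06-22 12:06:29",  # Monday
    ]),
    "PRICE": [170.43, 168.93, 168.84],
})

# Series.dt.weekday counts Monday as 0 and Sunday as 6, so 5 and 6 are the weekend
weekdays_present = sorted(df["INSERTED_TIME"].dt.weekday.unique().tolist())
weekend_rows = df[df["INSERTED_TIME"].dt.weekday >= 5]

print(weekdays_present)
print(len(weekend_rows))
```

If weekend_rows is empty for the real data, the gap in the plot comes from the continuous date axis rather than from weekend records.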

3 Answers

6

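There is a limitation of 1000 rows when using rangebreaks. When working with more than 1000 rows, add the parameter render_mode='svg'. (Above that size Plotly Express appears to switch to WebGL rendering by default, and the rangebreaks are then not applied.)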

In the code below I've used the scatter function, and as you can see the large weekend gaps are no longer there. Additionally, I've excluded the hours between 11 PM and 11 AM.

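import plotly.express as px
import pandas as pd

sampledf = pd.read_excel('sample.xlsx')

# render_mode='svg' keeps rangebreaks working with more than 1000 rows
fig_sample = px.scatter(sampledf, x = 'INSERTED_TIME', y= 'PRICE', color = 'DATA_SOURCE', render_mode='svg')
fig_sample.update_xaxes(
    rangebreaks=[
        { 'pattern': 'day of week', 'bounds': [6, 1]},  # hide weekends (Saturday to Monday)
        { 'pattern': 'hour', 'bounds':[23,11]}  # hide 11 PM to 11 AM
    ]
)
fig_sample.show()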

The values in this plot differ from the original data set, but the code works with the data in the original post. Found help here

1 Comment

where would you find all options for pattern?
1

You can also use render_mode='svg' on px.line

import plotly.express as px
import pandas as pd
    
sampledf = pd.read_excel('sample.xlsx')
    
fig_sample = px.line(sampledf, x = 'INSERTED_TIME', y= 'PRICE', color = 'DATA_SOURCE', render_mode='svg')
fig_sample.update_xaxes(
    rangebreaks=[
        dict(bounds=["sat", "mon"])  # hide weekends
    ]
)

fig_sample.show()

However, for px.timeline or other Plotly Express functions that don't have a render_mode parameter, you should use:

dict(pattern="hour", dvalue=60*60*1000, values=start_of_break)

start_of_break is a list of the dates at which each break should start. dvalue is the duration of each break in milliseconds; here 60 minutes * 60 seconds * 1000 ms, i.e. one hour.
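
As a rough illustration of how such a list could be built, the sketch below generates one break per weekend. The helper name weekend_break_values and the date range are hypothetical, and the sketch leaves out pattern on the assumption that values and dvalue alone are enough to describe each break.

```python
from datetime import datetime, timedelta

def weekend_break_values(start, end):
    """Return midnight of every Saturday between start and end as strings."""
    day = datetime(start.year, start.month, start.day)
    values = []
    while day <= end:
        if day.weekday() == 5:  # Monday is 0, so 5 is Saturday
            values.append(day.strftime("%Y-%m-%d %H:%M:%S"))
        day += timedelta(days=1)
    return values

# Hypothetical date range covering the sample data
start_of_break = weekend_break_values(datetime(2020, 6, 17), datetime(2020, 7, 1))

# Each break lasts two days (Saturday and Sunday), expressed in milliseconds
two_days_ms = 2 * 24 * 60 * 60 * 1000

weekend_break = dict(values=start_of_break, dvalue=two_days_ms)
print(weekend_break)
```

The resulting dictionary could then be passed as rangebreaks=[weekend_break] to update_xaxes.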

0

Looks like the x axis on the blank plot does not even have the right range, since it begins in a different year. It's hard to explain the behavior without looking at the exact data input, but you can start with a working, simpler, dataset and try to check for differences (try to plot a filtered version of the data with select points or check for differences in the dtypes of the DataFrame, etc).

You will see the expected behavior with a simpler dataset:

import plotly.express as px
import pandas as pd
from datetime import datetime
d = {'col1': [datetime(2020, 5, d) for d in range(1, 30)],
     'col2': [d if (d + 3) % 7 not in (5, 6) else 0 for d in range(1, 30)]}
df = pd.DataFrame(data=d)
# col1 is left as a regular column so that px.line can refer to it by name

df_weekdays = df[df['col1'].dt.dayofweek.isin([0,1,2,3,4])]

f = px.line(df, x='col1', y='col2')
f.update_xaxes(
    rangebreaks=[
        dict(bounds=["sat", "mon"]), #hide weekends
    ]
)
f.show()

with breaks

For the DataFrame without weekends, df_weekdays, the plot looks similar.

1 Comment

The exact data input is included in the original post.
