1

Need to plot multiple lines (large dataset - will yield ~4,500 lines) on interactive 2-dimensional line graph using Plotly Express.

Problem is that my variables (x and y axes) are in 2 separate columns, and the number of data points for each line is different. The number of data points for each line will be based upon the 'API/UWI' column --- the rows where the 'API/UWI' values are constant will represent all the data points for 1 line.

i.e., when the value in the 'API/UWI' column changes, a new line starts.

An example of a small portion of my dataset is below for clarity.

Example dataset

In my first attempt, I separated the dataframe into multiple dataframes based on the unique values in the 'API/UWI' column and plotted all of those dataframes into graphs. It worked! However, it produced 4,500 graphs, rather than a single graph with 4,500 lines.

Is there a better way to accomplish this and produce a single graph with 4,500 lines?

I am posting my code below for the attempt I made above, along with an example of a graph that was produced (with a single line).

Please be detailed as possible on your solutions. This is my FIRST attempt at coding of any kind. I'm very much a beginner. PLEASE & THANKS!

    import plotly.express as px
    import pandas as pd
    import numpy as np

    excel_file = r"C:\Users\kevin\Desktop\Bone_Spring_Data_2.xlsx"

    df = pd.read_excel(excel_file)

    split_values_API = df['API/UWI'].unique()

    for API in split_values_API:
        df1 = df[df['API/UWI'] == API]
        df1 = df1.sort_values(by="Monthly Production Date")
        GOR_data = px.line(df1, x='Monthly Production Date' , y='MONTHLY GOR')
        GOR_data.show()

Example plot with 1 line

*Edit in response to comment:

The output of df.head(21).to_dict() is as follows:

{'API/UWI': {0: 30015209400000,
  1: 30015209400000,
  2: 30015209400000,
  3: 30015209400000,
  4: 30015209400000,
  5: 30015209400000,
  6: 30015209400000,
  7: 30015209400000,
  8: 30015221570000,
  9: 30015221570000,
  10: 30015221570000,
  11: 30015221570000,
  12: 30015221620000,
  13: 30015221620000,
  14: 30015221620000,
  15: 30015221620000,
  16: 30015221620000,
  17: 30015221620000,
  18: 30015221620000,
  19: 30015221620000,
  20: 30015221620000},
 'Monthly Production Date': {0: Timestamp('2002-04-01 00:00:00'),
  1: Timestamp('2002-05-01 00:00:00'),
  2: Timestamp('2002-06-01 00:00:00'),
  3: Timestamp('2002-07-01 00:00:00'),
  4: Timestamp('2002-08-01 00:00:00'),
  5: Timestamp('2002-09-01 00:00:00'),
  6: Timestamp('2002-10-01 00:00:00'),
  7: Timestamp('2006-07-01 00:00:00'),
  8: Timestamp('2008-08-01 00:00:00'),
  9: Timestamp('2008-09-01 00:00:00'),
  10: Timestamp('2008-10-01 00:00:00'),
  11: Timestamp('2008-11-01 00:00:00'),
  12: Timestamp('2016-10-01 00:00:00'),
  13: Timestamp('2016-11-01 00:00:00'),
  14: Timestamp('2016-12-01 00:00:00'),
  15: Timestamp('2017-01-01 00:00:00'),
  16: Timestamp('2017-02-01 00:00:00'),
  17: Timestamp('2017-03-01 00:00:00'),
  18: Timestamp('2017-04-01 00:00:00'),
  19: Timestamp('2017-05-01 00:00:00'),
  20: Timestamp('2017-06-01 00:00:00')},
 'MONTHLY GOR': {0: 1.278688524590164,
  1: 0.8455284552845529,
  2: 1.8529411764705883,
  3: 0.736,
  4: 1.6818181818181819,
  5: 0.9795918367346939,
  6: 0.5303030303030303,
  7: 0.0,
  8: 14.523809523809524,
  9: 17.07622203811102,
  10: 16.334231805929917,
  11: 14.918367346938776,
  12: 1.4124008651766402,
  13: 1.8545081967213115,
  14: 1.2862351868732909,
  15: 1.4340557275541796,
  16: 2.2898674647285167,
  17: 2.7108673978065805,
  18: 14.311827956989248,
  19: 2.871877001921845,
  20: 2.8629370629370627}}

This is an example dataset to match my example (attached figure).

11
  • Hi @cowboykevin05, welcome on SO. Please read about mcve and how-to-ask. In particular it will be great if you can share the output of df.head(20).to_dict() instead a picture. Commented Jun 17, 2020 at 16:58
  • 1
    You should edit your own question as there is a limit for characters on comments. Commented Jun 17, 2020 at 17:13
  • 1
    @rpanai Thanks. I edited the post to answer your question. Note the comment at the bottom. I will read through the links you sent. Commented Jun 17, 2020 at 17:19
  • @rpanai I'm not sure why it's not showing in the post, but the last section of the output of df.head(20).to_dict() is supposed to say 'MONTHLY GOR': {0: 1.1925754060324827, Commented Jun 17, 2020 at 17:32
  • don't worry I fixed it. Commented Jun 17, 2020 at 17:35

1 Answer 1

1

First of all if you want to use plotly.express you might just use

px.line(df,
        x='Monthly Production Date',
        y='MONTHLY GOR',
        color='API/UWI')

the parameter color allows you to have a line for every different 'API/UWI' in your df.

If you want to use plotly.graph_objs instead you can do

import plotly.graph_objs as go
fig = go.Figure()
for ts in df.groupby("API/UWI"):
    ts = ts[1]
    fig.add_trace(go.Scatter(x=ts['Monthly Production Date'],
                             y=ts['MONTHLY GOR']))
fig.show()

In this case you are adding a trace for every "API/UWI".

Sign up to request clarification or add additional context in comments.

3 Comments

Thank you! This worked. The reason I chose to use Plotly Express is because it seemed to work really well for hovering over the data with your mouse and seeing additional specified information. I will still need to figure that out for the plotly.graph_objs as go method. If you know of a good source to find that information, please let me know. I still have difficulties putting my questions to words as a novice. Thanks again!
also, I'm confused as to why there is an index value of 1 at ts = ts[1] . Can you explain that please?
Whenever you take an element ts within a groupby this is a tuple where the first element ts[0]is the key for the groupby while the second one ts[1] is a dataframe. you could check printing out ts, ts[0] and ts[1] at the end of the loop.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.