
I am using matplotlib to graph my results from a .dat file.

The data is as follows

1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75

I want to plot a year's worth of data (Jan to Dec) in the proper sequence and have the labels show the months instead of the full date.

Here is my code:

import matplotlib.pyplot as plt
import csv

x = []
y = []

with open('Claims.dat','r') as csvfile:
    #bar = csv.reader(csvfile, delimiter=',')
    plot = csv.reader(csvfile, delimiter=',')

    for row in plot:
        x.append(str(row[1]))
        y.append(str(row[6]))

plt.plot(x,y, label='Travel Claim Totals!', color='red', marker="o")
plt.xlabel('Months', color="red", size='large')

plt.ylabel('Totals', color="red", size='large')
plt.title('Claims Data:   Team Bobby\n Second Place is the First Looser', color='Blue', weight='bold', size='large')

plt.xticks(rotation=45, horizontalalignment='right', size='small')
plt.yticks(weight='bold', size='small', rotation=45)

plt.legend()
plt.subplots_adjust(left=0.2, bottom=0.40, right=0.94, top=0.90, wspace=0.2, hspace=0)
plt.show()



2 Answers


I think the easiest way is to re-sort the data by date, which you can parse with the datetime module. Here is a minimal working example based on your data:

import datetime

def isfloat(value: str):
  try:
    float(value)
    return True
  except ValueError:
    return False

def isdatetime(value: str):
  try:
    datetime.datetime.fromisoformat(value)
    return True
  except ValueError:
    return False

data = r"""1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75"""

data = data.splitlines()  # split the raw text into a list with one string per record

for idx in range(len(data)):
  data[idx] = data[idx].split(', ')
  for jdx in range(len(data[idx])):
    if data[idx][jdx].isnumeric():    # Is it an integer?
      value = int(data[idx][jdx])
    elif isfloat(data[idx][jdx]):     # Is it a float?
      value = float(data[idx][jdx])
    elif isdatetime(data[idx][jdx]):  # Is it a date?
      value = datetime.datetime.fromisoformat(data[idx][jdx])
    else:
      value = data[idx][jdx]
    data[idx][jdx] = value

data.sort(key=lambda x: x[1])

You can also sort on a single component of the date, such as the month:

data.sort(key=lambda x: x[1].month)
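
To illustrate the difference between the two sort keys, here is a small sketch. The three dates are made up for the example and are not from the question's file. Sorting on the full datetime gives chronological order, while sorting on `.month` gives calendar order (Jan to Dec) regardless of year; `strftime('%b')` then yields short month names that could serve as labels.

```python
import datetime

# Hypothetical sample dates spanning two years (not from the original data file).
dates = [
    datetime.datetime(2021, 7, 17),
    datetime.datetime(2020, 12, 13),
    datetime.datetime(2021, 1, 5),
]

# Full chronological order: datetime objects compare directly.
chronological = sorted(dates)

# Calendar order (Jan to Dec), ignoring the year; ties keep their input order.
by_month = sorted(dates, key=lambda d: d.month)

# Short month names suitable for axis labels.
labels = [d.strftime('%b') for d in by_month]
print(labels)
```

If the file ever covers more than one year, sorting on `.month` alone will interleave the years, so the full-date key is probably the safer default.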

Note: You might not need all the logic in the for-loop. The csv module handles the splitting for you, but it returns every field as a string, so the conversion to numbers and dates still has to be done explicitly.
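
As a rough sketch of the csv route, the loop can shrink to a few lines. `csv.reader` yields each field as a string, so the date and total are converted by hand here. This assumes the same column layout as the question; `io.StringIO` stands in for `open('Claims.dat')` so the snippet runs on its own, and only two of the rows are included.

```python
import csv
import datetime
import io

# Two rows copied from the question; io.StringIO stands in for the real file.
raw = """1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
"""

rows = []
# skipinitialspace=True drops the blank that follows each comma.
for row in csv.reader(io.StringIO(raw), skipinitialspace=True):
    when = datetime.datetime.fromisoformat(row[1])  # string -> datetime
    total = float(row[6])                           # string -> float
    rows.append((when, total))

rows.sort(key=lambda r: r[0])  # oldest claim first

x = [when for when, _ in rows]
y = [total for _, total in rows]
print(x, y)
```

Passing real datetime and float values to `plt.plot` (rather than strings) lets matplotlib treat both axes as continuous instead of as category labels.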


Imports and DataFrame

import pandas as pd
import matplotlib.dates as mdates  # used to format the x-axis
import matplotlib.pyplot as plt

# read in the data
df = pd.read_csv('Claims.dat', header=None)

# convert the column to dates so each point is placed at its true position on the time axis; errors='coerce' turns values that cannot be parsed into NaT
df[1] = pd.to_datetime(df[1], errors='coerce').dt.date

# display(df)
      0           1              2                 3      4      5        6
0  1145  2021-07-17            bob              rome  12.75   65.0   162.75
1  1146  2021-07-12   billy larkin             italy  93.75  325.0  1043.75
2   114  2021-07-28       beatrice              rome   1.00   10.0   100.00
3    29  2021-07-25          Colin   italy the third  10.00   10.0    50.00
4     5  2021-07-22       Veronica            canada  10.00  100.0  1000.00
5  1149  2020-12-13   Billy Larkin              1123  12.75   65.0   162.75
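
Converting the column fixes where each point sits on the x-axis, but a line plot still connects the points in row order. A possible extra step, sketched below under a few assumptions, is to drop rows whose date could not be parsed and sort by the date column before plotting. The third row is a made-up bad record added to show the effect of `errors='coerce'`, and `io.StringIO` replaces the real file so the snippet is self-contained.

```python
import io

import pandas as pd

raw = io.StringIO(
    "1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75\n"
    "1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75\n"
    "9999, not a date, nobody, nowhere, 1, 1, 1\n"  # hypothetical bad row
)

# skipinitialspace=True strips the blank after each comma.
df = pd.read_csv(raw, header=None, skipinitialspace=True)

# Unparseable dates become NaT instead of raising an error.
df[1] = pd.to_datetime(df[1], errors='coerce')

# Drop the NaT rows, then order the remaining rows chronologically.
df = df.dropna(subset=[1]).sort_values(1)
print(df[[1, 6]])
```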

Plotting the DataFrame

# plot the dataframe, which uses matplotlib as the backend
ax = df.plot(x=1, y=6, marker='.', color='r', figsize=(10, 7), label='Totals')

# format title and labels
ax.set_xlabel('Months', color="red", size='large')
ax.set_ylabel('Totals', color="red", size='large')
ax.set_title('Claims Data:   Team Bobby\n Second Place is the First Looser', color='Blue', weight='bold', size='large')

# format ticks
xt = plt.xticks(rotation=45, horizontalalignment='right', size='small')
yt = plt.yticks(weight='bold', size='small', rotation=45)

# format the dates on the xaxis
myFmt = mdates.DateFormatter('%b')
ax.xaxis.set_major_formatter(myFmt)
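
`DateFormatter('%b')` only controls the text of each tick label; where the ticks fall is decided by the locator. For a full year of data, one option is to pair it with `mdates.MonthLocator()` so there is one tick per month. The sketch below uses invented monthly totals, and the `Agg` backend is an assumption made only so it runs without a display.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical monthly totals, one point per month of 2021.
dates = [datetime.datetime(2021, m, 15) for m in range(1, 13)]
totals = [100 + 10 * m for m in range(1, 13)]

fig, ax = plt.subplots()
ax.plot(dates, totals, marker='o', color='red')

# One major tick at the start of every month, labelled Jan, Feb, ...
ax.xaxis.set_major_locator(mdates.MonthLocator())
formatter = mdates.DateFormatter('%b')
ax.xaxis.set_major_formatter(formatter)

fig.canvas.draw()  # render once; in an interactive session call plt.show() instead
```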
