Python error: generating a scatter plot using matplotlib

Question

I am a Python newbie struggling to load a CSV file and plot it with matplotlib.pyplot. I would like to see the relationship between hour (how many hours people spent playing a video game) and level (game level), and then draw a scatter plot with different colors for female (1) and male (0). So my x would be 'hour' and my y would be 'level'.

My CSV data looks like this:

          hour gender level
0            8    1   20.00
1            9    1   24.95
2           12    0   10.67
3           12    0   18.00
4           12    0   17.50
5           13    0   13.07
6           10    0   14.45
...
...
499         12    1  19.47
500         16    0  13.28

Here's my code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df=pd.read_csv('data.csv')
plt.plot(x,y, lavel='some relationship')
plt.title("Some relationship")
plt.xlabel('hour')
plt.ylabel('level')
plt.plot[gender(gender=1), '-b', label=female]
plt.plot[gender(gender=0), 'gD', label=male]
plt.axs()
plt.show()

I would like to draw the following graph, with one line for male and one for female.

y=level|           @----->male
       | @
       | *         *----->female
       |________________ x=hour

However, I am not sure how to solve this problem. I kept getting an error NameError: name 'hour' is not defined.

You probably forgot quotes around hour, so Python is looking for a variable named hour instead of interpreting it as a string.
– maxbellec, Commented Dec 4, 2017 at 11:39
You need to mind the syntax of Python: [ is different from (.
– ImportanceOfBeingErnest, Commented Dec 4, 2017 at 11:42
@ImportanceOfBeingErnest Yes! Good thing to know. Thank you.
– user, Commented Dec 5, 2017 at 9:49
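The two points raised in the comments can be illustrated with a short sketch. It assumes the sample rows shown in the question; the variable names `bare_name_error`, `female` and `male` are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question (the real file has about 500 rows).
df = pd.DataFrame({
    "hour": [8, 9, 12, 12, 12, 13, 10],
    "gender": [1, 1, 0, 0, 0, 0, 0],
    "level": [20.00, 24.95, 10.67, 18.00, 17.50, 13.07, 14.45],
})

# A bare name such as hour is looked up as a Python variable,
# which does not exist here, so Python raises NameError.
try:
    hour
    bare_name_error = None
except NameError as exc:
    bare_name_error = str(exc)

# Columns are selected with a quoted string inside square brackets.
x = df["hour"]
y = df["level"]

# Rows are filtered with a boolean mask; comparison uses ==, not =.
female = df[df["gender"] == 1]
male = df[df["gender"] == 0]

# Functions are called with parentheses, e.g. plt.plot(x, y), never plt.plot[...].
print(bare_name_error)
print(len(female), len(male))
```

Square brackets index into an object, while parentheses call a function, which is why `plt.plot[...]` in the question cannot work.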

erocoar · Accepted Answer · 2017-12-04 11:41:40Z

2

You could do it this way:

import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question; for the full data use pd.read_csv('data.csv')
df = pd.DataFrame(data={"hour": [8,9,12,12,12,13,10],
                        "gender": [1,1,0,0,0,0,0],
                        "level": [20, 24.95, 10.67, 18, 17.5, 13.07, 14.45]})

# Sort by hour so that each line is drawn from left to right
df = df.sort_values("hour", ascending=True)

fig = plt.figure(dpi=80)
ax = fig.add_subplot(111, aspect='equal')

# In the question, gender 1 is female and gender 0 is male
ax.plot(df.hour[df.gender==1], df.level[df.gender==1], c="red", label="female")
ax.plot(df.hour[df.gender==0], df.level[df.gender==0], c="blue", label="male")
ax.set_xlabel('hour')
ax.set_ylabel('level')
ax.legend()  # show the labels given above
plt.show()
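
Since the question asks for a scatter plot, a variant built on `ax.scatter` may be closer to what was wanted. This is only a minimal sketch: the helper name `scatter_by_gender` and the colors are illustrative choices, and the coding 1 = female, 0 = male is taken from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_by_gender(df):
    """Draw level against hour, one color per gender code (hypothetical helper)."""
    fig, ax = plt.subplots()
    # Coding taken from the question: 1 = female, 0 = male.
    styles = {1: ("female", "red"), 0: ("male", "blue")}
    for code, (name, color) in styles.items():
        subset = df[df["gender"] == code]
        ax.scatter(subset["hour"], subset["level"], c=color, label=name)
    ax.set_xlabel("hour")
    ax.set_ylabel("level")
    ax.legend()
    return ax


# Sample rows from the question; with the full file, pass pd.read_csv("data.csv") instead.
sample = pd.DataFrame({
    "hour": [8, 9, 12, 12, 12, 13, 10],
    "gender": [1, 1, 0, 0, 0, 0, 0],
    "level": [20.00, 24.95, 10.67, 18.00, 17.50, 13.07, 14.45],
})
ax = scatter_by_gender(sample)
```

Calling `plt.show()` afterwards displays the figure. Unlike the line plot, a scatter plot does not need the rows sorted by hour.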


7 Comments

user Over a year ago

Thanks, but the number of each variable would be 500 not just 7. I would like to load the variable from my csv file.

erocoar Over a year ago

@user What is the problem with loading it and then doing all the steps in my answer? You just have to replace df = pd.DataFrame(data... with df=pd.read_csv('data.csv')
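
The swap described in this comment could look roughly as follows. The layout of the real data.csv is not shown in the thread, so this sketch assumes a comma-separated file with a header row, and uses a small in-memory string as a stand-in so that it runs without the file.

```python
import io

import pandas as pd

# Stand-in for data.csv: ASSUMES comma-separated values with a header row.
csv_text = """hour,gender,level
8,1,20.00
9,1,24.95
12,0,10.67
"""

# With the real file this line would be: df = pd.read_csv("data.csv")
df = pd.read_csv(io.StringIO(csv_text))

# Check what was actually loaded before plotting.
print(df.columns.tolist())
print(df.dtypes)
```

If the printed column names match hour, gender and level exactly, the plotting steps from the answer can be applied unchanged.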

user Over a year ago

Yes, I did it, but the error message said "NameError: name 'hour' is not defined."

user Over a year ago

Traceback (most recent call last):
ax.plot(df.hour[df.gender==1], df.level[df.gender==1], c="red", label="male")
in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'hour'

user Over a year ago

as you said, I just replaced df=pd.DataFrame(data...) with df=pd.read_csv("data.csv")
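
An AttributeError like the one above usually means the loaded DataFrame has no column named exactly hour. The thread does not show the real header, so the following is only a sketch of one possible cause: the file contents below are hypothetical, with header names padded by spaces.

```python
import io

import pandas as pd

# Hypothetical file contents: header names padded with spaces.
csv_text = """ hour , gender , level
8,1,20.00
9,1,24.95
12,0,10.67
"""

df = pd.read_csv(io.StringIO(csv_text))

# Print the names exactly as pandas read them; repr-style output makes
# stray spaces or unexpected capitalisation visible.
print(df.columns.tolist())
had_hour_before = "hour" in df.columns

# Strip surrounding whitespace and lower-case the names so df["hour"] works.
df.columns = df.columns.str.strip().str.lower()
had_hour_after = "hour" in df.columns

print(had_hour_before, had_hour_after)
```

If the printed list instead shows a single long column name, the file may be whitespace-separated rather than comma-separated; passing `sep=r"\s+"` to `pd.read_csv` is one thing to try in that case.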
