0

I'm new to Python and I'm practicing CSV data operations. I have a 100 × 4 dataset where the i-th row corresponds to the i-th example (x_i, y_i). Columns 1, 2, and 3 hold the input variables (x_1, x_2, 1) and the 4th column is the output y. I then pick constants (a, b), say (6, 5) here.

Now I want to draw a graph with a*x_1 + b*x_2 on the x-axis and y on the y-axis.

Here is part of the data:

"x1","x2","1","y"
-0.626453810742332,-0.620366677224124,1,0.28239638205273
0.183643324222082,0.0421158731442352,1,1.73290072129656
-0.835628612410047,-0.910921648552446,1,-0.293950695808836

And here is the code I'm trying to complete:

import numpy as np
import matplotlib.pyplot as plt

if __name__ == '__main__':

    x1 = []
    x2 = []
    y = []
    a = 6
    b = 5

    with open('data.csv', 'r') as f:
        lines = f.readlines()
        tmp_x1 = np.array(lines)[:,0]
        tmp_x2 = np.array(lines)[:,1]
        tmp_y = np.array(lines)[:,3]
        x1.append(tmp_x1)
        x2.append(tmp_x2)
        y.append(tmp_y)

    ax1 = list()
    bx2 = list()

    for i in range(100):
        ax1.append(a * x1)
        bx2.append(b * x2)

    plt.figure()
    plt.xlabel('a*x1 + b*x2')
    plt.ylabel('y')
    plt.plot(ax1 + bx2, y)

This code gives me the following error:

IndexError: too many indices for array

Can someone help me to solve this?

Edit: this is the graph I get [image: graph], using the following code:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('data.csv')

a,b = 6,5

df['x'] = df['x1']*a + df['x2']*b

plt.figure()
plt.plot(df['x'], df['y'])
plt.show()

2 Answers

1

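f.readlines() reads the whole file and returns a list with one string per line. np.array(lines) is therefore a one-dimensional array of strings whose length is the number of lines in the file, which is why indexing it with [:,0] raises "IndexError: too many indices for array". You can use the NumPy function genfromtxt to load the data directly into a two-dimensional NumPy array: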

lines = np.genfromtxt('data.csv', delimiter=',', skip_header=1)
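
As a minimal sketch of the difference, the snippet below uses the three sample rows from the question held in an in-memory buffer instead of data.csv (an assumption made only so the snippet runs on its own). It shows that the readlines() result is one-dimensional, while genfromtxt returns a rows-by-columns array that can be sliced by column.

```python
import io
import numpy as np

# Three sample rows from the question, kept in memory so the snippet is
# self-contained; with the real file, pass 'data.csv' to genfromtxt instead.
csv_text = '''"x1","x2","1","y"
-0.626453810742332,-0.620366677224124,1,0.28239638205273
0.183643324222082,0.0421158731442352,1,1.73290072129656
-0.835628612410047,-0.910921648552446,1,-0.293950695808836
'''

# readlines() gives a list of strings, so the array is 1-D and [:, 0] fails.
lines = io.StringIO(csv_text).readlines()
as_strings = np.array(lines)
print(as_strings.ndim)   # 1

# genfromtxt parses the numbers into a 2-D float array (rows x columns).
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1)
x1, x2, y = data[:, 0], data[:, 1], data[:, 3]

a, b = 6, 5
x = a * x1 + b * x2      # element-wise arithmetic, no loop needed
print(data.shape)        # (3, 4)
```

The arrays x and y can then be passed to matplotlib, for example with plt.scatter(x, y).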

0

I recommend using pandas to handle tabular data:

import pandas as pd

df = pd.read_csv('data.csv')

a,b = 6,5

df['x'] = df['x1']*a + df['x2']*b

df.plot(x='x',y='y')

Output:

[image: plot output]

Or in the case you want to plot one line for each input identified by field 1:

# continue the code above, replacing the last line with:
import matplotlib.pyplot as plt  # needed for plt.subplots()

fig, ax = plt.subplots()
for k, d in df.groupby('1'):
    d.plot(x='x', y='y', ax=ax, label=k)

Output:

[image: plot output]

1 Comment

Thank you for the answer. I ran this code but no graph appeared, so I rewrote the code using matplotlib, and the plot then looked like this (added to the post). What am I missing?
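
Two things are probably going on here, offered as a sketch rather than a definitive diagnosis. When the code is run as a script, the plot window typically only appears once plt.show() is called. Also, plt.plot joins the points in row order, so with unsorted x values the line zigzags back and forth across the figure. Sorting by x before drawing a line, or drawing the points as a scatter plot, avoids that. The inline sample below stands in for data.csv (an assumption so the snippet is self-contained), and the name df_sorted is introduced only for illustration.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt

# Three sample rows from the question; with the real file use
# pd.read_csv('data.csv') instead.
csv_text = '''"x1","x2","1","y"
-0.626453810742332,-0.620366677224124,1,0.28239638205273
0.183643324222082,0.0421158731442352,1,1.73290072129656
-0.835628612410047,-0.910921648552446,1,-0.293950695808836
'''

df = pd.read_csv(io.StringIO(csv_text))
a, b = 6, 5
df['x'] = df['x1'] * a + df['x2'] * b

# plt.plot connects points in row order, so sort by x before drawing a line.
df_sorted = df.sort_values('x')

if __name__ == '__main__':
    fig, ax = plt.subplots()
    ax.scatter(df['x'], df['y'], label='points')   # row order does not matter
    ax.plot(df_sorted['x'], df_sorted['y'], label='line through sorted points')
    ax.set_xlabel('a*x1 + b*x2')
    ax.set_ylabel('y')
    ax.legend()
    plt.show()   # makes the figure window appear when run as a script
```

If y is noisy, the scatter plot alone is usually the more readable choice, since a line through sorted points still jumps up and down.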
