(NB: I'm very new to Python and this is my first post on Stack Overflow!)

I have a directory containing multiple .csv files, each with a column of Force data and a column of Displacement data. I want to apply the same linear regression plot to each of them without having to change the file name in the .py file. (Ideally I would like each fitted equation as an output, but for now I'm happy with multiple plots!)

So far I have:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats

values = pd.read_csv('RawData_1.csv')

slope, intercept, r_value, p_value, std_err = stats.linregress(
    values['Displacement'], values['Force'])

ax = sns.regplot(x="Displacement", y="Force", data=values, color='b',
                 line_kws={'label': "y={0:.1f}x+{1:.1f}".format(slope, intercept)})

ax.legend()
plt.show()

I've tried implementing lines from other posts but have had no luck. Any help is much appreciated. Thanks :)

1 Answer

You can use glob.glob() to get the matching files in the directory as a list, then use a for loop to create a figure for each file:

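import glob

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy import stats

# Assuming the CSV files are in the current working directory
files = glob.glob('RawData*.csv')
for f in files:
    values = pd.read_csv(f)

    # Fit Force against Displacement for this file
    slope, intercept, r_value, p_value, std_err = stats.linregress(
        values['Displacement'], values['Force'])

    # Plot the data with the fitted line, labelled with its equation
    ax = sns.regplot(x="Displacement", y="Force", data=values, color='b',
                     line_kws={'label': "y={0:.1f}x+{1:.1f}".format(slope, intercept)})

    ax.legend()
    plt.show()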
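
The question also asks for each equation as an output. Here is a minimal sketch of one way to collect them without plotting, assuming every matching file has the Displacement and Force columns described in the question. The function name fit_equations and the dictionary it returns are illustrative choices, not part of the original answer:

```python
import glob

import pandas as pd
from scipy import stats


def fit_equations(pattern="RawData*.csv"):
    """Return {file path: (slope, intercept)} for every CSV matching pattern.

    fit_equations is a hypothetical helper name used for illustration.
    """
    equations = {}
    # sorted() only makes the order of the output predictable
    for path in sorted(glob.glob(pattern)):
        values = pd.read_csv(path)
        result = stats.linregress(values["Displacement"], values["Force"])
        equations[path] = (result.slope, result.intercept)
    return equations


if __name__ == "__main__":
    # Print one fitted equation per file found in the working directory
    for path, (slope, intercept) in fit_equations().items():
        print("{0}: y={1:.1f}x+{2:.1f}".format(path, slope, intercept))
```

Returning the numbers rather than printing them inside the loop keeps the fit separate from the output, so the same dictionary could later be written to a file or used to label the plots.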

4 Comments

It would be better to use glob.glob("*.csv") to avoid non-CSV files.
I didn't know glob before, thanks for the suggestion - it does simplify things
That works, thank you! Can I ask what the "f" means in "for f in files" and "...read_csv(f)"?
When you iterate over a list, you name the element you're currently working on. In our case, Python first sets f = files[0] and runs the loop's body with f as the first element, then sets f = files[1] and runs the body again with f as the second, and so on. You can replace f with any name (e.g. for element in files); just don't forget to replace every occurrence of f in the loop's body. You can read more about for loops here: wiki.python.org/moin/ForLoop
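
To make the loop explanation concrete, here is a small sketch that walks the same list twice, once by naming each element and once with explicit indices. The file names are hypothetical and no files are read:

```python
# Hypothetical file names, used only to illustrate the loop
files = ["RawData_1.csv", "RawData_2.csv", "RawData_3.csv"]

# Version 1: name each element directly, as in "for f in files"
visited_by_name = []
for f in files:
    visited_by_name.append(f)

# Version 2: the same walk written out with explicit indices
visited_by_index = []
for i in range(len(files)):
    element = files[i]
    visited_by_index.append(element)

# Both loops visit the same elements in the same order
print(visited_by_name == visited_by_index)  # prints True
```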
