I would like to optimize the code below. It works, but I would welcome suggestions for making it more concise and efficient.
import os
import glob
import pandas as pd
import numpy as np
files = glob.glob(os.path.join('data','*.csv'))
dfs = []
for file in files:
    # Use the part of the filename before the first underscore as the label
    variable = os.path.basename(file).split("_")[0]
    df = pd.read_csv(file)
    # Tag every row with the label taken from its source file
    df['variable'] = variable
    dfs.append(df)
finalDf = pd.concat(dfs, ignore_index = True)
Any ideas? Thank you in advance.
I am using pandas 0.21.1 and Python 3.6.5.
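
For reference, a more compact variant of the same loop might look like the sketch below. It is only one possible shape, not a definitive answer: the function name `load_csvs` and its `data_dir` parameter are hypothetical names introduced for illustration, and the `sorted()` call is an added assumption to make the row order reproducible (the original relies on whatever order `glob` returns). It uses `DataFrame.assign` and passes a generator to `pd.concat`, so no intermediate list or temporary variables are needed.

```python
import glob
import os

import pandas as pd


def load_csvs(data_dir="data"):
    """Read every CSV in data_dir and tag its rows with the filename prefix.

    load_csvs and data_dir are hypothetical names used for this sketch.
    """
    # sorted() is an assumption added here for a deterministic row order
    files = sorted(glob.glob(os.path.join(data_dir, "*.csv")))
    frames = (
        # assign() returns a copy of the frame with the extra 'variable' column
        pd.read_csv(path).assign(variable=os.path.basename(path).split("_")[0])
        for path in files
    )
    # A single concat at the end, as in the original loop
    return pd.concat(frames, ignore_index=True)
```

Calling `finalDf = load_csvs('data')` should give the same columns as the loop version. As with the original, `pd.concat` raises a `ValueError` if no files match. This mostly saves lines rather than time, since reading the files is likely the dominant cost and the original already concatenates only once.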