I have written a Python class that parses a fixed-width text input file and writes it out in CSV format. Here is the class:
import csv
import os
from os import listdir

class MyClass:
    def __init__(self, filename, colsList):
        self.filename = filename
        self.colsList = colsList

    def textcsvconverter(self, filename, colsList):
        print("colsList:", colsList)
        self.cols = colsList
        outputfilename = os.path.basename(filename)
        print("outputfilename:", outputfilename)
        fname_out = outputfilename + '.csv'
        with open(filename) as fin, open(fname_out, 'wt') as fout:
            writer = csv.writer(fout, delimiter=",", lineterminator="\n")
            for line in fin:
                line = line.rstrip()  # remove the '\n' and other trailing whitespace
                # slice each (start, end) column range; self.cols, not the undefined name cols
                data = [line[c[0]:c[1]] for c in self.cols]
                writer.writerow(data)
        return fname_out
I have imported this class into my PySpark code and am trying to call its method as shown below:
myobjectx = MyClass()
colsListA = [(0,1), (1,23), (23,31), (31,35),(35,41)]
outputfile1=myobjectx.textcsvconverter(finalpath1,colsListA)
It gives me the following error message:
TypeError: __init__() takes exactly 3 arguments (1 given)
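filename and colsList are mandatory arguments of __init__, so you must supply their values when instantiating the object (i.e., myobjectx = MyClass(your_filename, your_colsList)). Calling MyClass() with no arguments passes only self, which is why the error reports "1 given".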
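As a minimal sketch of the fix, the class below takes both values in the constructor and the method reuses them from self instead of asking for them a second time. This is one possible arrangement, not the only one; the out_dir parameter is a hypothetical addition for this example, and finalpath1 in the usage comment is assumed to be the input path from the question.

```python
import csv
import os


class MyClass:
    def __init__(self, filename, colsList):
        # Both arguments are required, so MyClass() with no arguments raises TypeError.
        self.filename = filename
        self.colsList = colsList

    def textcsvconverter(self, out_dir="."):
        # out_dir is a hypothetical parameter (not in the original code) that
        # controls where the CSV file is written.
        fname_out = os.path.join(out_dir, os.path.basename(self.filename) + ".csv")
        with open(self.filename) as fin, open(fname_out, "wt", newline="") as fout:
            writer = csv.writer(fout, delimiter=",", lineterminator="\n")
            for line in fin:
                line = line.rstrip("\n")  # drop only the trailing newline
                # Slice each (start, end) column range out of the fixed-width line.
                writer.writerow([line[start:end] for start, end in self.colsList])
        return fname_out


# Usage, assuming finalpath1 holds the path to the input text file:
#   colsListA = [(0, 1), (1, 23), (23, 31), (31, 35), (35, 41)]
#   myobjectx = MyClass(finalpath1, colsListA)
#   outputfile1 = myobjectx.textcsvconverter()
```

If you would rather keep passing filename and colsList to the method as in the original, the alternative is to drop those two parameters from __init__ so that MyClass() can be called without arguments.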