I have thousands of files in a folder, and I'm reading them using glob. What I want to do is extract the first column of each file (the text files have no header row) and store it in a dataframe, as I need to build tables based on calculations across multiple files. Here is sample data from two files:
File 1:
O.U20,99.73000,75538,99.73500,51794,57821,99.73167,1062,4,,,,99.73173
O.Z20,99.70000,58974,99.70500,6748,35815,99.70250,468,3,99.70500,1132,2,99.70048
O.H21,99.79500,4274,99.80000,47043,49961,,,,99.79750,3424,3,99.79236
O.M21,99.81000,48584,99.81500,7062,37456,99.81167,243,3,99.81500,234,2,99.80975
S3.U20,3.000,1132,3.500,69740,3831,,,,3.250,1380,2,3.125
S3.Z20,-9.500,58855,-9.000,27304,3295,-9.250,468,2,-9.000,3730,2,-9.188
File 2:
O.U20,99.73000,75711,99.73500,51794,57821,99.73167,1062,4,,,,99.73173
O.Z20,99.70000,59142,99.70500,6748,35815,99.70250,468,3,99.70500,1132,2,99.70048
O.H21,99.79500,4447,99.80000,47043,49961,,,,99.79750,3424,3,99.79236
O.M21,99.81000,48765,99.81500,7062,37456,99.81167,243,3,99.81500,234,2,99.80975
S3.U20,3.000,1132,3.500,69740,3831,,,,3.250,1380,2,3.125
S3.Z20,-9.500,58855,-9.000,27477,3295,-9.250,468,2,-9.000,3730,2,-9.188
This is the code I'm working on:
import glob
for file in glob.glob("C:/Users/Data/*"):
    print(file)
    myfile = open(file,"r")
    lines = myfile.readlines()
    for line in lines:
        print(line.strip()[0])
This, however, prints the following output (twice, which is another issue, as I want the output printed only once):
O
O
O
O
S
S
I want the output to be
O.U20
O.Z20
O.H21
O.M21
S3.U20
S3.Z20
in a dataframe, so that I can create further tables. I thought of using multiple columns; however, the symbols differ in length: O.U20 has 5 characters while S3.U20 has 6.
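To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes every file is comma-separated with the symbol in the first field, as in the sample data. The function names (`first_column`, `collect_symbols`, `to_dataframe`) and the column names (`source_file`, `symbol`) are hypothetical, chosen only for illustration. Splitting on the first comma avoids the fixed-width problem, since it does not matter how many characters each symbol has.

```python
import glob
import os


def first_column(path):
    """Return the first comma-separated field of each non-empty line in `path`."""
    symbols = []
    with open(path, "r") as handle:
        for line in handle:
            line = line.strip()
            if line:
                # split(",", 1)[0] keeps everything before the first comma,
                # whereas line[0] would keep only the first character.
                symbols.append(line.split(",", 1)[0])
    return symbols


def collect_symbols(pattern):
    """Return (file name, symbol) pairs for every regular file matching `pattern`."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue  # skip sub-folders that the wildcard may match
        for symbol in first_column(path):
            rows.append((os.path.basename(path), symbol))
    return rows


def to_dataframe(rows):
    """Build a two-column DataFrame from the pairs (assumes pandas is installed)."""
    import pandas as pd  # imported lazily so the parsing helpers run without pandas

    return pd.DataFrame(rows, columns=["source_file", "symbol"])
```

With these helpers, `to_dataframe(collect_symbols("C:/Users/Data/*"))` would give one row per line per file. The symbols appear once per file because each file contains the same symbols, so calling `drop_duplicates()` on the `symbol` column should leave each symbol listed only once.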