I have a text file which contains m rows like the following:
0.4698537878,0.1361006627,0.2400000000,0.7209302326,0.0054816275,0.0116666667,1
0.5146649986,0.0449680289,0.4696969697,0.5596330275,0.0017155500,0.0033333333,0
0.4830107706,0.0684999306,0.3437500000,0.5600000000,0.0056351257,0.0116666667,0
0.4458490073,0.1175445834,0.2307692308,0.6212121212,0.0089169801,0.0200000000,0
I tried to read the file and copy it into a matrix like in the following code:
import string
file = open("datasets/train.txt",encoding='utf8')
for line in file.readlines():
    tmp = line.strip()
    tmp = tmp.split(",")
    idx = np.vstack(tmp)
    idy = np.hstack(tmp[12])
matrix = idx
I want to read the file as is into the matrix. For my sample data, the matrix shape should be (4,6) and idy should be (4,1), where idy holds the last column, the labels.
Instead, it stacked only the last line of the file vertically, like this:
0.4458490073,
0.1175445834,
0.2307692308,
0.6212121212,
0.0089169801,
0.0200000000,
0
Any help?
np defined?
idx = np.vstack(tmp) doesn't concatenate tmp vertically onto an existing array idx; it just turns tmp into a vertical array, then replaces idx with that. You could fix your code by using idx = [] before the loop, then idx.append(tmp) inside the loop, then matrix = np.array(idx) after the loop completes. Then use @jp_data_analysis's technique to split the matrix into id and data parts.
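For what it's worth, here is a minimal sketch of the list-then-convert fix described above. A few things in it are my assumptions rather than anything from the question: the sample rows are fed in through io.StringIO instead of datasets/train.txt so the snippet runs on its own, the values are converted to float (appending the raw split strings would give a string array), and the final split uses plain slicing, which may or may not match the technique referenced above.

```python
import io
import numpy as np

# Assumption: the four sample rows from the question stand in for
# datasets/train.txt so the snippet is self-contained.
sample = io.StringIO(
    "0.4698537878,0.1361006627,0.2400000000,0.7209302326,0.0054816275,0.0116666667,1\n"
    "0.5146649986,0.0449680289,0.4696969697,0.5596330275,0.0017155500,0.0033333333,0\n"
    "0.4830107706,0.0684999306,0.3437500000,0.5600000000,0.0056351257,0.0116666667,0\n"
    "0.4458490073,0.1175445834,0.2307692308,0.6212121212,0.0089169801,0.0200000000,0\n"
)

rows = []                      # collect one list of floats per line
for line in sample:
    line = line.strip()
    if not line:               # skip blank lines
        continue
    rows.append([float(value) for value in line.split(",")])

data = np.array(rows)          # convert once, after the loop: shape (4, 7)

# Assumption: plain slicing is used to separate features from labels.
matrix = data[:, :-1]              # first six columns, shape (4, 6)
idy = data[:, -1:].astype(int)     # last column kept 2-D, shape (4, 1)

print(matrix.shape, idy.shape)
```

With a real file, replacing sample with open("datasets/train.txt", encoding="utf8") should behave the same way. np.loadtxt with delimiter="," would probably also read a file like this directly, if you'd rather skip the loop.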