I'm working on a student project for machine learning, using Python and pandas to analyze web data. For this I need to convert the multiple rows of data that belong to one session into a single row. A session has a variable length, and each row consists of 5 values (referer, ip, time, requestAdress, session), which I want to store in columns.
df_row = pd.DataFrame()
length_session = len(df_work[df_work['session'] == session])
for row in df_work[df_work.session == session].itertuples():  # tuple = referer, ip, time, requestAdress, session
    for i in range(1, len(row)):  # row[0] is the index, so the values start at 1
        name = ['referer', 'ip', 'time', 'requestAdress', 'session']
        df_row[str(name[i-1]) + str(length_session)] = row[i]
        print(row[i])
    length_session -= 1  # counter used as the column suffix
print(df_row)
The output is:
https://www.google.de/
x5d80e060.dyn.telefonica.de
2016-07-06 03:41:02
/kuenstlerbedarf/oelfarben/
-8730846718325754703
Empty DataFrame
Columns: [referer28, ip28, time28, requestAdress28, session28, referer27, ip27, time27, requestAdress27, session27, referer26, ip26, time26, requestAdress26, session26, referer25, ip25, time25, requestAdress25, session25, referer24, ip24, time24, requestAdress24, session24, referer23, ip23, time23, requestAdress23, session23, referer22, ip22, time22, requestAdress22, session22, referer21, ip21, time21, requestAdress21, session21, referer20, ip20, time20, requestAdress20, session20, referer19, ip19, time19, requestAdress19, session19, referer18, ip18, time18, requestAdress18, session18, referer17, ip17, time17, requestAdress17, session17, referer16, ip16, time16, requestAdress16, session16, referer15, ip15, time15, requestAdress15, session15, referer14, ip14, time14, requestAdress14, session14, referer13, ip13, time13, requestAdress13, session13, referer12, ip12, time12, requestAdress12, session12, referer11, ip11, time11, requestAdress11, session11, referer10, ip10, time10, requestAdress10, session10, referer9, ip9, time9, requestAdress9, session9, ...]
Index: []
So the dynamic naming of the columns works, but the DataFrame remains empty. All I found regarding this problem was this and this question.
I want to know why the assignment df_row[str(name[i-1]) + str(length_session)] = row[i] does not work, and how I can achieve my goal of filling the dynamically named columns with the given values.
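To narrow the problem down, here is a minimal sketch that seems to reproduce the symptom in isolation and shows one possible way around it. It rests on assumptions: the small df_work below is made-up sample data (the host name and the second request are hypothetical), and wide and counter are names I introduced for illustration. As far as I can tell, assigning a scalar to a new column broadcasts it over the existing index, and an empty DataFrame has no rows to fill. Collecting the values in a plain dict first and building a one-row DataFrame from it avoids that.

```python
import pandas as pd

# Hypothetical sample data standing in for df_work: two requests of one session.
df_work = pd.DataFrame({
    'referer': ['https://www.google.de/', 'https://www.google.de/'],
    'ip': ['host-a', 'host-a'],
    'time': ['2016-07-06 03:41:02', '2016-07-06 03:42:10'],
    'requestAdress': ['/kuenstlerbedarf/oelfarben/', '/kuenstlerbedarf/'],
    'session': [1, 1],
})
session = 1

# 1) The symptom in isolation: the column is created, but the scalar is
#    broadcast over an index of length 0, so no row appears.
empty = pd.DataFrame()
empty['referer1'] = 'https://www.google.de/'
print(len(empty), list(empty.columns))

# 2) One possible workaround: gather all values in a dict keyed by the
#    dynamic column names, then build a single-row DataFrame from it.
name = ['referer', 'ip', 'time', 'requestAdress', 'session']
rows = df_work.loc[df_work['session'] == session, name]
counter = len(rows)  # used as the column suffix, counting down as above
wide = {}
for row in rows.itertuples(index=False):
    for col, value in zip(name, row):
        wide[col + str(counter)] = value
    counter -= 1

df_row = pd.DataFrame([wide])  # a list with one dict gives exactly one row
print(df_row.shape)
```

With this sample data the result has one row and ten columns (five per request). I am not sure this is the most idiomatic approach for many sessions, but it shows that the values do end up in the dynamically named columns.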
A big THANX in advance!
append df_row, all I can see is row
name = ['referer', 'ip', 'time', 'requestAdress', 'session'] out of the loops.