I have inherited a dataset of website logs that adds a new series of columns for each page visited. For example, if someone went to 2 pages on our website, we'd have something like: visit_id, url_1, visit_datetime_1, url_2, visit_datetime_2. The problem is that some people visit just one page, and some visit 14. I want to simplify this. See below for my current format and desired output. I guess I just don't understand how to go through each column when the number of fields is not always consistent (but the column names WILL be consistent: visit_id is a unique identifier, followed by url_x and visit_datetime_x pairs). I'm stumped.

Just to be clear below, visit_id 1000 visited 3 pages, 2000 visited 1 page, and 3000 visited 2 pages.

[Image: current format and desired output]

I've just never done anything like this before in Pandas and I'm just at a roadblock. I've gotten this far, which isn't far, but at least shows I'm trying. All help is appreciated.


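import pandas as pd

visit_ids = []
urls = []
visit_datetimes = []

# read_excel already returns a DataFrame, so no pd.DataFrame() wrapper is needed
df = pd.read_excel('data.xlsx', engine='openpyxl')

# DataFrame.iteritems() was removed in pandas 2.0; items() yields (name, Series) pairs
for colname, values in df.items():
    # do something to add to list
    pass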
  • Is it possible to share the current format as text? Commented Feb 4, 2021 at 12:20

1 Answer

You can split each column name on its last _ so that the trailing number becomes a level of a MultiIndex, then reshape with DataFrame.stack:

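import pandas as pd

df = pd.read_excel('data.xlsx', engine='openpyxl')

# Keep visit_id out of the reshape by moving it into the index
df1 = df.set_index('visit_id')
# Split each name on its last underscore: 'url_1' -> ('url', '1')
df1.columns = df1.columns.str.rsplit('_', n=1, expand=True)

# Move the number level into the rows, leaving one row per visit and page
df1 = df1.stack().reset_index()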
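To see what this reshape does without the Excel file, here is a minimal sketch on a small in-memory frame. The URLs and timestamps are made up for illustration; only the column naming pattern and the visit counts (1000 with 3 pages, 2000 with 1, 3000 with 2) come from the question. The "page" level name and the explicit dropna(how="all") are additions of mine: older pandas versions drop the all-empty rows inside stack, while newer versions may keep them, so dropping them explicitly should give the same result either way.

```python
import pandas as pd

# Hypothetical sample data in the wide layout described in the question
wide = pd.DataFrame({
    "visit_id": [1000, 2000, 3000],
    "url_1": ["/home", "/home", "/pricing"],
    "visit_datetime_1": ["2021-02-01 09:00", "2021-02-01 10:00", "2021-02-01 11:00"],
    "url_2": ["/pricing", None, "/contact"],
    "visit_datetime_2": ["2021-02-01 09:05", None, "2021-02-01 11:03"],
    "url_3": ["/contact", None, None],
    "visit_datetime_3": ["2021-02-01 09:07", None, None],
})

df1 = wide.set_index("visit_id")

# 'url_1' -> ('url', '1'), 'visit_datetime_1' -> ('visit_datetime', '1'), ...
df1.columns = df1.columns.str.rsplit("_", n=1, expand=True)
# Name the number level so it becomes a "page" column after reset_index
df1.columns.names = [None, "page"]

# Move the page level into the rows, then drop pages that were never visited
long = df1.stack().dropna(how="all").reset_index()

# The split produces strings, so convert the page number to an integer
long["page"] = long["page"].astype(int)

print(long)
```

The result has one row per visited page with the columns visit_id, page, url and visit_datetime, so the number of columns no longer depends on how many pages the longest visit contained.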
1 Comment

Thank you, this worked. Maybe it was much more simple than I expected. I ran this line by line and I think I understand now. Thank you! As soon as it will let me, I will check this off as answered.
