I have a DataFrame that contains the full chat between a user and a customer service agent. I would like to extract only the user's messages and put each one in its own row, keeping the ticket ID it came from:
import re

import pandas as pd

ticket_id = pd.DataFrame(["1", "2"]).rename(columns={0: "Ticket-ID"})
full_chat = pd.DataFrame([
    "User foo foo foo 12:12 PM, Agent bar bar bar 12:12 PM, User foo foo 12:13 "
    "PM, Agent bar bar 12:13 PM, User foo 12:14 PM, Agent bar 12:14 PM",
    "User bar bar bar 12:12 PM, Agent foo foo foo 12:12 PM, User bar bar 12:13 "
    "PM",
]).rename(columns={0: "Full-Chat"})
merge_chat = pd.merge(ticket_id, full_chat, left_index=True, right_index=True, how="outer")

def _split_row(text):
    cleaned_text = text.lower()
    # Capture the text between "user" and the HH:MM timestamp that follows it.
    lines = re.findall(r"\buser\s+(.*?)\s*\d\d:\d\d", cleaned_text)
    # Prints each user message as a list of words; nothing is returned.
    for line in lines:
        print(line.split())

print(merge_chat["Full-Chat"].apply(_split_row))
I would like the result to look like this:
Ticket-ID Full-Chat
1 foo foo foo
1 foo foo
1 foo
2 bar bar bar
2 bar bar
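
One possible way to get there, as a minimal sketch rather than a definitive answer: have the function return the list of matches instead of printing them, then let `DataFrame.explode` turn each list element into its own row. The helper name `extract_user_messages` and the constant `USER_MSG` are made up for this example. It assumes every message ends with an HH:MM timestamp, that user messages contain no timestamp-like text of their own, and that agent messages never contain the word "user"; real chat logs may break any of these. It also uses `re.IGNORECASE` instead of lowercasing, so the original casing of the messages is kept.

```python
import re

import pandas as pd

# Same sample data as in the question, built directly as one DataFrame.
merge_chat = pd.DataFrame({
    "Ticket-ID": ["1", "2"],
    "Full-Chat": [
        "User foo foo foo 12:12 PM, Agent bar bar bar 12:12 PM, "
        "User foo foo 12:13 PM, Agent bar bar 12:13 PM, "
        "User foo 12:14 PM, Agent bar 12:14 PM",
        "User bar bar bar 12:12 PM, Agent foo foo foo 12:12 PM, "
        "User bar bar 12:13 PM",
    ],
})

# Hypothetical pattern: the text between "User" and the next HH:MM timestamp.
USER_MSG = re.compile(r"\buser\s+(.*?)\s*\d{1,2}:\d{2}", flags=re.IGNORECASE)


def extract_user_messages(text):
    """Return a list with one string per user message found in the chat."""
    return USER_MSG.findall(text)


result = merge_chat.copy()
# Each cell becomes a list of user messages...
result["Full-Chat"] = result["Full-Chat"].apply(extract_user_messages)
# ...and explode() repeats the ticket ID once per list element.
result = result.explode("Full-Chat").reset_index(drop=True)

print(result)
```

On the sample data this gives five rows, three for ticket 1 and two for ticket 2, matching the table above. Note that `explode` requires pandas 0.25 or newer, and a chat with no user messages would end up as a single row with NaN in `Full-Chat`.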