I'm working with semi-structured wiki data from a project I inherited from a colleague, and I'm having trouble tidying it. It has a ton of issues, but one of the first things I need to do is create sensible column names.
Suppose I have a data frame like this:
df <- data.frame(x1 = "ID: 4",
                 x2 = "Start Date: 1946/11/13",
                 x3 = "End Date: 1946/12/31")
x1     x2                      x3
ID: 4  Start Date: 1946/11/13  End Date: 1946/12/31
I'd like to extract the text before the colon in each value, use it as the column name (with spaces replaced by underscores), and keep only the text after the colon as the value, so that my data frame looks like this:
ID  Start_Date  End_Date
4   1946/11/13  1946/12/31
So far, I've learned that I can use str_extract from the stringr package to pull out the strings of interest, but I'm stumbling over how to use the resulting list to rename the columns.
library(tidyverse)
# Returns a list holding the underscored label for each column
map(df, function(x) str_extract(x, "[^:]+") %>% str_replace(" ", "_"))
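For reference, here is a minimal sketch of one direction this could go. It is not a verified solution: it assumes every column carries the same label in each row (so the first row is enough to derive the names) and that the label never contains a colon. It uses base R's sub and gsub instead of stringr so that it runs without extra packages, and the name labels is just a placeholder.

```r
df <- data.frame(x1 = "ID: 4",
                 x2 = "Start Date: 1946/11/13",
                 x3 = "End Date: 1946/12/31",
                 stringsAsFactors = FALSE)

# Label = text before the first colon in the first row of each column,
# with spaces replaced by underscores (assumes the label is the same in every row)
labels <- vapply(df,
                 function(x) gsub(" ", "_", sub(":.*$", "", x[1])),
                 character(1))

# Drop the "Label: " prefix from every value; df[] keeps the data frame shape
df[] <- lapply(df, function(x) sub("^[^:]*:[[:space:]]*", "", x))

# Apply the extracted labels as column names
names(df) <- unname(labels)
```

The values stay as character strings here, so the dates would still need converting (for example with as.Date) in a later step.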
Thanks for checking out this question :)