I have a data frame like this:

<table>
  <tr><td>Task</td><td>UserStory</td></tr>
  <tr><td>123</td><td>abc</td></tr>
  <tr><td>4321</td><td>abc</td></tr>
  <tr><td>8763</td><td>abc</td></tr>
  <tr><td>9087</td><td>efg</td></tr>
  <tr><td>0652</td><td>efg</td></tr>
  <tr><td>7609</td><td>hij</td></tr>
</table>

I have collected the unique values of UserStory ("abc", "efg", "hij") into a vector. Let's say I've named this vector "UserStories":

UserStories <- c("abc", "efg", "hij")

I would like to create a vector of matching Tasks for each value in the first vector, with the eventual goal of creating a second data frame with this structure:

<table>
  <tr><td>abc</td><td>1234</td><td>4321</td><td>8763</td></tr>
  <tr><td>efg</td><td>9087</td><td>0652</td><td>NA</td></tr>
  <tr><td>hij</td><td>7609</td><td>NA</td><td>NA</td></tr>
</table>

I'm thinking of rbind'ing the vectors into a second data frame once I've padded the missing values with NA:

abc, 1234, 4321, 8763
efg, 9087, 0652, NA
hij, 7609, NA, NA

I've been googling all afternoon without finding an approach.

I'd like to pass the UserStories vector to a function which would extract a series of vectors for all of the tasks associated with each UserStory.

Thanks in advance to any takers.

1 Answer

There are probably neater ways to do this with packages, but I always try base R first:

# Store Task as character so leading zeros (e.g. "0652") are preserved
df <- data.frame(Task = c("123", "4321", "8763", "9087", "0652", "7609"),
                 UserStory = c("abc", "abc", "abc", "efg", "efg", "hij"),
                 stringsAsFactors = FALSE)
# Splitting: one vector of tasks per user story
df.split <- split(df$Task, df$UserStory)
# Combining: the longest group sets the number of task columns
maxLength <- max(lengths(df.split))
# initialize
new <- list()
for (i in seq_along(df.split)) {
  z <- df.split[[i]]
  length(z) <- maxLength # extending the length pads with NA
  new[[i]] <- c(names(df.split)[i], z)
}
final <- as.data.frame(do.call(rbind, new))
final
#   V1   V2   V3   V4
#1 abc  123 4321 8763
#2 efg 9087 0652 <NA>
#3 hij 7609 <NA> <NA>
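
If you want this wrapped in a function that takes the UserStories vector, as the question asks, a minimal base R sketch might look like the following. The function name tasks_by_story and the column names Task1, Task2, ... are my own hypothetical choices, not something from the question. It also assumes Task is stored as character so that leading zeros survive.

```r
# Hypothetical helper (name and column labels are illustrative assumptions):
# returns one row per user story, padded with NA to a common width.
tasks_by_story <- function(df, stories) {
  # One character vector of tasks per requested story
  groups <- lapply(stories, function(s) as.character(df$Task[df$UserStory == s]))
  width <- max(lengths(groups))
  # Indexing past the end of a vector yields NA, which pads the short groups
  padded <- do.call(rbind, lapply(groups, function(g) g[seq_len(width)]))
  colnames(padded) <- paste0("Task", seq_len(width))
  data.frame(UserStory = stories, padded, stringsAsFactors = FALSE)
}

df <- data.frame(Task = c("123", "4321", "8763", "9087", "0652", "7609"),
                 UserStory = c("abc", "abc", "abc", "efg", "efg", "hij"),
                 stringsAsFactors = FALSE)
UserStories <- c("abc", "efg", "hij")
result <- tasks_by_story(df, UserStories)
```

Here result has one row per entry in UserStories, in the order you passed them, so you can also pass a subset of stories if you only need some of them.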

2 Comments

Thank you! I, too, prefer to use base R. It gives me the illusion that I understand what's going on, as compared to using, say, dplyr. I'll get started with your solution.
Let me know if you have any questions.
