Using pivot longer to collapse data from multiple columns

Question

I've read through a number of similar posts and tutorials but am really struggling to understand the solution to my issue. I have a dataset that is wide, and when I make it longer - I want to collapse two sets of data (both duration and results).

For each participant (id), there is a category, and then a series of blood test results. Each test has both duration (in days) and a result (numeric value).

Here's how it looks now:

id	category	duration_1	results_1	duration_2	results_2	duration_3	results_3
01	diabetic	58	32	65	56	76	87
02	prediabetic	54	32	65	25	76	35
03	unknown	46	65	65	56	21	67

How I'd like it to be is:

id	category	duration	results
01	diabetic	58	32
01	diabetic	65	56
01	diabetic	76	87
02	prediabetic	54	32
02	prediabetic	65	25
02	prediabetic	76	35
03	unknown	46	65
03	unknown	65	25
03	unknown	21	67

I can get pivot longer to work for "results" - but I can't get it to pivot on both "results" and "duration".

Any assistance would be greatly appreciated. I'm still fairly new to R. Thanks!

Not sure if there is a master duplicate, but there's a few out there, including this one: stackoverflow.com/questions/61940984/… — thelatemail
– thelatemail, Commented Dec 17, 2021 at 1:34
Does this answer your question? Reshaping multiple sets of measurement columns (wide format) into single columns (long format) — camille
– camille, Commented Dec 17, 2021 at 1:56
Also stackoverflow.com/q/25925556/5325862, stackoverflow.com/q/28729506/5325862, and stackoverflow.com/q/47272371/5325862 — camille
– camille, Commented Dec 17, 2021 at 1:59

AndrewGB · Accepted Answer · 2021-12-17 00:56:58Z

3

One way is to separate the column names into two columns while you pivot (hence the names_sep below). Then, you can just drop the column number.

library(tidyverse)

df %>%
  tidyr::pivot_longer(!c(id, category),
                      names_to = c(".value", "num"),
                      names_sep = "_") %>%
  dplyr::select(-num)

Output

# A tibble: 9 × 4
  id    category    duration results
  <chr> <chr>          <dbl>   <dbl>
1 01    diabetic          32      23
2 01    diabetic          87      67
3 01    diabetic          98      78
4 02    prediabetic       43      45
5 02    prediabetic       34      65
6 02    prediabetic       12      12
7 03    unknown           32      54
8 03    unknown           75      45
9 03    unknown           43      34

Data

df <-
  structure(
    list(
      id = c("01", "02", "03"),
      category = c("diabetic", "prediabetic", "unknown"),
      duration_1 = c(32, 43, 32),
      results_1 = c(23, 45, 54),
      duration_2 = c(87, 34, 75),
      results_2 = c(67, 65, 45),
      duration_3 = c(98, 12, 43),
      results_3 = c(78, 12, 34)
    ),
    class = "data.frame",
    row.names = c(NA,-3L)
  )

edited Dec 17, 2021 at 0:56

answered Dec 17, 2021 at 0:50

AndrewGB

17k6 gold badges22 silver badges56 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

Onyambu Over a year ago

why use dplyr:: etc? Youve already loaded tidyverse

Onyambu Over a year ago

also note pivot_longer(df, -c(id, category), names_to = ".value",names_pattern = '([^_]+)')

AndrewGB Over a year ago

@Onyambu As far as dplyr::, I just prefer (in general) to always be explicit. Great, thanks for that suggestion on the names pattern. That's a really clever way to handle it!

Onyambu Over a year ago

Its okay, though you do not need it once you have loaded the libraries. Thats the whole purpose of loading the library. eg you are not using magritrr::`%>%`

Tori Russell Over a year ago

Thanks that worked well. Much appreciated!

Collectives™ on Stack Overflow

Using pivot longer to collapse data from multiple columns

1 Answer 1

5 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

5 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related