Let's assume that I have data in a df called data_full. From data_full I get
data_filtered <- data_full %>% filter(ua %in% c('a', 'b', 'c'))
Where,
data_filtered <- data.frame(ua = c(rep('a', 3), rep('b', 4), rep('c', 3)),
sp = c(rep('sp1', 3), rep('sp2', 3), rep('sp3', 2), rep('sp4',2)))
Now, I want to select the unique terms that occur in data_filtered$sp without breaking the pipe in the first code (data_filtered <- data_full %>%). Without a pipe I can simply use unique(data_filtered$sp), but how can I keep it in {dplyr} language? distinct works in my example above, but not in my real dataset, where it keeps the values unique within each ua rather than overall. I tried to write some replication code with the "error" but I couldn't, so I'll print a section of the data (sorry)
This is after I pipe all the way from data_full into data_filtered. In my example it would be:
data_filtered <- data_full %>%
filter(ua %in% c('a', 'b', 'c')) %>% distinct(sp)
Is this because of "Select only unique/distinct rows from a data frame." in the function description? If so, how can I get the result I want? For example, only one "Alsophila setosa" in my print. I want the final result to be a vector of species names.
EDIT:
As requested:
structure(list(`Unidade Amostral` = c("1000", "1000", "1000",
"1000", "1000", "1000", "1000", "1001", "1001", "1001", "1001",
"1001", "1001", "1001", "1001", "1003", "1003", "1003", "1003",
"1003"), Espécie = c("Aspidosperma australe", "Cupania vernalis",
"Matayba elaeagnoides", "Nectandra megapotamica", "Ocotea puberula",
"Ocotea pulchella", "Parapiptadenia rigida", "Allophylus edulis",
"Araucaria angustifolia", "Hovenia dulcis", "Machaerium paraguariense",
"Matayba elaeagnoides", "Muellera campestris", "Nectandra megapotamica",
"Parapiptadenia rigida", "Clethra scabra", "Ilex brevicuspis",
"Ilex paraguariensis", "Matayba elaeagnoides", "Myrsine coriacea"
), n = c(4, 7, 14, 6, 9, 4, 5, 4, 8, 3, 4, 16, 10, 6, 4, 4, 13,
3, 42, 12)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -20L), groups = structure(list(`Unidade Amostral` = c("1000",
"1001", "1003"), .rows = structure(list(1:7, 8:15, 16:20), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))

distinct only returns the first unique value
unique(data_filtered$sp) gives the same output as %>% distinct(sp) (except that distinct returns a data.frame with a single column, whereas unique on the vector returns the vector)
%>% mutate(sp = trimws(sp)) %>% distinct(sp)
dat %>% ungroup %>% distinct(Espécie)
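
The dput output in the question shows a grouped_df grouped by `Unidade Amostral`, which would explain why species repeat: on a grouped data frame, distinct keeps the grouping column, so uniqueness is evaluated per group. A minimal sketch of the ungroup suggestion, followed by pull to get a plain vector, is below. The tibble `dat` and its short column names `ua` and `sp` are made-up stand-ins for the real data, not the original dataset.

```r
library(dplyr)

# Hypothetical stand-in for the grouped data shown in the question
dat <- tibble(
  ua = c("1000", "1000", "1001", "1001", "1003"),
  sp = c("Cupania vernalis", "Matayba elaeagnoides",
         "Matayba elaeagnoides", "Hovenia dulcis", "Matayba elaeagnoides")
) %>%
  group_by(ua)

# On a grouped data frame, distinct() keeps the grouping column,
# so the same species still appears once per group
per_group <- dat %>% distinct(sp)

# Dropping the groups first leaves one row per species;
# pull() then extracts that column as a character vector
species <- dat %>%
  ungroup() %>%
  distinct(sp) %>%
  pull(sp)

species
```

Here `per_group` still has one row for each ua/sp combination, while `species` is a character vector with each species name once, in order of first appearance. If names still repeat after ungrouping, stray whitespace may be the cause, which is what the trimws suggestion above addresses.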