
In a data table, all the cells are numeric, and what I want to do is replace each number with a string, as follows:

Numbers in [0,2]: replace them with the string "Bad"

Numbers in [3,4]: replace them with the string "Good"

Numbers > 4 : replace them with the string "Excellent"

Here's an example of my original table, called "data.active" (image not included).

My attempt to do that is this:

x <- c("churches","resorts","beaches","parks","Theatres",.....)
for(i in x){
  data.active$i <- as.character(data.active$i)
  data.active$i[data.active$i <= 2] <- "Bad"
  data.active$i[data.active$i >2 && data.active$i <=4] <- "Good"
  data.active$i[data.active$i >4] <- "Excellent"
}

But it doesn't work. Is there any other way to do this?

EDIT

Here's the link to my dataset, GoogleReviews_Dataset, and here's how I got the table in the image above:

library(FactoMineR)
library(factoextra)
data<-read.csv2(file.choose())
data.active <- data[1:10, 4:8]
The function cut is for breaking continuous numeric vectors into discrete factors. You'd be better off posting a reproducible example with clearer detail than "it doesn't work". Commented Feb 6, 2019 at 17:29

2 Answers

You can use the tidyverse's mutate-across combination to condition on the ranges:

library(tidyverse)

df <- tibble(
  x = 1:5, 
  y = c(1L, 2L, 2L, 2L, 3L), 
  z = c(1L,3L, 3L, 3L, 2L),
  a = c(1L, 5L, 6L, 4L, 8L),
  b = c(1L, 3L, 4L, 7L, 1L)
)

df %>% mutate(
  across(
    .cols = everything(),
    .fns = ~ case_when(
      .x <= 2             ~ 'Bad',
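      .x > 2 & .x <= 4    ~ 'Good',
      (.x > 4)            ~ 'Excellent',
      TRUE                ~ as.character(.x)
    )
  )
)

The .x above stands for the column currently being transformed (purrr-style lambda syntax); case_when is vectorised, so each element of the column is matched against the conditions in order. The final TRUE branch only catches values that match none of the ranges, such as NA. This results in

# A tibble: 5 x 5
  x         y     z     a         b        
  <chr>     <chr> <chr> <chr>     <chr>    
1 Bad       Bad   Bad   Bad       Bad      
2 Bad       Bad   Good  Excellent Good     
3 Good      Bad   Good  Excellent Good     
4 Good      Bad   Good  Good      Excellent
5 Excellent Good  Bad   Excellent Bad      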

For changing only select columns, use a selection in your .cols parameter for across:

df %>% mutate(
  across(
    .cols = c('a', 'x', 'b'),
    .fns = ~ case_when(
      .x <= 2             ~ 'Bad',
      .x > 2 & .x <= 4    ~ 'Good',
      (.x > 4)            ~ 'Excellent',
      TRUE                ~ as.character(.x)
    )
  )
)

This yields

# A tibble: 5 x 5
  x             y     z a         b        
  <chr>     <int> <int> <chr>     <chr>    
1 Bad           1     1 Bad       Bad      
2 Bad           2     3 Excellent Good     
3 Good          2     3 Excellent Good     
4 Good          2     3 Good      Excellent
5 Excellent     3     2 Excellent Bad      

11 Comments

This works fine with the data you've given in the code, but it doesn't work with my dataset. I've checked the typeof my table and it is "list", just like your "df", but it still doesn't work.
@hamzasaber: Okay. Create a data set we can work with and we can manage fixing the code...
These are some of the warnings it gives me: Warning messages: 1: In Ops.factor(beaches, 2.7) : ‘<=’ not meaningful for factors 2: In Ops.factor(beaches, 2.7) : ‘>’ not meaningful for factors 3: In Ops.factor(beaches, 4.1) : ‘<=’ not meaningful for factors
@hamzasaber: Ahhh, I see. Your columns are factors, not numbers. Perhaps use as.numeric(.x) wherever I have .x.
as.numeric(.x) does remove the warnings, BUT my whole table is now filled with the string "Excellent". Apparently as.numeric(.x) changes everything to numbers greater than 100, which is why every value gets replaced with "Excellent": after as.numeric(.x), every value is greater than 4.
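The behaviour described in the last comment is consistent with the columns being factors: calling as.numeric() directly on a factor returns its internal level codes, not the numbers that are printed. A minimal sketch of the difference, assuming a made-up factor named beaches (the values are illustrative and not taken from the dataset):

```r
# Hypothetical factor column; the values are invented for illustration
beaches <- factor(c("2.7", "4.1", "0.5", "4.1"))

# as.numeric() on a factor returns the integer level codes, not the values
codes <- as.numeric(beaches)

# Going through as.character() first recovers the printed numbers
values <- as.numeric(as.character(beaches))

print(codes)   # level codes
print(values)  # the actual numbers
```

If this is what is happening, wrapping the column as as.numeric(as.character(.x)) inside the case_when call may be enough. It may also be worth checking how the file is read, since read.csv2() expects a comma as the decimal mark.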
# columns to recode; cut() bins each value into (-Inf,2], (2,4], (4,Inf]
x<-c('x','y','z')
df[,x] <- lapply(df[,x], function(col) 
                         cut(col, breaks=c(-Inf,2,4,Inf), labels=c('Bad','Good','Excellent')))

Data

df<-structure(list(x = 1:5, y = c(1L, 2L, 2L, 2L, 3L), z = c(1L,3L, 3L, 3L, 2L), 
a = c(1L, 5L, 6L, 4L, 8L),b = c(1L, 3L, 4L, 7L, 1L)), 
class = "data.frame", row.names = c(NA, -5L))

1 Comment

How do I access columns whose names are full strings rather than single characters? For example, x<-c("xxx","yyy","zzz"): those are the names of my columns!
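Column names of any length can be kept in a character vector and used for indexing with [ or [[. The $ operator does not evaluate a variable, so data.active$i looks for a column literally named i, which is one likely reason the loop in the question fails. A minimal sketch, assuming a small made-up data frame that borrows three column names from the question (the values are hypothetical):

```r
# Hypothetical data; column names borrowed from the question, values invented
data.active <- data.frame(
  churches = c(0.5, 2.7, 4.6),
  resorts  = c(3.1, 1.9, 5.0),
  beaches  = c(4.0, 2.0, 2.5)
)

cols <- c("churches", "resorts")  # column names stored as strings

# [[ ]] accepts a string held in a variable, unlike $
for (col in cols) {
  data.active[[col]] <- as.character(
    cut(data.active[[col]],
        breaks = c(-Inf, 2, 4, Inf),
        labels = c("Bad", "Good", "Excellent"))
  )
}

print(data.active)
```

Only the columns listed in cols are recoded; beaches stays numeric. The as.character() wrapper is optional and is used here only because the question asks for strings rather than factors.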
