Replacing certain numeric values in a column using a condition and a loop

Question

I'm trying to figure out how to correct a few entry errors in a dataset I'm working with. I already fixed the problem, but I think the way I did it was inefficient: I replaced the values individually using a condition instead of iterating through the column and replacing the values using a condition.

In my dataset, three observations in the corruption_score column were off by a factor of 10. I wanted to loop through this column and replace any value greater than 10 with itself divided by 10. An example printout of my dataset is below.

# A tibble: 6 x 9
  country  year value deaths_per_100k region corruption_score  rank electricity_acc…
  <chr>   <dbl> <dbl>           <dbl> <chr>             <dbl> <dbl>            <dbl>
1 Iceland  2005 0.159            13.1 WE/EU               97     1              100
2 Finland  2005 0.232            13.7 WE/EU               96     2              100
3 New Ze…  2005 0.228            13.8 AP                  96     2              100
4 Finland  2006 0.271            13.1 WE/EU               9.6     1              100
5 Iceland  2006 0.156            12.8 WE/EU               9.6     1              100
6 New Ze…  2006 0.217            13.5 AP                  9.6     1              100

To solve this, I tried a few different versions of this loop, including one in which the replacement operation is obs <- obs / 10, but I couldn't get anything to save outside of the loop. Any advice? Thanks in advance.

for (obs in wdi_gdp_long$corruption_score){
  
  if(obs > 10 & !is.na(obs)){
       
    wdi_gdp_long$corruption_score[obs] <- obs / 10
    
  }
  
}

Do idx <- x > 10 & !is.na(x); x[idx] <- x[idx] / 10, where x is wdi_gdp_long$corruption_score.
– markus, Commented Jun 28, 2020 at 17:34
(@markus I've always found the boolean indexing coercion to feel weird ... I use it, but I don't always feel happy with it ...)
– r2evans, Commented Jun 28, 2020 at 18:32
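
The logical-indexing approach from markus's comment can be sketched as follows. This is only an illustration: the toy data frame and its values are made up and stand in for wdi_gdp_long, and the last assignment (writing x back into the data frame) is an added step that the comment leaves implicit.

```r
# Hypothetical toy data standing in for wdi_gdp_long (values are made up)
toy <- data.frame(
  country = c("A", "B", "C", "D"),
  corruption_score = c(97, 9.6, NA, 96)
)

x <- toy$corruption_score
idx <- x > 10 & !is.na(x)   # TRUE only where a value is present and exceeds 10
x[idx] <- x[idx] / 10       # rescale just those entries
toy$corruption_score <- x   # write the corrected vector back to the data frame

print(toy$corruption_score)
```

Because idx is a logical vector of the same length as x, the replacement touches only the flagged positions and needs no explicit loop.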

Anshuman Kirty · Accepted Answer · 2020-06-28 19:07:51Z

1

You can correct corruption_score without a loop by using mutate() and if_else() from dplyr (loaded as part of the tidyverse), as in the snippet below:

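library(tidyverse)
library(magrittr)  # provides the %<>% assignment pipe

# Divide scores above 10 by 10; all other values (including NA) stay as they are
wdi_gdp_long %<>%
    mutate(corruption_score = if_else(corruption_score > 10,
                                      corruption_score / 10,
                                      corruption_score))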
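
If you do want a loop, note that for (obs in wdi_gdp_long$corruption_score) iterates over the values themselves, so corruption_score[obs] uses a score such as 97 as a position rather than the row that held it. A minimal sketch of a loop over positions instead, assuming a made-up toy data frame in place of wdi_gdp_long:

```r
# Hypothetical toy data standing in for wdi_gdp_long (values are made up)
toy <- data.frame(corruption_score = c(97, 9.6, NA, 96))

# Loop over positions (1, 2, 3, ...) rather than over the values
for (i in seq_along(toy$corruption_score)) {
  obs <- toy$corruption_score[i]
  # Check for NA first; && short-circuits, so NA is never compared to 10
  if (!is.na(obs) && obs > 10) {
    toy$corruption_score[i] <- obs / 10   # assign back by position
  }
}

print(toy$corruption_score)
```

The vectorized versions are usually preferable in R, but this shows what the original loop would need in order to change the right elements.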

