
I have what should be an easy question:

value
1000
2500
5080
10009

I want to assign each value to an interval:

value    Range
1000     0-1000
2500     1001-5000
5080     5001-10000
10009    10001-20000

I tried this:

dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value < 5001, "1001-5000", ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))

However, I got Error: unexpected '<' in "dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value <"

Any help?

EDIT:

This question is not asking for the best way to convert a continuous variable to a factor. It is asking for debugging help with the reproducible example:

library(data.table)
dt <- data.table(value = c(1000, 2500, 5080, 10009))
dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value < 5001, "1001-5000", ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))
# produces the error above
  • See help("cut") for a better solution than nested ifelse. Commented Apr 3, 2018 at 7:48
  • And because you are using data.table: ifelse is slow. Commented Apr 3, 2018 at 7:50
  • Possible use of cut: stackoverflow.com/questions/13559076/… Commented Apr 3, 2018 at 7:51
  • Isn't your problem the line ifelse(1000 < value < 5001, ...), as noted in the answer below? R does not accept two-way inequalities. You need to break them into two comparisons. Commented Apr 3, 2018 at 8:50
  • To those voting to close this as a typo: it's not a typo. It's a non-obvious syntax error that programmers coming from other languages are likely to make. I'd say that makes it a useful question (and answer) for other users. Commented Apr 3, 2018 at 9:39

1 Answer

Like many errors, this one means what it says. Unlike Python, R cannot parse a chained comparison such as 1000 < value < 5001. Instead, write two comparisons and join them with &: 1000 < value & value < 5001

library(data.table)
dt <- data.table(value = c(1000, 2500, 5080, 10009))
dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value & value < 5001, "1001-5000", ifelse(5000 < value & value < 10001, "5001-10000", "10001-20000")))]
dt
   value       Range
1:  1000      0-1000
2:  2500   1001-5000
3:  5080  5001-10000
4: 10009 10001-20000
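
To see the difference in isolation, here is a minimal base-R sketch (the variable names chained and in_range are only illustrative). It wraps the chained form in parse() so the syntax error can be caught rather than stopping the script, then evaluates the two-comparison form on the same values:

```r
value <- c(1000, 2500, 5080, 10009)

# R's grammar has no chained comparison, so this text fails at parse time.
# tryCatch() keeps the error message instead of aborting.
chained <- tryCatch(
  parse(text = "1000 < value < 5001"),
  error = function(e) conditionMessage(e)
)
is.character(chained)  # TRUE: parse() raised an error and we kept its message

# Two comparisons joined with the vectorised `&` give one logical per element.
in_range <- 1000 < value & value < 5001
in_range  # FALSE TRUE FALSE FALSE
```

Note that the element-wise & is the operator to use inside ifelse(); && is meant for single, length-one conditions.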

As @akrun mentioned, you may be better off with a factor. Here's an example:

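# right = FALSE makes the intervals [0, 1001), [1001, 5001), ... so they match the ifelse() conditions
dt[, Range := cut(value, breaks = c(0, 1001, 5001, 10001, 20001), labels = c("0-1000", "1001-5000", "5001-10000", "10001-20000"), right = FALSE)]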

This produces a data.table that displays the same way, but extracting the Range column will give you a factor corresponding to the ranges.
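
If a plain character column is preferred over a factor, findInterval() (suggested in the comments) is one option. The following is a rough sketch in base R, not the only way to do it; range_labels, range_chr and range_fct are made-up names, and the same expressions should work inside dt[, Range := ...]. It assumes every value is non-negative; anything above 20000 would still get the last label.

```r
value <- c(1000, 2500, 5080, 10009)
range_labels <- c("0-1000", "1001-5000", "5001-10000", "10001-20000")

# findInterval() returns 0 for values below the first cut point, so adding 1
# turns its result into an index into range_labels. Its intervals are closed
# on the left by default: [1001, 5001), [5001, 10001), [10001, Inf).
range_chr <- range_labels[findInterval(value, c(1001, 5001, 10001)) + 1]
class(range_chr)  # "character"

# For comparison, cut() with left-closed intervals gives the same labels as a factor.
range_fct <- cut(value, breaks = c(0, 1001, 5001, 10001, 20001),
                 labels = range_labels, right = FALSE)
class(range_fct)  # "factor"
```

A factor keeps the ranges in their defined order for sorting and plotting, while the character version is simpler to compare against literal strings.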


3 Comments

Thanks, you're right. I have been learning Python recently and confused its syntax with R, which I already know. I appreciate it.
Instead of multiple ifelse calls, you can use cut or findInterval.
cut is great! Maybe you could add a Python way to do this. I think some programmers may run into this kind of issue when they switch between languages.
