I am very new to R and trying to extract a specific element from a data frame and compare it with an integer.

I have a table saved in a text file.

[Image: contents of the data file]

I used the following to read the table.

mydata = read.table("file.txt");

In my case I want to compare an element, say the first element of USERPROR (which is 1.0), with the integer 1.0 (so the comparison should return true).

The code I wrote was

mydata[[2,7]]

[1] 1.000
Levels: 1.000 10.0000 2.000 3.000 4.00 5.00 6.000 7.000 8.000 9.000 USERPROR

However, when I compared them, I got 'FALSE'. Can anybody tell me why that is so?

> mydata[[2,7]]==1.0

[1] FALSE

1 Answer

Hmmmm. First, elements of a data.frame are ordinarily accessed using single brackets -- like mydata[2,7]. Double brackets will access a column, e.g. mydata[[2]] will return the second column. Thus, mydata[[7]][2] is the same as mydata[2,7].
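
As a small sketch of these indexing forms, here is a made-up data frame (the name df and its columns are illustrative and not taken from the question):

```r
# Hypothetical data frame, only to illustrate the indexing forms
df <- data.frame(id = 1:3, score = c(1.5, 2.5, 3.5))

cell_a <- df[2, 2]      # row 2, column 2 via single brackets
col_2  <- df[[2]]       # the whole second column as a vector
cell_b <- df[[2]][2]    # second element of that column

identical(cell_a, cell_b)  # TRUE: both forms pick the same cell
```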

Second, since your output includes a Levels: list, it appears that that variable is stored as a factor having levels of "1.000", "10.0000", ... "USERPROR" (odd enough that I'm guessing the data are entered incorrectly). Accordingly, I believe that in your example, mydata[2,7] == "1.000" would return TRUE.
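
A minimal sketch of that behaviour, assuming a two-column file whose header line was read as data. The inline text is a made-up stand-in for the real file, and stringsAsFactors = TRUE is set explicitly because R 4.0 and later no longer convert strings to factors by default:

```r
# Made-up stand-in for the file; the first line is really a header
txt <- "ID USERPROR
1 1.000
2 2.000"

# Without header = TRUE the column names are read as data,
# so every column holds strings and becomes a factor
bad <- read.table(text = txt, stringsAsFactors = TRUE)

is.factor(bad[[2]])    # TRUE
bad[2, 2] == 1.0       # FALSE: the label "1.000" is compared with "1"
bad[2, 2] == "1.000"   # TRUE: matches the level's label exactly
```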

In general, if you want to compare a numeric value with an integer, keep in mind that R stores both 1 and 1.0 as floating-point (double); only 1L is an integer literal. If the data are stored as floating-point, there may be enough roundoff that a number computed as 1.0 is not exactly equal to 1. One reliable way to test it is to use round(mydata[2,7]) == 1.
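
To illustrate the roundoff point with a sketch (the loop is a contrived example, not taken from the question): adding 0.1 ten times need not give exactly 1, while a rounded or tolerance-based comparison still succeeds. all.equal() is a base R alternative that compares within a small tolerance.

```r
# Accumulate 0.1 ten times; 0.1 has no exact binary representation
x <- 0
for (i in 1:10) x <- x + 0.1

x == 1                    # typically FALSE because of accumulated roundoff
round(x) == 1             # TRUE
isTRUE(all.equal(x, 1))   # TRUE: equal within a small tolerance
```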

5 Comments

I tried round(mydata[2,7]) == 1. It gives me the following error: Error in Math.factor(1L) : round not meaningful for factors. Please see my edited question.
Right. That's because what you have there is a factor, like I stated earlier - not a floating-point value. I repeat, I think your first issue is that the data are not stored correctly. Get the data in there right first -- otherwise, it's garbage in, garbage out.
Here's the problem. Use read.table to read the data, but include the argument header=TRUE. It appears the variable names are being included as part of the data.
Yes, "header=TRUE" fixed the problem. So the actual issue was the way the data was read. Thank you so much! I literally spent hours on this.
Good. I don't know why they defaulted to header=FALSE in read.table, but they did.
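
For reference, the fix from the comments can be sketched with inline text in place of the real file (the two-column layout is an assumption, since the original data is only shown as an image):

```r
# Made-up stand-in for the file contents
txt <- "ID USERPROR
1 1.000
2 2.000"

# header = TRUE takes the first line as column names, not data
good <- read.table(text = txt, header = TRUE)

is.numeric(good$USERPROR)  # TRUE: the column is now numeric
good[1, 2] == 1            # TRUE
```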
