R reads large integer values from a file incorrectly. The following example reproduces the problem:

A sample screenshot that highlights the problem areas is here: (image)

(1) Copy and paste the following 10 numbers into a test file (sample.csv):

1000522010609612
1000522010609613
1000522010609614
1000522010609615
1000522010609616
1000522010609617
971000522010609612
1501000522010819466
971000522010943717
1501000522010733490

(2) Read these contents into R using read.csv

X <- read.csv("./sample.csv", header=FALSE)

(3) Print the output

print(head(X, n=10), digits=22)

The output I got was

                           V1
1     1000522010609612.000000
2     1000522010609613.000000
3     1000522010609614.000000
4     1000522010609615.000000
5     1000522010609616.000000
6     1000522010609617.000000
7   971000522010609664.000000
8  1501000522010819584.000000
9   971000522010943744.000000
10 1501000522010733568.000000

The problem is that rows 7, 8, 9, and 10 are incorrect (compare them with the 10 sample numbers above).

What could be the problem? Is there a setting that I am missing in my R terminal?

  • It works all right for me... Commented Dec 6, 2014 at 11:06
  • I checked the above with R Studio and R console. It is not working for me :-( Commented Dec 6, 2014 at 11:08
  • You need to learn about integer limits and floating-point representations in computers, as was gently suggested by akrun in his answer. Commented Dec 6, 2014 at 13:46
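
To make the point about floating-point representation concrete, here is a minimal base R sketch. It assumes read.csv parsed the column as double (the default for numbers too large for R's 32-bit integer type); the variable names limit, x and shown are only for illustration.

```r
# Doubles carry a 53-bit significand, so every integer is exact only up to 2^53.
limit <- 2^53
print(limit, digits = 22)

# Above that limit, neighbouring integers collapse onto the same double.
(limit + 1) == limit   # TRUE

# Row 7 of the sample is larger than 2^53, so parsing it as a double
# rounds it to the nearest representable value.
x <- as.numeric("971000522010609612")
x > limit

# The digits printed from the double no longer match the digits in the file.
shown <- sprintf("%.0f", x)
shown == "971000522010609612"
```

Rows 1 to 6 of the sample are below 2^53, which is why they print correctly, while rows 7 to 10 are above it.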

1 Answer

You could try reading the column as 64-bit integers with the bit64 package:

library(bit64)
x <- read.csv('sample.csv', header=FALSE, colClasses='integer64')
x
#                   V1
#1     1000522010609612
#2     1000522010609613
#3     1000522010609614
#4     1000522010609615
#5     1000522010609616
#6     1000522010609617
#7   971000522010609612
#8  1501000522010819466
#9   971000522010943717
#10 1501000522010733490

If the bit64 package is loaded, you can also try fread from data.table:

library(data.table)
x1 <- fread('sample.csv')
x1
#                   V1
#1:    1000522010609612
#2:    1000522010609613
#3:    1000522010609614
#4:    1000522010609615
#5:    1000522010609616
#6:    1000522010609617
#7:  971000522010609612
#8: 1501000522010819466
#9:  971000522010943717
#10: 1501000522010733490
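
If the values are identifiers rather than quantities you need to do arithmetic on, another option (not part of the approach above, just a sketch) is to keep them as text, so that no numeric conversion happens at all. The temporary file and the name ids are made up for this illustration; colClasses = "character" is a standard read.csv argument.

```r
# Write a few of the sample values to a temporary file (illustration only).
tmp <- tempfile(fileext = ".csv")
writeLines(c("1000522010609612",
             "971000522010609612",
             "1501000522010819466"), tmp)

# Read the column as character so every digit is preserved as written.
ids <- read.csv(tmp, header = FALSE, colClasses = "character")
ids$V1

# Clean up the temporary file.
unlink(tmp)
```

The trade-off is that the column is then a string, so sorting is lexical and arithmetic needs an explicit conversion (for example to integer64).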
4 Comments

I am retrieving the big integers from a database into R (using RJDBC's dbGetQuery method). It seems to automatically convert the data into a data frame with corrupted values for the bigints. Any suggestions on how to solve this problem when using dbGetQuery from the RJDBC package?
@acc Sorry, I don't have experience with RJDBC package. Could you ask it as a new post?
@acc Thanks, somebody with RJDBC experience will respond.
