R reading values of numeric field in file wrongly

Question

R is reading large numeric values from a file incorrectly. The following example reproduces the problem:

[Image: snapshot highlighting the problem areas]

(1) Copy and paste the following 10 numbers into a test file (sample.csv)

1000522010609612
1000522010609613
1000522010609614
1000522010609615
1000522010609616
1000522010609617
971000522010609612
1501000522010819466
971000522010943717
1501000522010733490

(2) Read these contents into R using read.csv

X <- read.csv("./sample.csv", header=FALSE)

(3) Print the output

print(head(X, n=10), digits=22)

The output I got was

                           V1
1     1000522010609612.000000
2     1000522010609613.000000
3     1000522010609614.000000
4     1000522010609615.000000
5     1000522010609616.000000
6     1000522010609617.000000
7   971000522010609664.000000
8  1501000522010819584.000000
9   971000522010943744.000000
10 1501000522010733568.000000

The problem is that rows 7 to 10 are not correct: they do not match the corresponding input values listed in step (1).

What could be the problem? Is there some setting that I am missing in my R terminal?

I checked the above with RStudio and the R console. It is not working for me :-(
– acc, Commented Dec 6, 2014 at 11:08
You need to learn about integer limits and floating-point representations in computers, as was gently suggested by akrun in his answer.
– Carl Witthoft, Commented Dec 6, 2014 at 13:46
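
The limits mentioned in the comment above can be checked directly in R. The sketch below is an illustration and not part of the original thread. It assumes the column was read as double-precision numbers, which represent every integer exactly only up to 2^53 (about 9.0e15). The first six sample values lie below that bound and the last four lie above it, which would explain why only rows 7 to 10 change.

```r
# Base R only; no packages assumed.

# Largest value of R's native 32-bit integer type
.Machine$integer.max

# Number of significand bits in a double
.Machine$double.digits

# At 2^53, adding 1 no longer produces a distinct value
limit <- 2^53
(limit + 1) == limit   # TRUE

# Row 1 and row 7 of the sample, converted the way a numeric column would be
below <- as.numeric("1000522010609612")
above <- as.numeric("971000522010609612")
below < limit          # TRUE: still exactly representable
above > limit          # TRUE: must be rounded to a nearby double

# Show the stored value of row 7 with the same precision used in the question
print(above, digits = 22)
```

Beyond 2^53 the gap between neighbouring doubles grows, so the stored value is the nearest representable number rather than the one in the file.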

akrun · Accepted Answer · 2014-12-06 11:23:37Z


You could try

library(bit64)
x <- read.csv('sample.csv', header=FALSE, colClasses='integer64')
x
#                   V1
#1     1000522010609612
#2     1000522010609613
#3     1000522010609614
#4     1000522010609615
#5     1000522010609616
#6     1000522010609617
#7   971000522010609612
#8  1501000522010819466
#9   971000522010943717
#10 1501000522010733490
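
A quick way to see why integer64 helps, assuming the bit64 package is installed: it stores each value as a 64-bit signed integer (maximum 9223372036854775807), so all ten sample values fit without rounding. The sketch below is an illustration only; the variable name big is arbitrary.

```r
library(bit64)

# Parse from character so the value never passes through a double
big <- as.integer64("971000522010609612")

# The digits round-trip unchanged
as.character(big)        # "971000522010609612"

# Integer arithmetic stays exact at this magnitude
as.character(big + 1L)   # "971000522010609613"

# Converting via as.numeric() first rounds the value before bit64 ever sees it
as.character(as.integer64(as.numeric("971000522010609612")))
```

The last line shows that the order of conversion matters: once a value has been stored as a double, converting it to integer64 afterwards cannot restore the lost digits.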

If you load bit64, you can also try fread from data.table

library(data.table)
x1 <- fread('sample.csv')
x1
#                   V1
#1:    1000522010609612
#2:    1000522010609613
#3:    1000522010609614
#4:    1000522010609615
#5:    1000522010609616
#6:    1000522010609617
#7:  971000522010609612
#8: 1501000522010819466
#9:  971000522010943717
#10: 1501000522010733490
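
If the numbers are identifiers rather than quantities to compute with, another option (not mentioned in the answer) is to read the column as text, which needs no extra packages. A minimal sketch, using a temporary file in place of sample.csv:

```r
# Write a few of the sample values to a temporary file (stand-in for sample.csv)
vals <- c("1000522010609612", "971000522010609612", "1501000522010819466")
tmp <- tempfile(fileext = ".csv")
writeLines(vals, tmp)

# colClasses = "character" keeps every digit because nothing is parsed as a number
x_chr <- read.csv(tmp, header = FALSE, colClasses = "character")
str(x_chr)

# The values match the file contents exactly
identical(x_chr$V1, vals)   # TRUE

unlink(tmp)
```

fread offers a similar choice through its integer64 argument, which accepts "integer64", "double", or "character".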


4 Comments

acc Over a year ago

I am retrieving the big integers from a database into R (using RJDBC's dbGetQuery method). It seems to automatically convert the data into a data frame with corrupted numbers (for bigints). Any suggestions on how to solve this problem, particularly when using dbGetQuery from the RJDBC package?

akrun Over a year ago

@acc Sorry, I don't have experience with RJDBC package. Could you ask it as a new post?

acc Over a year ago

Done that here stackoverflow.com/questions/27332693/…

akrun Over a year ago

@acc Thanks, somebody with RJDBC experience will respond.
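
For the database case raised in these comments, one possible workaround, offered here as an untested assumption rather than confirmed RJDBC behavior, is to have the database return the bigint column as text and convert it in R with bit64. The query shown in the comment is hypothetical (the table and column names are invented, and the cast syntax varies by database); the data frame below merely stands in for what dbGetQuery might return.

```r
library(bit64)

# Hypothetical query; 'orders' and 'id' are made-up names:
#   res <- dbGetQuery(conn, "SELECT CAST(id AS VARCHAR(20)) AS id FROM orders")

# Stand-in for the query result, with the bigint already delivered as text
res <- data.frame(id = c("971000522010609612", "1501000522010819466"),
                  stringsAsFactors = FALSE)

# Convert the text column to 64-bit integers without passing through a double
res$id <- as.integer64(res$id)

as.character(res$id)
```

Whether the cast is needed, and how the driver maps bigint columns, depends on the database and driver in use.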
