4

I have been asked to write R output to two binary files: an index file and a main data file. There will be one matrix (block) in the data file for each id in the index file. I have read about writing binary files in R, but I am not sure how to specify the layout so that the output ends up in this format.

Also, can we specify a short integer in R? The person who asked for this said he wants the numbers to be short integers (two bytes), and I don't know what that means.

I appreciate any input! Thanks

3
  • 1
    A quick search using [r] binary file on StackOverflow reveals the following very similar question: stackoverflow.com/q/1635278/602276 Commented Aug 8, 2011 at 23:19
  • 1
    As @mdsummer writes, you can specify how to write integers of size 2, but your problem statement is quite vague. Is the matrix data integers or are the ids integers? Or perhaps the ids are strings? Commented Aug 9, 2011 at 0:52
  • Welcome to StackOverflow! If one of the answers here is what you need, you should mark it as the answer. Otherwise, update your question to clarify what you need. You should also upvote answers (and questions) you like. Just click on the score in the upper left! Commented Aug 9, 2011 at 2:43

2 Answers

4

Since you didn't specify the problem very clearly, I made some assumptions in the sample code below. Given a list of matrices, it saves them to a .bin file and creates an .idx file holding the byte offset at which each matrix starts. You can then load any matrix back by its index. The 2-byte size you mentioned isn't used: the matrix data is saved as 8-byte doubles or 4-byte integers, depending on the matrix type (but you could change that by passing size = 2 to writeBin and readBin).

Here's how it's used:

mtx <- list(matrix(1:12,4), matrix(sin(1:12),4))
saveMatrixList("c:/foo", mtx)

loadMatrix("c:/foo", 1)
loadMatrix("c:/foo", 2)

...and here are the functions:

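saveMatrixList <- function(baseName, mtxList) {
    idxName <- paste(baseName, ".idx", sep="")
    idxCon <- file(idxName, 'wb')
    on.exit(close(idxCon))

    dataName <- paste(baseName, ".bin", sep="")
    con <- file(dataName, 'wb')
    # add=TRUE keeps the handler registered above; without it,
    # this call would replace it and idxCon would never be closed
    on.exit(close(con), add=TRUE)

    # The first matrix starts at offset 0 in the data file
    writeBin(0L, idxCon)

    for (m in mtxList) {
        writeBin(dim(m), con)      # two 4-byte integers: rows, columns
        writeBin(typeof(m), con)   # e.g. "integer" or "double", nul-terminated
        writeBin(c(m), con)        # values in column-major order
        flush(con)

        # The current position is where the next matrix will start
        offset <- as.integer(seek(con))
        cat('offset', offset, '\n')
        writeBin(offset, idxCon)
    }

    flush(idxCon)
}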

loadMatrix <- function(baseName = "data", index) {
    idxName <- paste(baseName, ".idx", sep="")
    idxCon <- file(idxName, 'rb')
    on.exit(close(idxCon))

    dataName <- paste(baseName, ".bin", sep="")
    con <- file(dataName, 'rb')
    # add=TRUE so that both connections are closed on exit
    on.exit(close(con), add=TRUE)

    # Each index entry is a 4-byte integer offset into the data file
    seek(idxCon, (index-1)*4)
    offset <- readBin(idxCon, 'integer')

    seek(con, offset)
    d <- readBin(con, 'integer', 2)         # rows, columns
    type <- readBin(con, 'character', 1)    # storage type written by saveMatrixList
    structure(readBin(con, type, prod(d)), dim=d)
}
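
If the requirement really is two-byte values, the same idea can be adapted. The following is only a sketch under assumptions the question does not confirm: every matrix is an integer matrix whose values fit in a signed 16-bit range, the dimensions are still stored as 4-byte integers, and the index file holds the 4-byte start offset of each matrix. The names saveShortMatrices and loadShortMatrix are made up for this illustration.

```r
saveShortMatrices <- function(baseName, mtxList) {
    idxCon <- file(paste0(baseName, ".idx"), "wb")
    on.exit(close(idxCon))
    con <- file(paste0(baseName, ".bin"), "wb")
    on.exit(close(con), add = TRUE)

    offset <- 0L
    for (m in mtxList) {
        # Assumption: integer data that fits in two bytes, no missing values
        stopifnot(is.integer(m), !anyNA(m), all(abs(m) <= 32767L))
        writeBin(offset, idxCon)           # 4-byte start offset of this matrix
        writeBin(dim(m), con)              # rows and columns, 4 bytes each
        writeBin(c(m), con, size = 2L)     # values, 2 bytes each
        offset <- offset + 8L + 2L * length(m)
    }
    invisible(NULL)
}

loadShortMatrix <- function(baseName, index) {
    idxCon <- file(paste0(baseName, ".idx"), "rb")
    on.exit(close(idxCon))
    con <- file(paste0(baseName, ".bin"), "rb")
    on.exit(close(con), add = TRUE)

    seek(idxCon, (index - 1) * 4)
    offset <- readBin(idxCon, "integer")
    seek(con, offset)
    d <- readBin(con, "integer", 2)
    structure(readBin(con, "integer", prod(d), size = 2L), dim = d)
}

# Round trip through a temporary base name
base <- tempfile()
mats <- list(matrix(1:12, 4), matrix(-3:2, 2))
saveShortMatrices(base, mats)
m2 <- loadShortMatrix(base, 2)
```

Here the offsets are computed arithmetically rather than with seek() on the output connection, which keeps the writer independent of how the platform reports the write position. Both files use the machine's native byte order; pass endian = "little" or endian = "big" to writeBin and readBin if the consumer expects a specific one.
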

2

See help(writeBin): the size argument sets the number of bytes written for each element, so size = 2 writes every value as a two-byte (short) integer. But if you don't know what this means, you will probably need a lot more information from your requester.
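
As a small illustration of the size argument, the sketch below writes a few integers as two-byte values and reads them back. The temporary file and the choice of little-endian byte order are assumptions made for the example; the requester would need to say which byte order is expected.

```r
# Write integers as 2-byte ("short") values and read them back.
path <- tempfile(fileext = ".bin")      # scratch file for the example
x <- c(-32768L, -1L, 0L, 1L, 32767L)    # limits of a signed 16-bit integer

con <- file(path, "wb")
writeBin(x, con, size = 2L, endian = "little")
close(con)

nbytes <- file.size(path)               # 2 bytes per element

con <- file(path, "rb")
y <- readBin(con, what = "integer", n = length(x),
             size = 2L, signed = TRUE, endian = "little")
close(con)

same <- identical(x, y)
```

Values outside the range -32768 to 32767 do not fit in two bytes, so it is worth checking the data before writing it this way.
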
