I would like to set fixed column widths for all three columns in this data set, as follows: anim=1-10; sireid=11-20; damid=21-30. Some columns have missing values.

anim=c("1A038","1C467","2F179","38138","030081")
sireid=c("NA","NA","1W960","1W960","64404")
damid=c("NA","NA","1P119","1P119","63666")

mydf=data.frame(anim,sireid,damid)
  • I'll be honest, I don't really know what you mean by column width. Could you explain that in more detail? Commented Oct 21, 2011 at 3:30
  • @joran: For example, I would like to set the width (or maybe the length) of the first column ("anim") to 1-6. I am setting the length according to the maximum possible number of characters; for instance, the last anim ID, "030081", has 6 characters. I want to do the same for the other two columns. Thanks! Commented Oct 21, 2011 at 3:42
  • Sounds like Hong is right then; I've never used SAS so it never occurred to me that anyone would want to do this in R. Commented Oct 21, 2011 at 3:52
  • @joran: It's not just a function of R, though R can make use of this. As I mentioned below Hong's answer - a fixed width file is useful for the purposes of memory mapping: one will know exactly where to look for data as the layout lends itself to a very simple mapping function. As a result, one need not index every line nor parse every line, in order to get random access to data. Memory mapped files need not be binary, as with bigmemory. Being able to know where to look means that the data can be MASSIVE (though ASCII format is a naughty waste). Commented Oct 31, 2011 at 11:59

3 Answers

From reading your question as well as your comments to previous answers, it seems to me that you are trying to create a fixed width file with your data. If this is the case, you can use the function write.fwf in package gdata:

Load the package and create a temporary output file:

library(gdata)
ff <- tempfile()

Write your data in fixed width format to the temporary file:

write.fwf(mydf, file=ff, width=c(10,10,10), colnames=FALSE)

Read the file with scan and print the results (to demonstrate fixed width output):

zz <- scan(ff, what="character", sep="\n")
cat(zz, sep="\n")

1A038      NA         NA        
1C467      NA         NA        
2F179      1W960      1P119     
38138      1W960      1P119     
030081     64404      63666    

Delete the temporary file:

unlink(ff)
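
The lines printed above appear to have one space between the 10-character fields, which would put sireid at columns 12-21 rather than 11-20. If the exact 1-10, 11-20, 21-30 layout matters, here is a minimal base R sketch that pads each column with formatC, joins the columns without a separator, and reads the file back with read.fwf as a check. The data frame repeats the one from the question; the names padded, fwf_lines, out and check are only illustrative.

```r
# Data from the question (character columns kept as plain strings)
mydf <- data.frame(
  anim   = c("1A038", "1C467", "2F179", "38138", "030081"),
  sireid = c("NA", "NA", "1W960", "1W960", "64404"),
  damid  = c("NA", "NA", "1P119", "1P119", "63666"),
  stringsAsFactors = FALSE
)

# Left-justify every column and pad it to exactly 10 characters
padded <- lapply(mydf, formatC, width = 10, flag = "-")

# Join the padded columns with no separator: one 30-character line per row
fwf_lines <- do.call(paste0, padded)

# Write the lines to a temporary file
out <- tempfile(fileext = ".txt")
writeLines(fwf_lines, out)

# Read the file back, splitting at the same widths; colClasses = "character"
# keeps the leading zero in "030081"
check <- read.fwf(out, widths = c(10, 10, 10),
                  col.names = names(mydf),
                  colClasses = "character",
                  strip.white = TRUE)

unlink(out)
```

Note that formatC pads but does not truncate, so a value longer than 10 characters would shift the fields that follow it.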

6 Comments

I also had to handle fixed-width data in R. Note that there's also `read.fwf`.
@ran2 Quite right. There is a function read.fwf. I didn't want to use this in my example because I wanted to illustrate that each line is a single character string (read.fwf would have parsed the values.)
Hey, no offense. I did not mean to improve your example, only to add this comment for the sake of completeness ;) Particularly since it is from another package (utils), IIRC.
@Andrie: I am trying to get hold of the new data frame (with set column widths) and write it as a new file. How can I do this? ...I apologise for this question.
That's what write.fwf does for you - it writes your data to a file.

You can also write fixed width output for numbers and strings using the sprintf() function, which derives from C's counterpart.

For instance, to pad integers with 0s:

sprintf("%012d",99)

To pad with spaces:

sprintf("%12d",123)

And to pad strings:

sprintf("%20s","hello world")

The options for formatting are found via ?sprintf and there are many guides to formatting C output for fixed width.
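
Applied to the data in the question, this could look roughly like the sketch below. It relies on sprintf being vectorised and on the "-" flag for left justification; the variable names left, right and fwf_lines are only illustrative.

```r
anim   <- c("1A038", "1C467", "2F179", "38138", "030081")
sireid <- c("NA", "NA", "1W960", "1W960", "64404")
damid  <- c("NA", "NA", "1P119", "1P119", "63666")

# "%10s" right-justifies in a 10-character field, "%-10s" left-justifies
right <- sprintf("%10s", anim)
left  <- sprintf("%-10s", anim)

# One format string with three fields builds a whole 30-character line per row
fwf_lines <- sprintf("%-10s%-10s%-10s", anim, sireid, damid)

cat(fwf_lines, sep = "\n")
```

The width in the format string is a minimum, so sprintf pads short values but does not truncate long ones.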


It sounds like you're coming from a SAS background, where character variables should have explicit lengths specified to avoid unexpected truncations. In R, you don't need to worry about this. A character string has exactly as many characters as it needs, and automatically expands and contracts as its contents change.

One thing you should be aware of, though, is the silent conversion of character variables to factors in a data frame (this was the default before R 4.0.0; since then, stringsAsFactors defaults to FALSE). However, unless you change the contents at a later point in time, you should be able to live with the default.
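
To see which behaviour applies, a small sketch like the following checks the column classes and sets stringsAsFactors explicitly, so the result does not depend on the R version. The names mydf and as_factor are only illustrative.

```r
anim   <- c("1A038", "1C467", "2F179", "38138", "030081")
sireid <- c("NA", "NA", "1W960", "1W960", "64404")

# Ask for character columns explicitly instead of relying on the default
mydf <- data.frame(anim, sireid, stringsAsFactors = FALSE)
sapply(mydf, class)

# A column that did end up as a factor can be converted back
as_factor <- factor(anim)
back      <- as.character(as_factor)
```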

3 Comments

I am using R to prepare this data, which I am then going to run in another program. Thanks!
Ah, in that case, you're better off exporting it as a comma-delimited file (csv), rather than having fixed-width fields. While it's possible to export as fixed-width, it's probably more trouble than it's worth. Most programs will read csv files directly.
This is generally correct, however a fixed width file is useful for the purposes of memory mapping: one will know exactly where to look for data as the layout lends itself to a very simple mapping function. As a result, one need not index every line nor parse every line, in order to get random access to data.
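
The offset arithmetic behind that last comment can be sketched in base R. This is not memory mapping proper, only a demonstration that record i of a fixed-width file starts at a computable byte position. It assumes ASCII data, 30-character records and a one-byte "\n" line ending; read_record and record_len are hypothetical names.

```r
anim   <- c("1A038", "1C467", "2F179", "38138", "030081")
sireid <- c("NA", "NA", "1W960", "1W960", "64404")
damid  <- c("NA", "NA", "1P119", "1P119", "63666")
fwf_lines <- sprintf("%-10s%-10s%-10s", anim, sireid, damid)

# Write in binary mode so every line ends in exactly one "\n" byte
out <- tempfile()
con <- file(out, open = "wb")
writeLines(fwf_lines, con, sep = "\n")
close(con)

record_len <- 30 + 1  # 30 data characters plus the newline

# Jump straight to record i without reading the records before it
read_record <- function(path, i) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  seek(con, where = (i - 1) * record_len)
  readChar(con, nchars = 30)
}

rec <- read_record(out, 4)
unlink(out)
```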
