Reading a CSV file with repeated row names in R

Question

I am trying to read a CSV file with repeated row names, but it fails with this error message: Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed.

The code I am using is:

S1N657 <- read.csv("S1N657.csv",header=T,fill=T,col.names=c("dam","anim","temp"))

An example of my data is given below:

did <- c("1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657")
aid <- c(101,102,103,104,105,106,107,108,109,110)
temp <- c(36,38,37,39,35,37,36,34,39,38)

data <- cbind(did,aid,temp)

Any help will be appreciated.

Does this answer your question? duplicate 'row.names' are not allowed error (Brian D, commented Apr 14, 2021 at 17:28)

doug · Accepted Answer · 2010-11-02 00:35:16Z

35

The function is seeing duplicate row names, so you need to deal with that. Probably the easiest way is row.names=NULL, which forces row numbering. In other words, it treats your first column as an ordinary data column (the first dimension) rather than as the row names, and so adds row numbers (consecutive integers starting with 1).

read.csv("S1N657.csv", header=T,fill=T, col.names=c("dam","anim","temp"), row.names=NULL)
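To make the effect of row.names=NULL concrete, here is a minimal sketch. It assumes a made-up inline CSV (`csv_text`, a hypothetical stand-in for S1N657.csv) whose header has one fewer name than the data rows have fields, which is one common way to trigger this error; the real file may differ.

```r
# Hypothetical stand-in for S1N657.csv: the header names 2 columns,
# but every data row has 3 fields.
csv_text <- "anim,temp\n1N657,101,36\n1N657,102,38\n1N657,103,37"

# Default behaviour: the first column is used for row names,
# and the repeated value triggers the error.
err <- tryCatch(
  read.csv(text = csv_text, header = TRUE),
  error = function(e) conditionMessage(e)
)
print(err)

# row.names = NULL forces automatic numbering and keeps the first column as data.
fixed <- read.csv(text = csv_text, header = TRUE, row.names = NULL)
print(names(fixed))

# Rename the columns once the extra one is visible (names taken from the question).
names(fixed) <- c("dam", "anim", "temp")
print(fixed)
```

Note that the kept first column arrives under a placeholder name, which is why a rename step follows the read.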

edited Nov 2, 2010 at 0:35

answered Nov 1, 2010 at 4:51


5 Comments

baz Over a year ago

That's right, doug! I see that it has treated my first column (dam IDs) as the first dimension, like you said. I excluded the [,-1] bit and then renamed my columns to take care of the extra one that was added. Thanks a lot!

VitoshKa Over a year ago

@Bazon, your header does not have a name for the first column. If you give it a name, the problem is solved automatically.

baz Over a year ago

Hi doug, shouldn't there be a comma before row.names=NULL, so that the script would be: read.csv("S1N657.csv", header=T,fill=T, col.names=c("dam","anim","temp"), row.names=NULL)

doug Over a year ago

Yes, thanks. It was a typo; I just edited to add the comma between the last two arguments.

Travis Heeter Over a year ago

row.names=NULL doesn't actually fix the problem, it just covers it up. Please add a suggestion to check that the number of headers matches the number of values.

kohske · Accepted Answer · 2010-11-01 04:29:06Z

3

try this:

S1N657 <- read.csv("S1N657.csv",header=T,fill=T,col.names=c("dam","anim","temp"), 
          row.names = NULL)[,-1]
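For readers wondering what the trailing [,-1] does, here is a small sketch using the example vectors from the question, assembled into a data frame (the name `dat` is only for illustration). A negative column index drops that column, so the first column (the dam IDs) is removed from the result.

```r
# Example data from the question, built as a data frame.
dat <- data.frame(
  did  = rep("1N657", 10),
  aid  = 101:110,
  temp = c(36, 38, 37, 39, 35, 37, 36, 34, 39, 38)
)

# The empty row index keeps all rows; -1 drops the first column.
without_first <- dat[, -1]
print(names(without_first))
```

If the first column is still needed, leave off [,-1] and rename the columns instead.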

answered Nov 1, 2010 at 4:29


5 Comments

baz Over a year ago

Hi kohske, that worked. Can you explain the last part of that code, [,-1], please? Thanks a lot!

baz Over a year ago

Hi kohske, upon running the script, I found out that the [,-1] part of the script removed the row names, that is, my dam ID (did).

kohske Over a year ago

Yes, you are right. If you need the first column (probably the duplicated names of each row), please remove [,-1] from the code above.

baz Over a year ago

kohske, I excluded the [,-1] part of the script as I still need my first column (dam IDs), and renamed my columns to take care of the extra one created. Thanks a lot!

Léo Léopold Hertz 준영 Over a year ago

I think it is better to use header = TRUE than to literally remove the first row.

chen · Accepted Answer · 2014-04-16 13:29:52Z

3

I am guessing your CSV file was converted from xlsx. Add a comma to the end of the first row, remove the last row, and you are done.

answered Apr 16, 2014 at 13:29


3 Comments

avalancha Over a year ago

Your answer does not seem to address the question that was asked and it is low quality. Please consider elaborating a bit more

George Liu Over a year ago

This is actually helpful. As explained above by Travis Heeter, this could be due to a column missing in the header. If that's the case, one way to resolve it is to open the file in a text editor, add a comma at the end of the first row, and save it. It should be fine afterwards.

Vesna Nov 4 at 12:19

This! I had the same problem, added a comma at the end of the last row name, removed the last row, and it suddenly read the file! Thank you so much.

Community · Accepted Answer · 2020-06-20 09:12:55Z

An issue I had recently was that the number of columns in the header row did not match the number of columns I had in the data itself. For example, my data was tab-delimited and all of the data rows had a trailing tab character. The header row (which I had manually added) did not.

I wanted the rows to be auto-numbered, but instead it was treating my first column as the row names. From the docs (emphasis added by me):

row.names a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names.

If there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names. Otherwise if row.names is missing, the rows are numbered.

Using row.names = NULL forces row numbering. Missing or NULL row.names generate row names that are considered to be ‘automatic’ (and not preserved by as.matrix).

Adding an extra tab character to the header row made the header row have the same number of columns as the data rows, thus solving the problem.
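A minimal sketch of that situation, assuming made-up tab-delimited text (`bad_text` and `good_text` are hypothetical, not the original file): every data row ends in a tab, and the only difference between the two inputs is whether the header row does too.

```r
# Hypothetical input: data rows end with a tab, the header row does not.
bad_text  <- "id\tvalue\n7\t10\t\n7\t20\t\n"

# Same data, with a trailing tab appended to the header row.
good_text <- "id\tvalue\t\n7\t10\t\n7\t20\t\n"

# Header has one fewer field than the data, so the first column is used
# for row names; the repeated 7 then raises the error.
bad <- tryCatch(
  read.delim(text = bad_text),
  error = function(e) conditionMessage(e)
)
print(bad)

# With matching field counts, rows are numbered automatically.
good <- read.delim(text = good_text)
print(good)
```

The matched header produces an extra, empty trailing column in the result, which can be dropped afterwards if it is not wanted.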

Travis Heeter · Accepted Answer · 2016-10-05 15:48:11Z

1

In short, check your column names. If your first row is the names of columns, you may be missing one or more names.

Example:

"a","b","c"
a,b,c,d
a,b,c,d

The example above will cause the duplicate 'row.names' error because each row has 4 values but only 3 columns are named, so R uses the first column for the row names and finds the repeated value a.
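One way to check for this mismatch before reading the file is count.fields() from the utils package, which reports the number of fields on each line. The sketch below writes the example above to a temporary file (`tmp` is just an illustrative name); in practice you would point it at your own CSV path.

```r
# Write the example above to a temporary file for demonstration.
tmp <- tempfile(fileext = ".csv")
writeLines(c('"a","b","c"', "a,b,c,d", "a,b,c,d"), tmp)

# Number of comma-separated fields on each line, header first.
n_fields <- count.fields(tmp, sep = ",")
print(n_fields)

header_fields <- n_fields[1]
data_fields   <- unique(n_fields[-1])

# Report a mismatch between the header and the data rows, if any.
if (any(data_fields != header_fields)) {
  message("Header has ", header_fields, " fields; data rows have ",
          paste(data_fields, collapse = ", "))
}
```

If the counts differ by exactly one, read.csv will silently use the first column for the row names, which is where the duplicate error comes from.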

This happened to me when I was building a CSV from an online resource.

answered Oct 5, 2016 at 15:48


Nathan Kuhn · Accepted Answer · 2019-07-17 20:20:31Z

1

I was getting the same "duplicate 'row.names' are not allowed" error for a small CSV. The problem was that somewhere outside of the 14x14 chart area I wanted there was a random cell with a space/other data.

I discovered the cause when I ran it with row.names = NULL: there were multiple rows of blank data below my table (and therefore multiple duplicate row names, all blank).

Solution was to delete all rows/columns outside the table area, and it worked!

answered Jul 17, 2019 at 20:20


SkyR · Accepted Answer · 2018-11-13 10:24:13Z

0

In my case the problem came from the Excel file. Although it seemed perfectly organized, it did not work and I always got the message: Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed.

I copied and pasted my Excel matrix into a new, empty Excel sheet and tried reading it again: it worked! No more error message!

answered Nov 13, 2018 at 10:24

