27

I am trying to read a CSV file with repeated row names, but the read fails. The error message I get is: Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed.

The code I am using is:

S1N657 <- read.csv("S1N657.csv",header=T,fill=T,col.names=c("dam","anim","temp"))

An example of my data is given below:

did <- c("1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657")
aid <- c(101,102,103,104,105,106,107,108,109,110)
temp <- c(36,38,37,39,35,37,36,34,39,38)

data <- cbind(did,aid,temp)

Any help will be appreciated.


7 Answers

35

The function is seeing duplicate row names, so you need to deal with that. Probably the easiest way is row.names=NULL, which forces row numbering. In other words, it treats your first column as the first dimension and not as the row names, and so adds row numbers (consecutive integers starting with 1).

read.csv("S1N657.csv", header = TRUE, fill = TRUE, col.names = c("dam", "anim", "temp"), row.names = NULL)
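
To see the effect in isolation, here is a minimal sketch. The file contents are made up (a header with two names over rows with three fields), since the question does not show the actual file; the point is only that the default read fails on the repeated first field while row.names = NULL keeps it as an ordinary column.

```r
# Hypothetical file: the header names two columns, each data row has three
# fields, and the first field repeats on every row.
path <- tempfile(fileext = ".csv")
writeLines(c("anim,temp",
             "1N657,101,36",
             "1N657,102,38",
             "1N657,103,37"), path)

# Default call: the first column is taken as row names, so the repeated
# IDs raise the error. tryCatch() captures the message instead of stopping.
res <- tryCatch(read.csv(path), error = function(e) conditionMessage(e))
print(res)

# row.names = NULL numbers the rows and keeps the first column as data.
dat <- read.csv(path, row.names = NULL)

# The first column gets a placeholder name, so rename the columns afterwards.
names(dat) <- c("dam", "anim", "temp")
print(dat)
```

Note that this only works around the mismatch between header and data; it does not tell you why the header is one field short.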

5 Comments

That's right, Doug! I see that it has treated my first column (dam IDs) as the first dimension, like you said. I excluded the [,-1] bit and then renamed my columns to take care of the extra one that was added. Thanks a lot!
@Bazon, your header does not have a name for the first column. If you give it a name, the problem is solved automatically.
Hi Doug, shouldn't there be a comma before row.names=NULL, so that the script would be: read.csv("S1N657.csv", header=T,fill=T, col.names=c("dam","anim","temp"), row.names=NULL)?
Yes, thanks. That was a typo; I just edited to add the comma between the last two arguments.
row.names=NULL doesn't actually fix the problem, it just covers it up. Please add a suggestion to check that the number of headers matches the number of values.
3

Try this:

S1N657 <- read.csv("S1N657.csv", header = TRUE, fill = TRUE,
                   col.names = c("dam", "anim", "temp"),
                   row.names = NULL)[, -1]

5 Comments

Hi kohske, that worked. Can you explain the last part of that code, [,-1], please? Thanks a lot!
Hi kohske, upon running the script, I found that the [,-1] part removed the row names, i.e. my dam ID (did).
Yes, you are right. If you need the first column (probably the duplicated names of each row), please remove [,-1] from the code above.
kohske, I excluded the [,-1] part of the script as I still need my first column (dam IDs), and renamed my columns to take care of the extra one created. Thanks a lot!
I think it is better to use header = TRUE than to literally remove the first row.
3

I'm guessing your CSV file was converted from xlsx. Add a comma to the end of the first row, remove the last row, and you're done.

3 Comments

Your answer does not seem to address the question that was asked and it is low quality. Please consider elaborating a bit more.
This is actually helpful. As explained above by Travis Heeter, this could be due to a column missing in the header. If that's the case, one way to resolve it is to open the file in a text editor, add a comma at the end of the first row and save it. It should be fine afterwards.
This! I had the same problem, added a comma at the end of the last row name, removed the last row, and it suddenly read the file! Thank you so much.
2

An issue I had recently was that the number of columns in the header row did not match the number of columns I had in the data itself. For example, my data was tab-delimited and all of the data rows had a trailing tab character. The header row (which I had manually added) did not.

I wanted the rows to be auto-numbered, but instead it was treating my first column as the row names. From the docs (emphasis added by me):

row.names a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names.

If there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names. Otherwise if row.names is missing, the rows are numbered.

Using row.names = NULL forces row numbering. Missing or NULL row.names generate row names that are considered to be ‘automatic’ (and not preserved by as.matrix).

Adding an extra tab character to the header row made the header row have the same number of columns as the data rows, thus solving the problem.
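
A rough sketch of how this mismatch could be checked and repaired from within R, using count.fields() from the utils package. The file contents here are invented for illustration, and the repair simply appends one tab to the header line, as described above.

```r
# Hypothetical tab-delimited file: data rows end with a trailing tab,
# the hand-written header does not.
path <- tempfile(fileext = ".tsv")
writeLines(c("anim\ttemp",
             "101\t36\t",
             "101\t38\t"), path)

# Fields per line: the first entry is the header, the rest are data rows.
n <- count.fields(path, sep = "\t")
print(n)

# If the header is exactly one field short, pad it with one more tab.
if (max(n[-1]) - n[1] == 1) {
  lines <- readLines(path)
  lines[1] <- paste0(lines[1], "\t")
  writeLines(lines, path)
}

# Header and data now agree, so the rows are numbered automatically.
dat <- read.delim(path)
print(dat)
```

The trailing tab leaves an extra, empty column at the end of the result, which can be dropped afterwards if it is not wanted.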


1

In short, check your column names. If your first row is the names of columns, you may be missing one or more names.

Example:

"a","b","c"
a,b,c,d
a,b,c,d

The example above will cause a row.names error because each row has 4 values, but only 3 columns are named.

This happened to me when I was building a CSV from an online resource.
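
If editing the file is not an option, one possible workaround is to skip the short header and supply a full set of names yourself. This is only a sketch using the example file above; the fourth name, "d", is my own placeholder and not something from the original data.

```r
# Recreate the example: three quoted header names, four values per row.
path <- tempfile(fileext = ".csv")
writeLines(c('"a","b","c"',
             "a,b,c,d",
             "a,b,c,d"), path)

# The default read uses the first column as row names and fails,
# because the value "a" appears on both rows.
err <- tryCatch(read.csv(path), error = function(e) conditionMessage(e))
print(err)

# Skip the incomplete header line and name all four columns explicitly.
# "d" is a hypothetical name for the unnamed fourth column.
dat <- read.csv(path, header = FALSE, skip = 1,
                col.names = c("a", "b", "c", "d"))
print(dat)
```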


1

I was getting the same "duplicate 'row.names' are not allowed" error for a small CSV. The problem was that somewhere outside of the 14x14 chart area I wanted there was a random cell with a space/other data.

I discovered the answer when I ran it with row.names = NULL and saw multiple rows of blank data below my table (and therefore multiple duplicate row names, all blank).

Solution was to delete all rows/columns outside the table area, and it worked!
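
The same cleanup can be sketched in R rather than in the spreadsheet. The file below is invented, and I am assuming the first column was meant to be the row names (row.names = 1), which the answer does not state explicitly.

```r
# Hypothetical export: a two-row table followed by blank rows that a stray
# cell outside the table area left behind.
path <- tempfile(fileext = ".csv")
writeLines(c("id,x,y",
             "r1,1,2",
             "r2,3,4",
             ",,",
             ",,"), path)

# Assumption: the first column holds the row names. The blank rows all
# share the empty name, so the read fails.
err <- tryCatch(read.csv(path, row.names = 1),
                error = function(e) conditionMessage(e))
print(err)

# Read with automatic numbering so the blank rows become visible.
dat <- read.csv(path, row.names = NULL)

# A row counts as blank when every field is NA or empty after trimming.
is_blank <- apply(dat, 1, function(r) all(is.na(r) | trimws(r) == ""))
clean <- dat[!is_blank, ]

# With the blank rows gone, the first column is unique again.
rownames(clean) <- clean$id
clean$id <- NULL
print(clean)
```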


0

In my case the problem came from the Excel file. Although it seemed perfectly organized, it did not work, and I always got the message: Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed.

I copy-pasted my Excel matrix into a new, empty Excel sheet and tried to read it again: it worked! No more error message!
