I have a .text file in the following format, where the fields (index number, name, and message) are tab-separated (\t):

712 ben     Battle of the Books
713 james   i used to be in TOM
714 tomy    i was in BOB once
715 ben Tournaments of Minds
716 tommy    Also the Lion in the upcoming school play
717 tommy   Can you guess
718 tommy    P
...

which I read with read_csv into a data frame:

 chat = pd.read_csv("f.text", sep = "\t", header = None, usecols = [2])

But the data frame has only 9812 rows, while the original file has more than 12428 lines (only 21 of them empty). This is quite weird. Do you have any idea why? Thanks.

  • Can you post a download link to your data? It is difficult to answer here without posting guesses, which is counter-productive. Commented Feb 24, 2016 at 9:34
  • Very weird. Maybe the lineterminator parameter of read_csv is needed. Or you can try adding index_col=None. How do you check the length of df? With print len(df)? Commented Feb 24, 2016 at 9:43
  • @jezrael just print df. It shows the row count under the table. Same result with len(df). Commented Feb 24, 2016 at 10:02
  • Hmmm, interesting. If you omit usecols, is the length still wrong? Commented Feb 24, 2016 at 10:11
  • Hmmm, try skipping rows, like chat = pd.read_csv("f.text", skiprows=9810, sep = "\t", header = None, usecols = [2]), then maybe check the columns with print df.columns and the index with print df.index. Commented Feb 24, 2016 at 11:35

1 Answer

I think you need to add the quoting parameter:

import csv

import pandas as pd

# QUOTE_NONE makes the parser treat double quotes as ordinary characters
chat = pd.read_csv("f.text", sep="\t", header=None, usecols=[2], quoting=csv.QUOTE_NONE)
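
As for why this helps, here is a likely explanation. It is an assumption, since the file itself is not shown. By default, read_csv uses minimal quoting with the double quote as the quote character. If a message begins with a double quote, the parser reads everything up to the next double quote as one field, tabs and line breaks included, so several lines of the file collapse into a single row. A minimal sketch with made-up sample data (not the asker's file):

```python
import csv
import io

import pandas as pd

# Made-up sample for illustration: four tab-separated lines.
# The message on line 1 starts with a double quote; the next double
# quote only appears on line 3.
raw = (
    '1\tben\t"unbalanced quote starts here\n'
    '2\tjames\tplain message\n'
    '3\ttommy\tclosing quote" ends it\n'
    '4\tben\tlast line\n'
)

# Default quoting: the quoted field swallows the line breaks in between.
default_df = pd.read_csv(io.StringIO(raw), sep="\t", header=None, usecols=[2])

# QUOTE_NONE: double quotes are ordinary characters, one row per line.
raw_df = pd.read_csv(
    io.StringIO(raw), sep="\t", header=None, usecols=[2], quoting=csv.QUOTE_NONE
)

# Compare the row counts of the two reads.
print(len(default_df), len(raw_df))
```

If the two lengths also differ on your own file, stray quote characters in the message column are a plausible culprit.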

2 Comments

jezrael, can you actually explain why this works, i.e. why the read without the quoting parameter dropped lines? Otherwise it's not a reusable resource for other users.
OMG, this saved me! It looks like the default behavior for read_csv() expects everything to be wrapped in quotes. But if it is a tab separated file with no quotes, then you need to specify such, otherwise the data parsing goes awry
