I have the following sample data frame:

YYYYMM q1 q2 q3 q4 q5 q6 q7 q8 q9 q0 d1 d2 d3 d4 d5
197501  2 11 12 26 25 10 29 21 30 22  8  7 14  4 13
197502 27 22  8 20  6 26 21  4 19  9 10  1 11 12 23
197503  8  7 21 22 25  9  4 30  2 19 10 11 28 12 27
197504 29 28 27 17 19  2 30 16 18  3  9 10 11  8 13
197505 11 15 12 31 28 24  1 30 13 18  5  6 16  7 20
197506 24 10 27  8 23 28 25 26  9 22  2 12 29 30  1

I tried to read it with pandas:

df1=pd.read_csv("Qdays_Ddays.docx",low_memory=False) #error_bad_lines=False)

but I get this error:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 2

How can I fix this?

  • stackoverflow.com/questions/53256091/… Commented Mar 11, 2022 at 5:32
  • UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte Commented Mar 11, 2022 at 5:53
  • Microsoft Word files are not plain text files. Save your data as a plain text file. Commented Mar 11, 2022 at 5:57
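
Following the suggestion to save the data as plain text, here is a minimal sketch of how the table could then be read. The file name `Qdays_Ddays.txt` is hypothetical, and the sketch assumes the columns are separated by runs of spaces, as in the sample shown in the question. It writes two sample rows to a temporary file first so that it runs on its own.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Two rows copied from the sample in the question, stored as plain text.
text = (
    "YYYYMM q1 q2 q3 q4 q5 q6 q7 q8 q9 q0 d1 d2 d3 d4 d5\n"
    "197501  2 11 12 26 25 10 29 21 30 22  8  7 14  4 13\n"
    "197502 27 22  8 20  6 26 21  4 19  9 10  1 11 12 23\n"
)

# Hypothetical file name; in practice this is the file exported from Word.
path = Path(tempfile.mkdtemp()) / "Qdays_Ddays.txt"
path.write_text(text)

# sep=r"\s+" treats any run of whitespace as one column separator.
df1 = pd.read_csv(path, sep=r"\s+")
print(df1.head())
```

The default separator of `read_csv` is a comma, so without the `sep` argument each line would be read as a single field.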

1 Answer

You can't read a .docx file with pandas directly; however, you can read it with python-docx:

import docx  # provided by the python-docx package
import pandas as pd

# Open the Word document
doc = docx.Document("test.docx")

# Collect the text of each paragraph in the file
result = [p.text for p in doc.paragraphs]
print(result)

# Convert the list to a DataFrame (one paragraph per row, in a single column)
df = pd.DataFrame(result)

# Optionally export it as a dictionary; the argument sets the orientation
df.to_dict('series')
# or
df.to_dict('split')
# or
df.to_dict('records')
# or
df.to_dict('index')
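
The DataFrame built above holds each paragraph as one string in a single column. If you want the separate columns from the question, a rough sketch like the following may help. It assumes every table row is its own paragraph, that the first non-empty paragraph is the header, and that all values are integers. The helper name `paragraphs_to_frame` is made up for illustration, and the `sample` list stands in for `[p.text for p in doc.paragraphs]`.

```python
import pandas as pd


def paragraphs_to_frame(paragraphs):
    """Split whitespace-separated paragraph texts into DataFrame columns."""
    # Skip empty paragraphs, then split each remaining line on whitespace.
    rows = [text.split() for text in paragraphs if text.strip()]
    header, *data = rows
    # Assumption: every value is an integer, as in the sample data.
    return pd.DataFrame(data, columns=header).astype(int)


# Stand-in for [p.text for p in doc.paragraphs]
sample = [
    "YYYYMM q1 q2 q3 q4 q5 q6 q7 q8 q9 q0 d1 d2 d3 d4 d5",
    "197501  2 11 12 26 25 10 29 21 30 22  8  7 14  4 13",
    "",
    "197502 27 22  8 20  6 26 21  4 19  9 10  1 11 12 23",
]

df = paragraphs_to_frame(sample)
print(df)
```

If the data sits in a Word table rather than in plain paragraphs, `doc.paragraphs` will not contain it and this approach would need to be adapted.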