I am trying to read the contents of an Oracle table into a dataframe, which I then need to compare against another dataframe.

One string value in Oracle contains the Unicode character “ü”, which shows up as a plain 'u' in the dataframe.

Code I tried:

import pandas as pd
import cx_Oracle

conn = cx_Oracle.makedsn(host='hostname', port='1521', service_name='SomeName')
sqlconn = cx_Oracle.connect(user='Username', password='$$$$$', dsn=conn)
sqlquery = "Select statement"
df2 = pd.read_sql(sqlquery, sqlconn)

print(df2)

Actual output:
**UBERX**,2003-10-01 00:00:00,I,N/A,Not Available

Expected output:
**ÜBERX**,2003-10-01 00:00:00,I,N/A,Not Available

If I export the output to CSV:

df2.to_csv('/home/user/05June_1_ORA.csv', index=False)

In the Unix directory where the file was written, it is detected as ASCII:

bash-4.2$ file -i *
05June_1_ORA.csv: text/plain; charset=us-ascii

This data is ingested into Oracle from a CSV file whose encoding is UTF-8:

sourcefile_05June_1.csv:     text/plain; charset=utf-8
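
To illustrate the difference between the two `file -i` results: a UTF-8 file is reported as us-ascii when it contains no bytes above 127, so the us-ascii result suggests the “Ü” was already replaced before the CSV was written. A minimal sketch, where `detect_charset` is a hypothetical helper that only roughly imitates `file -i` (it is not that tool's actual algorithm) and the two sample lines are copied from the outputs above:

```python
def detect_charset(data: bytes) -> str:
    """Rough stand-in for `file -i`: classify bytes as us-ascii, utf-8, or unknown."""
    try:
        data.decode("ascii")
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"


# Both lines are encoded as UTF-8; only the second contains a non-ASCII character.
exported = "UBERX,2003-10-01 00:00:00,I,N/A,Not Available\n".encode("utf-8")
source = "ÜBERX,2003-10-01 00:00:00,I,N/A,Not Available\n".encode("utf-8")

print(detect_charset(exported))
print(detect_charset(source))
```

In other words, the us-ascii result does not by itself mean the export used the wrong encoding; it means no non-ASCII character reached the file.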

How can I resolve this?

  • What version of python & pandas are you using? And you're positive that the ü character is in the SQL database? Commented Jun 8, 2020 at 4:31
  • Python 3.6.5, pandas '0.23.0'. Yes, the Oracle DB has the same value. Commented Jun 8, 2020 at 11:42

1 Answer

When you connect to the database, make sure you set the encoding explicitly. UTF-8 will become the default in cx_Oracle 8, but for now, do this:

sqlconn = cx_Oracle.connect(user='Username', password='$$$$$', dsn=conn,
        encoding="UTF-8", nencoding="UTF-8")
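
After reconnecting, one way to confirm that the change took effect is to scan the fetched rows for non-ASCII characters. A minimal sketch, assuming the rows are available as a list of plain tuples (for a dataframe, something like `list(df2.itertuples(index=False, name=None))` should give that shape). The helper `non_ascii_values` and the sample rows are hypothetical and only mirror the values from the question:

```python
def non_ascii_values(rows):
    """Return every string value in rows that contains a non-ASCII character."""
    found = []
    for row in rows:
        for value in row:
            # ord(ch) > 127 works on Python 3.6, where str.isascii() is unavailable.
            if isinstance(value, str) and any(ord(ch) > 127 for ch in value):
                found.append(value)
    return found


# Hypothetical rows: what the question reports, and what it expects instead.
rows_before = [("UBERX", "2003-10-01 00:00:00", "I", "N/A", "Not Available")]
rows_after = [("ÜBERX", "2003-10-01 00:00:00", "I", "N/A", "Not Available")]

print(non_ascii_values(rows_before))
print(non_ascii_values(rows_after))
```

An empty result for a column that is known to hold “ü” in Oracle would suggest the characters are still being replaced somewhere between the database and the dataframe.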

3 Comments

Great! Glad it was helpful!
@Rohit As far as I know, Anthony Tuininga is the author of cx_Oracle ...
Yes, I am. :-)
