15

I have a pandas DataFrame that I'm inserting into an SQL database. I'm using psycopg2 directly to talk to the database, not SQLAlchemy, so I can't use pandas' built-in to_sql function. Almost everything works as expected, except that NumPy np.NaN values are converted to the text NaN and inserted into the database. They really should be treated as SQL NULL values.

So I'm trying to write a custom adapter that converts np.NaN to SQL NULL, but everything I've tried results in the same NaN strings being inserted into the database.

The code I'm currently trying is:

def adapt_nans(null):
    a = adapt(None).getquoted()
    return AsIs(a)

register_adapter(np.NaN, adapt_nans)

I've tried a number of variations along this theme but haven't had any luck.

2
  • Personally I'd say that NaN should not be converted to NULL since they're not the same thing at all, but I can imagine contexts where I guess it could make sense. I'd use a BEFORE INSERT OR UPDATE ... FOR EACH ROW ... trigger to transform them. Commented Aug 20, 2015 at 13:46
  • I do understand the general differences between NaN and NULL, however, in this particular case they really are the same thing. The data is read into the dataframe from a flat file and where there is missing data Pandas inserts a NaN. Commented Aug 20, 2015 at 16:56

4 Answers

17

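The code I was trying previously fails because it assumes that np.NaN is its own type, when it is actually a float value. Adapters are registered per type, so the adapter has to be registered for float and check for NaN itself. The following code, courtesy of Daniele Varrazzo on the psycopg2 mailing list, does the job correctly.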

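import numpy as np
import psycopg2.extensions

def nan_to_null(f,
        _NULL=psycopg2.extensions.AsIs('NULL'),
        _Float=psycopg2.extensions.Float):
    # Adapt NaN to SQL NULL; every other float uses the standard Float adapter.
    if not np.isnan(f):
        return _Float(f)
    return _NULL

# Route every float parameter through nan_to_null.
psycopg2.extensions.register_adapter(float, nan_to_null)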

5 Comments

The function didn't seem to work with np.float64. No clue why. Changing if f is not _NaN: to if not np.isnan(f): fixed it. Otherwise perfect!
@JensdeBruijn I wonder if the NaN type for float64 is different than the NaN type for a regular float.
I am surprised too. If not that, could it be pandas? Using the above statement fixed my error. I just want to show it as an option if someone runs into the same problem.
I had to use the small fix proposed by @JensdeBruijn, now works great
Had to use the small fix as well. I proposed an edit.
5

If you are trying to insert Pandas dataframe data into PostgreSQL and getting the error for NaN, all you have to do is:

import psycopg2

output_df = output_df.fillna(psycopg2.extensions.AsIs('NULL'))

#Now insert output_df data in the table

2 Comments

what is the equivalent with psycopg 3?
TypeError: Field 'zzz' expected a number but got <psycopg2.extensions.AsIs object at 0x7f6bd5dbdb70>.
4

This answer is an alternate version of Gregory Arenius's answer. I have replaced the conditional so that it works on any NaN value by checking whether the value is equal to itself.

import psycopg2.extensions

def nan_to_null(f,
        _NULL=psycopg2.extensions.AsIs('NULL'),
        _Float=psycopg2.extensions.Float):
    # NaN is the only float value that compares unequal to itself.
    if f != f:
        return _NULL
    else:
        return _Float(f)

psycopg2.extensions.register_adapter(float, nan_to_null)

If you check whether a NaN value is equal to itself, you will get False. The rationale behind why this works is explained in detail in Stephen Canon's answer.
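
The self-comparison trick can be checked without psycopg2 or NumPy. The sketch below uses only the standard library; `is_nan` and `samples` are illustrative names, not part of the adapter above.

```python
import math


def is_nan(x):
    # IEEE 754: NaN compares unequal to everything, including itself.
    return x != x


samples = [0.0, 1.5, math.inf, -math.inf, math.nan, float("nan")]
flags = [is_nan(x) for x in samples]

for value, flag in zip(samples, flags):
    print(value, flag)
```

Infinities and ordinary numbers compare equal to themselves, so only the NaN entries are flagged.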

1

I believe the easiest way is:

df.where(pd.notnull(df), None)

Then None is "translated" to NULL when the data is inserted into Postgres.

1 Comment

This only works if the column is not a numeric / int type.
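
For numeric columns, one possible workaround is to do the replacement after the data has left pandas, on the plain row tuples that are passed to the cursor. This is only a sketch under assumptions: `nan_to_none`, `clean_rows` and the sample `rows` are hypothetical, and the rows are assumed to come from something like `df.itertuples(index=False, name=None)`. psycopg2 sends None as NULL, so the cleaned rows can go to `cursor.executemany` as they are.

```python
import math


def nan_to_none(value):
    # Floats (including subclasses such as np.float64) holding NaN become None.
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def clean_rows(rows):
    # Build new tuples so the input rows are left untouched.
    return [tuple(nan_to_none(v) for v in row) for row in rows]


# Hypothetical rows standing in for dataframe contents.
rows = [(1, 2.5, "a"), (2, float("nan"), "b"), (3, 4.0, None)]
cleaned = clean_rows(rows)
print(cleaned)
```

Because the conversion happens on Python objects rather than inside the dataframe, the column dtype no longer matters.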
