1
list = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

I need the above JSON loaded into a DataFrame. I tried the following, but it doesn't work:

df = pd.DataFrame.from_dict(list, orient = 'index')
display(df)

Error:

TypeError: field 0: Can not merge type <class 'pyspark.sql.types.LongType'> and <class 'pyspark.sql.types.StringType'>

1
  • Please do not overwrite the built-in list! Try to name your variable differently. Commented Apr 15, 2021 at 8:44

4 Answers

2
data = pd.DataFrame([list])

For more info on converting JSON to pandas DataFrames, please check out the link below :)

https://pandas.pydata.org/docs/reference/api/pandas.read_json.html
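As a minimal sketch of the one-liner above: wrapping the dictionary in a list makes pandas treat it as a single row, with one column per key. The name `record` is my own choice (not from the question), used here to avoid shadowing the built-in `list`.

```python
import pandas as pd

# Same data as in the question; `record` is an illustrative name.
record = {'masterId': 2, 'name': 'name', 'description': 'xyz',
          'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1,
          'min': -125, 'max': 125, 'isDeprecated': False}

# A list of dicts is read as a list of rows: one dict gives one row.
df = pd.DataFrame([record])

print(df.shape)          # (1, 9)
print(list(df.columns))  # columns follow the dict's key order
```

Because each key becomes its own column, every column holds values of a single type.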


0

You need to wrap the dictionary in a list before creating the dataframe:

data = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = spark.createDataFrame([data])

df.show()
+-----------+------------+--------+---+----+----+---------+---------------+------------------+
|description|isDeprecated|masterId|max| min|name|precision|signalTypeRefId|unitOfMeasureRefId|
+-----------+------------+--------+---+----+----+---------+---------------+------------------+
|        xyz|       false|       2|125|-125|name|        1|              4|                 1|
+-----------+------------+--------+---+----+----+---------+---------------+------------------+

Or you can convert it to a pandas dataframe and create a Spark dataframe from that, though you still need to wrap the dictionary in a list:

data = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = spark.createDataFrame(pd.DataFrame([data]))

df.show()
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
|masterId|name|description|signalTypeRefId|unitOfMeasureRefId|precision| min|max|isDeprecated|
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
|       2|name|        xyz|              4|                 1|        1|-125|125|       false|
+--------+----+-----------+---------------+------------------+---------+----+---+------------+


0


dct = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = pd.DataFrame.from_dict(dct, orient="index")
display(df)

"""
                        0
masterId                2
name                 name
description           xyz
signalTypeRefId         4
unitOfMeasureRefId      1
precision               1
min                  -125
max                   125
isDeprecated        False
"""

To have it as a row, use .transpose()

df.transpose()
"""
Out[15]: 
  masterId  name description     ...        min  max isDeprecated
0        2  name         xyz     ...       -125  125        False
"""
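
A possible explanation for the error in the question, offered as a sketch rather than a confirmed diagnosis: with orient="index" all values land in one column that mixes integers, strings and a boolean. The traceback suggests that display on Databricks converts the pandas frame to a Spark DataFrame, and Spark cannot settle on one type for such a column. The pandas-only snippet below shows the mixed column and uses infer_objects() after transposing so each column gets its own dtype; `record`, `by_index` and `as_row` are illustrative names.

```python
import pandas as pd

record = {'masterId': 2, 'name': 'name', 'description': 'xyz',
          'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1,
          'min': -125, 'max': 125, 'isDeprecated': False}

# orient="index" turns each key into a row label, so all values
# share a single column (labelled 0) of dtype object.
by_index = pd.DataFrame.from_dict(record, orient="index")
print(by_index[0].dtype)

# Collect the Python types that ended up in that one column.
value_types = {type(v).__name__ for v in by_index[0]}
print(value_types)

# Transposing gives one row, but every column is still dtype object;
# infer_objects() asks pandas to choose a better dtype per column.
as_row = by_index.transpose().infer_objects()
print(as_row.dtypes)
```

Whether this resolves the Spark error depends on the environment, so treat it as something to try, not a guaranteed fix.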

3 Comments

I am getting the following error: /databricks/spark/python/pyspark/sql/pandas/conversion.py:300: UserWarning: createDataFrame attempted Arrow optimization because 'spark.sql.execution.arrow.pyspark.enabled' is set to true; however, failed by the reason below: an integer is required (got type str) Attempting non-optimization as 'spark.sql.execution.arrow.pyspark.fallback.enabled' is set to true. warnings.warn(msg) TypeError: field 0: Can not merge type <class 'pyspark.sql.types.LongType'> and <class 'pyspark.sql.types.StringType'>
I don't have Spark installed. It seems to be some strange interference between packages ...
Maybe, if you have conda, create a new conda environment, install only Python and pandas there, and try it out.
0
data = pd.json_normalize(list)

https://pandas.pydata.org/docs/reference/api/pandas.json_normalize.html
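
For reference, a short sketch of the same call with a non-shadowing variable name (`record` is illustrative, not from the question). pd.json_normalize accepts a single dict as well as a list of dicts, and for a flat dict like this one it returns a single row:

```python
import pandas as pd

record = {'masterId': 2, 'name': 'name', 'description': 'xyz',
          'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1,
          'min': -125, 'max': 125, 'isDeprecated': False}

# A single flat dict is normalized into a one-row DataFrame,
# with one column per key.
df = pd.json_normalize(record)

print(df.shape)  # (1, 9)
print(df.dtypes)
```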
