I have a Spark DataFrame in PySpark and I want to store its schema in another Spark DataFrame.

For example, I have a sample DataFrame df that looks like this:

+---+-------------------+
| id|                  v|
+---+-------------------+
|  0| 0.4707538108432022|
|  0|0.39170676690905415|
|  0| 0.8249512619546295|
|  0| 0.3366111661094958|
|  0| 0.8974360488327017|
+---+-------------------+

I can inspect the schema of df with:

df.printSchema()

root
 |-- id: integer (nullable = true)
 |-- v: double (nullable = false)

What I need is a DataFrame that displays the above information about df in two columns, col_name and dtype.

Expected Output:

+---------+-------------------+
| col_name|              dtype|
+---------+-------------------+
|       id|            integer|
|        v|             double|
+---------+-------------------+

How do I achieve this? I cannot find anything regarding this. Thanks.

2 Comments
  • parallelize df.dtypes (Commented Oct 23, 2019 at 16:44)
  • I got the desired result with spark.createDataFrame(df.dtypes, ["col_name", "dtypes"]). Thanks. What do you mean by parallelize? (Commented Oct 23, 2019 at 16:54)

1 Answer

The simplest approach is to create a DataFrame from df.dtypes:

spark.createDataFrame(df.dtypes, ["col_name", "dtype"]).show()
#+--------+------+
#|col_name| dtype|
#+--------+------+
#|      id|   int|
#|       v|double|
#+--------+------+

But if you want the dtype column to match what printSchema shows, you can build it from df.schema instead:

spark.createDataFrame(
    [(d['name'], d['type']) for d in df.schema.jsonValue()['fields']],
    ["col_name", "dtype"]
).show()
#+--------+-------+
#|col_name|  dtype|
#+--------+-------+
#|      id|integer|
#|       v| double|
#+--------+-------+
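
One caveat with the df.schema approach: for complex columns (arrays, structs, maps), the "type" entry in the JSON is itself a dict rather than a string, so the list comprehension above would hand Spark rows of mixed types. The following is a minimal sketch of one way to flatten those to a plain type name. The helper name schema_rows and the extra tags column are hypothetical, and the sample dict is hand-written to mirror what df.schema.jsonValue() returns, so the snippet runs without a Spark session.

```python
def schema_rows(schema_json):
    """Turn a StructType JSON dict into (col_name, dtype) tuples.

    Hypothetical helper: for complex columns only the top-level type
    name ("array", "struct", "map") is kept, not the nested details.
    """
    rows = []
    for field in schema_json["fields"]:
        dtype = field["type"]
        # Simple types are plain strings ("integer", "double", ...);
        # complex types are dicts that carry their own "type" key.
        if isinstance(dtype, dict):
            dtype = dtype["type"]
        rows.append((field["name"], dtype))
    return rows


# Hand-written stand-in for df.schema.jsonValue(); "tags" is an assumed
# extra column added only to illustrate the complex-type case.
schema_json = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "integer", "nullable": True, "metadata": {}},
        {"name": "v", "type": "double", "nullable": False, "metadata": {}},
        {
            "name": "tags",
            "type": {"type": "array", "elementType": "string", "containsNull": True},
            "nullable": True,
            "metadata": {},
        },
    ],
}

rows = schema_rows(schema_json)
print(rows)  # [('id', 'integer'), ('v', 'double'), ('tags', 'array')]

# With a live session, the rows could then be passed on as before:
# spark.createDataFrame(schema_rows(df.schema.jsonValue()), ["col_name", "dtype"]).show()
```

For a DataFrame with only simple column types, as in the question, this gives the same rows as the list comprehension above.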