I have a Spark DataFrame in PySpark and I want to store its schema in another Spark DataFrame.
For example:
I have a sample DataFrame df that looks like -
+---+-------------------+
| id| v|
+---+-------------------+
| 0| 0.4707538108432022|
| 0|0.39170676690905415|
| 0| 0.8249512619546295|
| 0| 0.3366111661094958|
| 0| 0.8974360488327017|
+---+-------------------+
I can look at the schema of df by doing -
df.printSchema()
root
|-- id: integer (nullable = true)
|-- v: double (nullable = false)
What I need is a DataFrame that shows the above schema information about df in two columns, col_name and dtype.
Expected Output:
+---------+-------------------+
| col_name| dtype|
+---------+-------------------+
| id| integer|
| v| double|
+---------+-------------------+
How do I achieve this? I cannot find anything regarding this. Thanks.
You can get this information from df.dtypes, which returns the schema as a list of (column name, type) tuples.
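A minimal sketch, assuming an active SparkSession named spark and that df already exists: pass df.dtypes to createDataFrame to build the two-column DataFrame. Note that df.dtypes uses the short type names (e.g. 'int' rather than 'integer'); if you want the longer names printed by printSchema, you can build the rows from df.schema instead, as in the second variant below.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # assumes a SparkSession is available

# df.dtypes is a list of (column_name, type) tuples, e.g. [('id', 'int'), ('v', 'double')]
schema_df = spark.createDataFrame(df.dtypes, ["col_name", "dtype"])
schema_df.show()

# To match the names shown by printSchema ('integer' instead of 'int'),
# build the rows from the StructFields in df.schema:
schema_df2 = spark.createDataFrame(
    [(f.name, f.dataType.typeName()) for f in df.schema.fields],
    ["col_name", "dtype"],
)
schema_df2.show()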