I'm trying to convert some of my PySpark code to Scala to improve performance. In AWS Glue (which uses Apache Spark) a script is automatically generated for you, and it typically uses the DynamicFrame object to load, transform, and write data out. However, the DynamicFrame class does not have all of the same functionality as the DataFrame class, so at times you have to convert between the two to perform certain operations. Below is how I convert from a DynamicFrame to a DataFrame and back in PySpark:
# PySpark version
from awsglue.dynamicframe import DynamicFrame

# datasource is a DynamicFrame object
datasource0 = datasource.toDF().limit(5000000)
applymapping1 = DynamicFrame.fromDF(datasource0, glueContext, "applymapping1")
Is there an equivalent of fromDF in the Scala API to convert a DataFrame back to a DynamicFrame object?
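For reference, this is the shape of what I'm hoping exists in Scala. The final `DynamicFrame(...)` call is my guess at how the Glue Scala API might accept a DataFrame; I have not confirmed that this compiles or runs:

```scala
import com.amazonaws.services.glue.{DynamicFrame, GlueContext}
import org.apache.spark.sql.DataFrame

// datasource is a DynamicFrame; convert to a DataFrame to use limit(),
// then convert back -- the last step is what I'm asking about
val limited: DataFrame = datasource.toDF().limit(5000000)

// Guess: perhaps the DynamicFrame companion object (or a constructor)
// accepts a DataFrame and a GlueContext, analogous to fromDF in PySpark
val applymapping1: DynamicFrame = DynamicFrame(limited, glueContext)
```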