
With SparkR 1.4.1, I am working with a data frame of the following structure:

printSchema(dta)
root
 |-- date: timestamp (nullable = true)
 |-- valA: float (nullable = true)
 |-- valB: float (nullable = true)
 |-- ...

I would like to convert all of the existing columns to strings, without explicitly referring to each column by name.

Desired approach

The desired approach would loop over all columns:

# Quickly creating a new data frame
dtaTmp <- select(dta, "date")

# Looping through each column of the old data frame and adding a string
# equivalent to the newly created data frame
for (i in seq_along(columns(dtaTmp))) {
    print(i)
    x <- cast(dtaTmp[[columns(dtaTmp)[i]]], "string")
    dtaTmp <- withColumn(dtaTmp, columns(dtaTmp)[i], x)
}

This fails with the error: returnStatus == 0 is not TRUE. In effect, I'm looking for a solution that would let me run the equivalent of sapply(mtcars, as.character) on a SparkR data frame.
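
For reference, a minimal local-R sketch of the intended transformation on the built-in mtcars data (localStr is just an illustrative name); the goal is the distributed equivalent of this:

# Coerce every column of a local data frame to character
localStr <- as.data.frame(sapply(mtcars, as.character),
                          stringsAsFactors = FALSE)
str(localStr)  # every column now has class "character"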

Desired results

The new data frame should be of the following structure:

printSchema(desiredDta)
root
 |-- date: string (nullable = true)
 |-- valA: string (nullable = true)
 |-- valB: string (nullable = true)
 |-- ...

1 Answer

You're hitting a bug in the 1.4 branch where withColumn retains duplicated column names. The simplest solution is to use a single select with a list of columns:

select(df, lapply(columns(df), function(x) alias(cast(df[[x]], "string"), x)))
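
Applied to the data frame from the question, the result can be checked with printSchema (dtaStr is just an illustrative name for the converted frame); it should match the structure requested above:

dtaStr <- select(dta, lapply(columns(dta),
                             function(x) alias(cast(dta[[x]], "string"), x)))
printSchema(dtaStr)
# root
#  |-- date: string (nullable = true)
#  |-- valA: string (nullable = true)
#  |-- valB: string (nullable = true)
#  ...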
