
I have a dataframe that looks like

 |-- alleleFrequencies: array (nullable = true)
 |    |-- element: double (containsNull = true)

Each alleleFrequencies value is an array of doubles.

I wish to get this data into a numpy array, which I have naively done thus:

allele_freq1 = np.array(df1.select("alleleFrequencies").collect())

but this gives

[[list([0.5, 0.5])]
 [list([0.5, 0.5])]
 [list([1.0])]...

which isn't the simple 1D array I want.

I've also tried

allele_freq1 = np.array(df1.select("alleleFrequencies")[0].collect())

but this gives

TypeError: 'Column' object is not callable

I've also tried

allele_freq1 = np.array(df1.select("alleleFrequencies[0]").collect())

but this gives

org.apache.spark.sql.AnalysisException: cannot resolve '`alleleFrequencies[0]`' given input columns...

How can I get the first item in the column alleleFrequencies placed into a numpy array?

I checked "How to extract an element from a array in pyspark", but I don't see how the solution there applies to my situation.

  • Possible duplicate of How to extract an element from a array in pyspark Commented Nov 8, 2019 at 20:32
  • @pault the first suggestion you gave fails with an error that it cannot resolve the column name, and the link you gave gives no useful information. Commented Nov 8, 2019 at 20:33
  • OK, then use pyspark.sql.functions.col and getItem (as shown in the link I gave): np.array(df1.select(col("alleleFrequencies").getItem(0)).collect()). "No useful information" is a pretty broad statement. Commented Nov 8, 2019 at 20:36
  • Thanks @pault, your last comment using pyspark.sql.functions.col gets the job done: np.array(df1.select(col("alleleFrequencies").getItem(0)).collect()) Commented Nov 8, 2019 at 20:37
  • That's exactly what's contained in the duplicate I linked. I bet selectExpr would probably work here too: np.array(df1.selectExpr("alleleFrequencies[0]").collect()) Commented Nov 8, 2019 at 20:37

1 Answer

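import numpy as np
from pyspark.sql.functions import col
# getItem(0) selects the first element of each alleleFrequencies array
allele_freq1 = np.array(df1.select(col("alleleFrequencies").getItem(0)).collect())
print(allele_freq1)
print(type(allele_freq1))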
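
Since collect() returns one Row per record, the array built this way should have shape (n, 1) rather than being flat. If a true 1D array is needed, a minimal sketch follows. The helper names rows_to_1d and first_allele_frequencies are hypothetical, and the sample rows are plain tuples standing in for collected Row objects (an assumption made so the flattening step can be tried without a Spark session). Because the schema allows null elements, None is mapped to NaN here; that choice is also an assumption.

```python
import numpy as np


def rows_to_1d(rows):
    """Flatten collected single-column rows into a 1D float array.

    Each collected Row behaves like a tuple, so row[0] is the selected value.
    A null first element (None) is mapped to NaN.
    """
    return np.array(
        [np.nan if row[0] is None else row[0] for row in rows], dtype=float
    )


def first_allele_frequencies(df):
    """Collect alleleFrequencies[0] from a Spark DataFrame as a 1D array."""
    # Imported lazily so rows_to_1d can be used without Spark installed.
    from pyspark.sql.functions import col

    rows = df.select(col("alleleFrequencies").getItem(0)).collect()
    return rows_to_1d(rows)


# Stand-in for collected rows: plain tuples instead of pyspark Row objects.
sample_rows = [(0.5,), (0.5,), (1.0,), (None,)]
result = rows_to_1d(sample_rows)
print(result.shape)  # (4,)
```

With a real DataFrame, first_allele_frequencies(df1) would replace the np.array(...collect()) call. Keep in mind that collect() pulls every row to the driver, so this only suits data that fits in driver memory.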