
For a stream read from FileStore, I'm trying to check whether the value in the first column of the first row is equal to some string. Unfortunately, whenever I access this column in any way, e.g. by calling .toList() on it, it throws:

    if df["Name"].iloc[0].item() == "Bob":
TypeError: 'Column' object is not callable

I'm calling the customProcessing function from:

    df.writeStream\
      .format("delta")\
      .foreachBatch(customProcessing)\
[...]

Inside this function I'm trying to get the value, but none of the ways of getting the data work; the same error is thrown.

    def customProcessing(df, epochId):
      
      if df["Name"].iloc[0].item() == "Bob":
[...]

Is there a way to read single columns? Or is this writeStream-specific, meaning I'm unable to use conditions on that input?

  • @mck thanks! That works. It was odd that the error message does not suggest that iloc is unsupported here. If you write an answer, I'll mark it as the solution. Thanks! Commented Dec 17, 2020 at 11:21

1 Answer


There is no iloc for Spark DataFrames - this is not pandas; there is also no concept of an index.

If you want to get the first item, you could try:

    df.select('Name').limit(1).collect()[0][0] == "Bob"
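
Putting that together with the foreachBatch handler from the question, a minimal sketch could look like the following (the function and column names follow the question; the empty-batch guard is my own assumption, since a micro-batch may contain no rows):

    from pyspark.sql import DataFrame

    def customProcessing(batch_df: DataFrame, epoch_id: int) -> None:
        # collect() on a one-row projection returns a list of Row objects;
        # a micro-batch can be empty, so guard before indexing (assumption).
        rows = batch_df.select("Name").limit(1).collect()
        if rows and rows[0][0] == "Bob":
            # handle the "Bob" case here
            pass

Note that collect() brings the selected rows to the driver, which is fine here because the projection is limited to a single row.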


