I'm just starting to dive into PySpark.
I have a dataset, a sample of which I'll show below, to illustrate a query I can't figure out how to write.
This is a sample of the actual dataset, which contains roughly 20K rows. I'm reading the CSV file into a DataFrame in the PySpark shell and trying to convert some basic SQL queries to PySpark to get some hands-on practice. Below is one query I haven't been able to write:
1. Which country has the fewest Government Types (4th column)?
There are other queries I've come up with myself that I can write in SQL, but I'm stuck on this one. Once I understand how to approach it, I should be able to work out the others on my own.
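To make the question concrete, here is roughly what I mean in plain SQL, as a minimal sketch run through Python's sqlite3 module rather than Spark. The rows and the table name `data` are made up for illustration; only the `Country` and `Government` column names match my data. I'm also assuming that "least number of Government Type" means the fewest distinct `Government` values per country.

```python
import sqlite3

# Hypothetical rows for illustration only; the real dataset has ~20K rows
# and more columns. Only the column names Country and Government are real.
rows = [
    ("A", "Democratic"),
    ("A", "Monarchy"),
    ("A", "Democratic"),
    ("B", "Democratic"),
    ("C", "Republic"),
    ("C", "Monarchy"),
    ("C", "Federal"),
]

# In-memory database with a hypothetical table named "data".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (Country TEXT, Government TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?)", rows)

# Count distinct government types per country, then keep the smallest count.
# The secondary sort on Country only makes ties deterministic.
query = """
    SELECT Country, COUNT(DISTINCT Government) AS n_types
    FROM data
    GROUP BY Country
    ORDER BY n_types ASC, Country ASC
    LIMIT 1
"""
result = conn.execute(query).fetchone()
print(result)
```

What I can't work out is how to express the same group, count, and sort steps with DataFrame operations.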
This is the only PySpark line I've managed to write after a lot of trial and error:
df.filter(df.Government=='Democratic').select('Country').show()
I'm not sure how to approach this problem statement. Any ideas?
