I'm trying to extract weather data from a netCDF file based on a variable. The .nc file contains 14 variables and 2 dimensions. I would like to extract the data of all 14 variables for a given value of the first variable ('station'). The data is from the Dutch Meteorological Institute and can be found here.
The data is loaded in Python using the netCDF4 module like this:
import netCDF4 as nc
filename = r'path/file.nc'
dataset = nc.Dataset(filename)
Printed variables and dimensions:
dataset.variables.keys()
Out[67]: odict_keys(['station', 'time', 'lat', 'lon', 'DDVEC', 'FHVEC', 'TG', 'RH', 'UG', 'EV24', 'PG', 'iso_dataset', 'product', 'projection'])
dataset.dimensions.keys()
Out[68]: odict_keys(['station', 'time'])
I would like to extract the data for a specific 'station' and put it in a pandas DataFrame to perform some calculations.
I tried something like this to extract the data. I know this isn't how netCDF files work, but I can't figure out the right way.
df = dataset['344',:,:,:,:,:,:,:,:,0,0,0,0,0]
Summary question: is there a way to extract data for a certain station and put it in a pandas DataFrame?
SOLUTION
import pandas as pd
import xarray as xr
# Open netCDF file and convert to dataframe
open_netcdf = xr.open_dataset(filename)
dataset = open_netcdf.to_dataframe()
# The DataFrame has a (station, time) MultiIndex; keep only the rows for station '391'
df = dataset.loc[dataset.index.get_level_values('station') == '391', :]
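An alternative is to select the station while the data is still an xarray object, so that only that station's data is converted to a DataFrame. The sketch below uses a small synthetic dataset in place of the real file: the values, the three-day time axis, and the assumption that station codes are stored as plain strings are all made up for illustration. In the real file the 'station' values may have a different type (for example bytes or integers), in which case the label passed to sel would need to match that type.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real file: two dimensions (station, time)
# and two of the data variables. All values are invented for illustration.
stations = ["344", "391"]
times = pd.date_range("2020-01-01", periods=3, freq="D")
ds = xr.Dataset(
    data_vars={
        "TG": (("station", "time"), np.array([[5.1, 6.0, 4.2], [3.9, 4.4, 2.8]])),
        "RH": (("station", "time"), np.array([[0.0, 1.2, 0.4], [0.3, 0.0, 2.1]])),
    },
    coords={"station": stations, "time": times},
)

# Select one station by label before converting. Only the time dimension
# remains, so the resulting DataFrame is indexed by time alone.
df_station = ds.sel(station="391").to_dataframe()

# Same rows taken from the full DataFrame, addressing the index level by name.
full = ds.to_dataframe()
df_from_full = full.xs("391", level="station")
```

Both df_station and df_from_full hold the rows for station '391'. Selecting with sel first avoids building the full station-by-time table, which may matter for a large file.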