I get a MemoryError when I try to convert a large NetCDF4 file into a pandas DataFrame. I don't need everything in the file, though, so I'd like to know whether I can subset it first and only then convert it to a DataFrame.

My file looks like this: [image not available]

xr is the xarray library. The time variable contains every hour from 2019-01-01 to 2019-01-31. Unfortunately I can't filter on the Copernicus website, but I only need the values at 09:00:00.

Do you know how I could do this, using xarray or some other way?

Thanks

1 Answer

You can use sel with a datetime.time value to keep only the time steps at a given hour of the day, before converting to a DataFrame:

import pandas as pd
import xarray as xr
import datetime

# Load a demo dataset
ds = xr.tutorial.load_dataset('air_temperature')

# Keep only the 12:00 time steps (the demo data is 6-hourly; use datetime.time(9) for 09:00)
df = ds.sel(time=datetime.time(12)).to_dataframe()

Output:

>>> df
                                       air
lat  time                lon              
75.0 2013-01-01 12:00:00 200.0  242.299988
                         202.5  242.199997
                         205.0  242.299988
                         207.5  242.500000
                         210.0  242.889999
...                                    ...
15.0 2014-12-31 12:00:00 320.0  296.889984
                         322.5  296.589996
                         325.0  295.690002
                         327.5  295.489990
                         330.0  295.190002

[967250 rows x 1 columns]
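
Applied to the case in the question (hourly data, 09:00 only), the same idea could look roughly like the sketch below. It has not been run against the actual Copernicus file: the dataset is a small synthetic stand-in, and the variable name t2m and the coordinate names latitude and longitude are assumptions. It uses a boolean mask on the hour of day, which is an alternative to passing datetime.time to sel. With a real file, opening it with xr.open_dataset (which reads values lazily) and subsetting before calling to_dataframe should keep memory usage down.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real file (assumption): hourly steps for
# January 2019 on a tiny 2 x 3 grid. With a real file you would start
# from something like ds = xr.open_dataset("your_file.nc") instead.
time = pd.date_range("2019-01-01", periods=31 * 24, freq=pd.Timedelta(hours=1))
lat = [50.0, 50.25]
lon = [2.0, 2.25, 2.5]
values = np.random.default_rng(0).normal(size=(len(time), len(lat), len(lon)))

ds = xr.Dataset(
    {"t2m": (("time", "latitude", "longitude"), values)},  # hypothetical variable name
    coords={"time": time, "latitude": lat, "longitude": lon},
)

# Boolean mask that is True only for the 09:00 time steps
mask = (ds["time"].dt.hour == 9).values

# Subset first, then convert: only 1/24 of the time steps reach pandas
ds_9am = ds.isel(time=mask)
df = ds_9am.to_dataframe()

print(df.head())
```

If the DataFrame is still too large after this step, selecting only the needed variables or a latitude/longitude range before calling to_dataframe reduces it further.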