I want to extract specific indices from a NetCDF file to create a 3-dimensional array (and then take the mean of these). However, ncvar_get from the R package ncdf4 interprets the extra values as additional dimensions rather than as multiple indices.

library(ncdf4)

# Open NetCDF file (choosing this one as easy to access)
url <- "https://psl.noaa.gov/thredds/dodsC/Datasets/ncep.reanalysis.dailyavgs/surface_gauss/tmax.2m.gauss.2022.nc"
nc <- nc_open(url)

Extracting time steps 1 to 4 is no problem:

# Extract continuous time array
nc_continuous <- ncvar_get(nc, "tmax", start = c(1, 1, 1), count = c(-1, -1, 4))  

But I actually want just time steps 3, 5, 8 and 100:

# Extract specific non-continuous time indices (3rd, 5th, 8th, 100th)
time_indices <- c(3, 5, 8, 100)

# Attempt 1: pass the indices in start (does not work)
nc_non_cont <- ncvar_get(nc, "tmax", start = c(1, 1, time_indices), count = c(-1, -1, 1))

Trying it on the count side instead:

# Attempt 2: pass the indices in count (does not work either)
nc_non_cont <- ncvar_get(nc, "tmax", start = c(1, 1, 1), count = c(-1, -1, time_indices))

Neither works. I think this is easy but am just using the wrong syntax.

2 Answers

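Read all time steps, then subset along the third (time) dimension:

# count = -1 reads a dimension in full
x <- ncvar_get(nc, "tmax", start = c(1, 1, 1), count = c(-1, -1, -1))

dim(x)
# lon x lat x time, i.e. 192 x 94 x (number of time steps)

# then subset the time dimension
x[, , time_indices]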

Or loop, if the data is too big:

lapply(time_indices, function(i) {
  # time is the third dimension: start at step i and read exactly one step
  ncvar_get(nc, "tmax", start = c(1, 1, i), count = c(-1, -1, 1))
})
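
To make the dimension order concrete: ncvar_get returns this variable as [lon, lat, time] (hence the 192 x 94 x 4 shape above), so the time indices belong in the third position. The following is only a base-R sketch on a made-up toy array (the sizes 4 x 3 x 10 are an assumption for illustration, not the real grid); it shows the subset and the per-cell mean the question asks for.

```r
# Toy array standing in for a [lon, lat, time] variable (sizes are made up)
x <- array(seq_len(4 * 3 * 10), dim = c(4, 3, 10))
time_indices <- c(3, 5, 8)

# Subset the third (time) dimension; lon and lat are kept whole.
# drop = FALSE keeps the result 3-dimensional even for a single index.
sel <- x[, , time_indices, drop = FALSE]
dim(sel)

# Mean over the selected time steps for every lon/lat cell
sel_mean <- rowMeans(sel, dims = 2)
dim(sel_mean)
```

rowMeans(..., dims = 2) treats the first two dimensions as "rows" and averages over everything after them, which here is the time axis.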

1 Comment

Thanks, but the data I am working with is very large, so reading it all in and then subsetting is slow. My understanding is that one of the main benefits of NetCDF files is the ability to read in only what you need. I believe this is possible and I am sure I have done it before, but either this function doesn't allow it or I am using it incorrectly.
1

You need to make multiple calls to the THREDDS server, each getting only one time slice, then merge all the slices once you have read them all.

library(ncdf4)
library(abind)

# Open NetCDF file (choosing this one as easy to access)
url <- "https://psl.noaa.gov/thredds/dodsC/Datasets/ncep.reanalysis.dailyavgs/surface_gauss/tmax.2m.gauss.2022.nc"
nc <- nc_open(url)

# Extract specific non-continuous time indices (3rd, 5th, 8th, 100th)
time_indices <- c(3, 5, 8, 100)

# Now read individual slices by index
d <- lapply(time_indices,
            function(t) ncvar_get(nc, "tmax", start = c(1, 1, t), count = c(-1, -1, 1)))

# Now bind the list of individual slices to an array
d <- abind(d, along = 3)

# Wrap up
nc_close(nc)
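
A possible variation on the approach above, offered as a sketch rather than as part of the original answer: when some of the requested indices are consecutive, they can be grouped into runs so that each run needs only one ncvar_get call. To keep the example runnable without network access it writes a small local file first; the file, its dimension sizes and the index vector c(3, 5, 6, 7, 12) are all made up for illustration. The same start/count pattern should carry over to a handle opened from a THREDDS URL, but that is untested here. It assumes time_indices is sorted in increasing order.

```r
library(ncdf4)

# Build a tiny local NetCDF file (hypothetical stand-in for the remote dataset)
lon  <- ncdim_def("lon",  "degrees_east",  1:4)
lat  <- ncdim_def("lat",  "degrees_north", 1:3)
time <- ncdim_def("time", "days",          1:20)
tmax <- ncvar_def("tmax", "K", list(lon, lat, time), missval = -999, prec = "double")
path <- tempfile(fileext = ".nc")
nc <- nc_create(path, tmax)
ncvar_put(nc, tmax, array(seq_len(4 * 3 * 20), dim = c(4, 3, 20)))
nc_close(nc)

# Sorted, non-contiguous time indices to extract
time_indices <- c(3, 5, 6, 7, 12)

# Group the indices into runs of consecutive values: 3 | 5 6 7 | 12
runs <- split(time_indices, cumsum(c(1, diff(time_indices) != 1)))

nc <- nc_open(path)
slices <- lapply(runs, function(r) {
  # One request per run; collapse_degen = FALSE keeps the time dimension
  # even when the run has length 1
  ncvar_get(nc, "tmax", start = c(1, 1, r[1]), count = c(-1, -1, length(r)),
            collapse_degen = FALSE)
})
nc_close(nc)

# Time is the last (slowest-varying) dimension, so concatenating the values
# and re-dimensioning stacks the runs along time, like abind(..., along = 3)
d <- array(unlist(slices, use.names = FALSE),
           dim = c(dim(slices[[1]])[1:2], length(time_indices)))

# Per-grid-cell mean over the selected time steps
d_mean <- rowMeans(d, dims = 2)
```

Whether grouping helps in practice depends on how many indices are adjacent; for fully scattered indices it reduces to one call per time step, as in the answer above.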
