
I have a netCDF file which contains temperature data over some location. The data shape is 1450x900.

I am creating search functionality in my app to locate temperature data by lat/lon values.

So I extracted the lat and lon coordinate data from the netCDF file, but I was expecting them to be 1D arrays; instead I got 2D arrays of shape 1450x900 for both coordinates.

So my question: why are they 2D arrays instead of 1450 latitude values and 900 longitude values? Don't 1450 lat values and 900 lon values describe the whole grid?

Let's say we have a 4x5 grid; the indices for locating its rightmost, bottom-most point will be [4, 5]. So my indices for x will be [1, 2, 3, 4] and for y: [1, 2, 3, 4, 5]. Nine indices in total are enough to locate any point on that grid (consisting of 20 cells). So why do the lat (x) and lon (y) coordinates in the netCDF file contain 20 values each (40 in total), instead of 4 and 5 values respectively (9 in total)? Hope you get what confuses me.
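To make this concrete, here is a minimal numpy sketch of what I had assumed (toy numbers, not my real data):

import numpy as np

# Toy version of the 4x5 example: 4 x values and 5 y values.
x = np.array([1, 2, 3, 4])
y = np.array([1, 2, 3, 4, 5])

# "Zipping" the two 1D axes into every combination gives the full grid of
# 20 points, which is why I assumed 1D lat and lon arrays would be enough.
xx, yy = np.meshgrid(x, y)
print(xx.shape, yy.shape)   # (5, 4) (5, 4)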

Is it possible to somehow map those 2D arrays and "downgrade" them to 1450 latitude values and 900 longitude values? Or is it OK as it is right now? How can I use those values for my purpose? Do I need to zip the lat/lon arrays?

Here are the shapes:

>>> DS = xarray.open_dataset('file.nc')
>>> DS.tasmin.shape
    (31, 1450, 900)
>>> DS.projection_x_coordinate.shape
    (900,)
>>> DS.projection_y_coordinate.shape
    (1450,)
>>> DS.latitude.shape
    (1450, 900)
>>> DS.longitude.shape
    (1450, 900)

Note that projection_x_coordinate and projection_y_coordinate are easting/northing values, not lat/lon.

Here is the metadata of the file, if needed:

Dimensions:                       (bnds: 2, projection_x_coordinate: 900, projection_y_coordinate: 1450, time: 31)
Coordinates:
  * time                          (time) datetime64[ns] 2018-12-01T12:00:00 ....
  * projection_y_coordinate       (projection_y_coordinate) float64 -1.995e+0...
  * projection_x_coordinate       (projection_x_coordinate) float64 -1.995e+0...
    latitude                      (projection_y_coordinate, projection_x_coordinate) float64 ...
    longitude                     (projection_y_coordinate, projection_x_coordinate) float64 ...
Dimensions without coordinates: bnds
Data variables:
    tasmin                        (time, projection_y_coordinate, projection_x_coordinate) float64 ...
    transverse_mercator           int32 ...
    time_bnds                     (time, bnds) datetime64[ns] ...
    projection_y_coordinate_bnds  (projection_y_coordinate, bnds) float64 ...
    projection_x_coordinate_bnds  (projection_x_coordinate, bnds) float64 ...
Attributes:
    comment:        Daily resolution gridded climate observations
    creation_date:  2019-08-21T21:26:02
    frequency:      day
    institution:    Met Office
    references:     doi: 10.1002/joc.1161
    short_name:     daily_mintemp
    source:         HadUK-Grid_v1.0.1.0
    title:          Gridded surface climate observations data for the UK
    version:        v20190808
    Conventions:    CF-1.5
  • Each latitude and each longitude value is unique, if I understood your question correctly. Commented Jul 30, 2020 at 9:22
  • @Bart So basically, if I zip those lat/lon variable values, I will get points for each location, right? But my question is: why do I need 1,305,000 values for each of lat and lon? Don't 1450 lat values and 900 lon values describe the whole grid? Each latitude value is unique, and I don't get where those latitude values come from for just a 1450x900 grid. Commented Jul 30, 2020 at 13:17
  • You have provided no information about the contents of the NetCDF file. It is impossible for anyone to provide an accurate answer to this question. Commented Jul 30, 2020 at 14:05
  • @RobertWilson I think I've worked out which file it is - see link in my answer. (To download a copy you would have to register for a CEDA account unless you have one already.) Commented Aug 2, 2020 at 19:00
  • @RobertWilson There is in fact also an equivalent OPeNDAP URL (with dodsC instead of fileServer), but to use it requires a .dodsrc file pointing to a valid short-lived credential for a CEDA user, so I didn't link to it as it is rather out of scope here -- the OP clearly has the data already, so this is really just for other people's reference. At least the fileServer link will take you interactively to register for an account if needed -- although it is a large file to download (~330MB). Commented Aug 2, 2020 at 19:09

2 Answers


Your data adheres to version 1.5 of the Climate and Forecast conventions.

The document describing this version of the conventions is here, although the relevant section is essentially unchanged across many versions of the conventions.

See section 5.2:

5.2. Two-Dimensional Latitude, Longitude, Coordinate Variables

The latitude and longitude coordinates of a horizontal grid that was not defined as a Cartesian product of latitude and longitude axes, can sometimes be represented using two-dimensional coordinate variables. These variables are identified as coordinates by use of the coordinates attribute.
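In xarray terms this means latitude and longitude are attached to tasmin as 2D auxiliary (non-dimension) coordinates, while the projection axes remain 1D dimension coordinates. A quick check against your own file (a sketch reusing the DS object from your question):

import xarray

DS = xarray.open_dataset('file.nc')   # same file as in the question

# The dimension coordinates of tasmin are the 1D projection axes...
print(DS.tasmin.dims)      # ('time', 'projection_y_coordinate', 'projection_x_coordinate')

# ...while latitude and longitude are 2D auxiliary coordinates defined on both
# horizontal dimensions -- exactly what CF section 5.2 describes.
print(DS.latitude.dims)    # ('projection_y_coordinate', 'projection_x_coordinate')
print(DS.longitude.dims)   # ('projection_y_coordinate', 'projection_x_coordinate')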

It looks like you are using the HadOBS 1km resolution gridded daily minimum temperature, and this file in particular:

http://dap.ceda.ac.uk/thredds/fileServer/badc/ukmo-hadobs/data/insitu/MOHC/HadOBS/HadUK-Grid/v1.0.1.0/1km/tasmin/day/v20190808/tasmin_hadukgrid_uk_1km_day_20181201-20181231.nc (warning: >300MB download)

As it states, the data is on a transverse Mercator grid.

If you look at output from ncdump -h <filename> you will also see the following description of the grid expressed by means of attributes of the transverse_mercator dummy variable:

        int transverse_mercator ;
                transverse_mercator:grid_mapping_name = "transverse_mercator" ;
                transverse_mercator:longitude_of_prime_meridian = 0. ;
                transverse_mercator:semi_major_axis = 6377563.396 ;
                transverse_mercator:semi_minor_axis = 6356256.909 ;
                transverse_mercator:longitude_of_central_meridian = -2. ;
                transverse_mercator:latitude_of_projection_origin = 49. ;
                transverse_mercator:false_easting = 400000. ;
                transverse_mercator:false_northing = -100000. ;
                transverse_mercator:scale_factor_at_central_meridian = 0.9996012717 ;

and you will also see that the coordinate variables projection_x_coordinate and projection_y_coordinate have units of metres.
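If you want to work with this projection programmatically, recent versions of pyproj (2.2 or later) can build a CRS object straight from these CF grid-mapping attributes. A minimal sketch, assuming pyproj is installed:

import xarray as xr
from pyproj import CRS

ds = xr.open_dataset("tasmin_hadukgrid_uk_1km_day_20181201-20181231.nc")

# Build a pyproj CRS from the CF grid-mapping attributes stored on the
# transverse_mercator dummy variable.
crs = CRS.from_cf(ds.transverse_mercator.attrs)
print(crs)   # a transverse Mercator CRS matching the parameters shown above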

The grid in question is the Ordnance Survey UK grid using numeric grid references.
See for example this description of the OS grid (from Wikipedia).
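As an aside: if your end goal is only to look up values at a single lat/lon point (rather than regridding the whole field), you don't need to interpolate the 2D arrays at all. You can transform the query point to easting/northing and select the nearest cell on the 1D projection coordinates. A sketch, assuming pyproj is available and taking EPSG:27700 (British National Grid) as a stand-in for the grid defined above:

import xarray as xr
from pyproj import Transformer

ds = xr.open_dataset("tasmin_hadukgrid_uk_1km_day_20181201-20181231.nc")

# Convert the query point from lon/lat (EPSG:4326) to easting/northing (EPSG:27700).
to_osgb = Transformer.from_crs("EPSG:4326", "EPSG:27700", always_xy=True)
lon, lat = -2.0, 51.8                      # example query point
x, y = to_osgb.transform(lon, lat)

# Select the nearest grid cell using the 1D projection coordinates.
point = ds.tasmin.sel(projection_x_coordinate=x,
                      projection_y_coordinate=y,
                      method="nearest")
print(point.values)                        # daily minimum temperatures at that cell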

If you wish to express the data on a regular longitude-latitude grid then you will need to do some type of interpolation. I see that you are using xarray. You can combine this with pyresample to do the interpolation. Here is an example:

import xarray as xr
import numpy as np
from pyresample.geometry import SwathDefinition
from pyresample.kd_tree import resample_nearest, resample_gauss

ds = xr.open_dataset("tasmin_hadukgrid_uk_1km_day_20181201-20181231.nc")

# Define a target grid. For sake of example, here is one with just 
# 3 longitudes and 4 latitudes.
lons = np.array([-2.1, -2., -1.9])
lats = np.array([51.7, 51.8, 51.9, 52.0])

# The target grid is regular (1-d lon, lat coordinates) but we will need
# a 2d version (similar to the input grid), so use numpy.meshgrid to produce this.
lon2d, lat2d = np.meshgrid(lons, lats)

origin_grid = SwathDefinition(lons=ds.longitude, lats=ds.latitude)
target_grid = SwathDefinition(lons=lon2d, lats=lat2d)

# get a numpy array for the first timestep
data = ds.tasmin[0].to_masked_array()

# nearest neighbour interpolation example
# Note that radius_of_influence has units metres

interpolated = resample_nearest(origin_grid, data, target_grid, radius_of_influence=1000)

# GIVES:
#      array([[5.12490065, 5.02715332, 5.36414835],
#             [5.08337723, 4.96372838, 5.00862833],
#             [6.47538931, 5.53855722, 5.11511239],
#             [6.46571817, 6.17949381, 5.87357538]])


# gaussian weighted interpolation example
# Note that radius_of_influence and sigmas both have units metres

interpolated = resample_gauss(origin_grid, data, target_grid, radius_of_influence=1000, sigmas=1000)

# GIVES:
#      array([[5.20432465, 5.07436805, 5.39693221],
#             [5.09069187, 4.8565934 , 5.08191639],
#             [6.4505963 , 5.44018209, 5.13774416],
#             [6.47345359, 6.2386732 , 5.62121948]])
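If you want to keep working in xarray afterwards, you can wrap the interpolated numpy array back into a DataArray on the regular lon/lat target grid. This sketch continues directly from the variables defined in the example above:

# Wrap the (4, 3) interpolated array with the 1D target coordinates.
tasmin_ll = xr.DataArray(
    interpolated,
    coords={"latitude": lats, "longitude": lons},
    dims=("latitude", "longitude"),
    name="tasmin",
)

# Point lookups on the regular grid are then straightforward:
print(tasmin_ll.sel(latitude=51.8, longitude=-2.0, method="nearest").values)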

I figured out the answer myself.

As it turns out, the 2D lat/lon arrays are used to define the "grid" of the location.

In other words, if we zip the lat/lon values and project them onto a map, we get a "curved grid" (in other words, the Earth's curvature is taken into account) over the location, which is then used as the grid reference for the location.
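A quick way to see this in practice (a small check against the same file): if a single 1D latitude axis could describe the grid, latitude would be constant along each row of the 2D array (fixed northing). It is not, because the grid is curved in lat/lon space:

import xarray

DS = xarray.open_dataset('file.nc')

# Latitude is not constant along a row (fixed northing, varying easting)...
print(DS.latitude[0, :].values[:5])

# ...and longitude is not constant along a column (fixed easting, varying northing),
# which is why both have to be stored as full 2D arrays.
print(DS.longitude[:, 0].values[:5])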

Hope it's clear for anyone interested.

