I'm unsure how best to proceed in the following situation: I would like to compare two datasets. One (model_data) is a raster (NetCDF format) with lat, lon and time as coordinates and a spatial resolution of 0.5 degrees. The other (observed_data) is a geolocated table that I converted to a vector layer using a shapefile as a reference (the shapefile consists of multiple polygons representing counties).
My approach so far has been to rasterize the vectorized dataset at a very high resolution (0.0083 degrees) and then upscale it to the same 0.5-degree grid found in model_data, i.e. vector -> raster (0.0083) -> raster (0.5). To upscale I have tried different techniques, from aggregate + mean to CDO remapping operators such as bilinear interpolation and first-order conservative mapping.
However, I'm not entirely happy with the results so far, and I wonder whether there are any flaws in this approach. I have also thought of going the other way around: converting model_data into polygons or a table and then doing the analysis at the county level.
Question: Are there any flaws in the approach described above, and is there an alternative solution to the problem?
OBS: I am happy to use Python, R, or NetCDF-specific tools such as CDO for this.