This guide will walk you through how to:
xarray
, a standard Python tool for working with scientific dataThis guide will use standard Python tools as well as official ECMWF tooling for GRIB.
First you need to download all the GRIB files of the forecast model run you wish to process. It is up to you to do this.
The following commands shows how you could download a HARMONIE NEA SF forecast on Linux with the command line tools curl
and jq
:
# Latest model run:
curl https://dmigw.govcloud.dk/v1/forecastdata/collections/harmonie_nea_sf/items?api-key=??? | jq '[.features[].properties.modelRun]|unique|sort|.[-1]'
# Retrieve download links from latest model run, paste the output of the previous command into where it says ?modelRun\=
curl https://dmigw.govcloud.dk/v1/forecastdata/collections/harmonie_nea_sf/items\?modelRun\=???\&api-key=??? | jq -r '.features[]|[.asset.data.href,.id]|@csv' > latest_model_run_links
# Download the actual GRIB files
cat latest_model_run_links | awk -F, '{print $1, $2}' | xargs -L1 sh -c 'curl $0 > $1'
Note: you need to use your own API key in the curl
commands.
Dependencies are most easily installed through conda
, official conda documentation.
You need to install the following packages in your conda environment of choice:
conda install -c conda-forge xarray dask netCDF4 bottleneck eccodes cfgrib
DMI’s HARMONIE model uses some parameters defined by DMI themselves. In order for our tooling to read and understand
these parameters, it needs these configuration tables.
This link explains how to install them by
downloading a tarball, extracting the folder and defining the environment variable ECCODES_DEFINITION_PATH
to point
at the extracted folder.
By installing the eccodes
package in the “Setup” section of this guide, you now have access to the ECMWF GRIB tools.
These are command line tools for working with GRIB.
Using grib_ls
on one of your downloaded GRIB files will show metadata about contained parameters, which levels the
parameter exists in, what type of levels are used etc.
In our example we will work with the temperature parameter, so using grib_ls
and looking for the temperature
parameter (which has short name t
) we will see:
> grib_ls -p typeOfLevel,level,shortName HARMONIE_NEA_SF_20XX-XX-XXTXXXXXXZ_20XX-XX-XXTXXXXXXZ.grib
typeOfLevel level shortName
...
heightAboveGround 50 t
heightAboveGround 100 t
heightAboveGround 150 t
heightAboveGround 250 t
...
heightAboveGround 2 t
heightAboveGround 0 t
...
hybrid 65 t
...
heightAboveSea 2 t
...
heightAboveGround 760 t
heightAboveGround 772 t
heightAboveGround 802 t
heightAboveGround 950 t
...
The output from this command tells us that temperature exists in various levels. It also exists in a few different
typeOfLevel
’s, like height above ground, height above sea and “hybrid”. In this example we want to work with
temperature forecasted in heights above ground.
Choosing a typeOfLevel
to work with is necessary when later importing the data into xarray
. Our tooling simply does
not know how to combine several different typeOfLevel
’s into one dataset. We could import several parameters into one
dataset, as long as they share typeOfLevel
.
In order to filter out the temperature parameter from the GRIB file, we use the command grib_filter
, full documentation here.
grib_filter
needs a filtering script that tells it to choose GRIB data with the shortName
as t
and typeOfLevel
as heightAboveGround
:
# filter_temp.txt
if (shortName is "t" and typeOfLevel is "heightAboveGround") {
write "./temp.grib";
}
The write "./temp.grib"
determines the output file created by grib_filter
.
grib_filter
will also be able to aggregate GRIB files if they are all given as input parameters. We can now create a GRIB
file containing only temperatures defined in heights above ground with all the time steps aggregated into one file:
grib_filter filter_temp.txt HARMONIE_NEA_SF_*
Running grib_ls -p shortName,level,stepRange temp.grib
shows us that our output file temp.grib
contains only the
temperature parameter in various level
’s and stepRange
’s. The step range defines the step of the forecast since
the beginning of the forecast. Because the HARMONIE model works with hours, a stepRange
of 10 would be 10 hours
after the beginning of the forecast.
This shows us grib_filter
has filtered everything except temperature above ground and has aggregated all our time steps!
xarray
Thanks to installing the cfgrib
package, it is almost trivial to import GRIB data from HARMONIE into xarray
:
import xarray as xr
# For very large files, maybe adjust chunk size accordingly
ds = xr.open_dataset('temp.grib', engine="cfgrib")
This will index the GRIB file, and these indexes will show up as files in your folder with the name: temp.grib.XXX.idx
.
Do NOT delete this file.
You can now work with the data as a regular xarray
dataset.
It might be worth noting that the dataset does not contain the data in a regular lat/lon grid, but rather the lat/lon coordinates are defined in a 2-dimensional array as shown here:
>>> ds.coords
Coordinates:
time datetime64[ns] ...
* step (step) timedelta64[ns] ...
heightAboveGround float64 ...
latitude (y, x) float64 44.38 44.39 44.4 ... 72.64 72.64 72.64
longitude (y, x) float64 -6.256 -6.228 -6.2 ... 36.87 36.94 37.01
valid_time (step) datetime64[ns] ...
This is because the HARMONIE model uses a rotated lat/lon grid. The un-rotated coordinates are in a regular grid around the equator and prime meridian. For a more official explanation, please see DMI’s documentation.
cfgrib
has rotated all the coordinates for us, so latitude
and longitude
contains the real world coordinates. The
coordinate at ds.coords['longitude'][4][5]
contains the longitude for
ds.data_vars[some_var][some_step][some_height][4][5]
, same goes for latitude.
One might assume we need to use the GRIB command line tool grib_to_netcdf
installed through eccodes
. This
does not work for rotated grids, which HARMONIE sadly is.
Luckily, xarray
has a built-in method for exporting to NetCDF which works:
ds.to_netcdf('temp.nc')