Using ERA5 data to plot temperature in West Africa

A quick and dirty tech blog today, getting to know some of Matplotlib's extra features for generating attractive plots!

Yearly ERA5 temperature data from 1979 to present was obtained from the Copernicus Climate Data Store, masked by country using shapefiles from Natural Earth, and then area-averaged for each country (see previous blogs on area averaging for information on how to do this).
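A rough sketch of that workflow, assuming the masking is done with regionmask and geopandas (the file paths, country list and the t2m variable name below are placeholders, not the exact code used):

```python
import geopandas as gpd
import numpy as np
import regionmask
import xarray as xr

ds = xr.open_dataset("era5_t2m_yearly.nc")                  # yearly-mean 2m temperature (placeholder path)
countries = gpd.read_file("ne_110m_admin_0_countries.shp")  # Natural Earth country polygons
west_africa = countries[countries["NAME"].isin(["Ghana", "Nigeria", "Senegal"])]

# Build a gridded mask from the country polygons
regions = regionmask.from_geopandas(west_africa, names="NAME")
mask = regions.mask(ds["longitude"], ds["latitude"])

# Area average per country, weighting grid cells by the cosine of latitude
weights = np.cos(np.deg2rad(ds["latitude"]))
averages = {}
for number, name in zip(regions.numbers, regions.names):
    averages[name] = ds["t2m"].where(mask == number).weighted(weights).mean(
        dim=("latitude", "longitude")
    )
```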

The results were plotted to see whether there were any notable trends. As expected, all countries appear to be increasing in temperature over time. When the series are smoothed using a gaussian_filter, the rise in temperature shows a clear trend (highlighted with the dotted line).

The plotting theme is achieved by running the Matplotlib code inside plt.style.context('Solarize_Light2').

Each line is then given a hex value to precisely define the colour of the plot line, and the same value is used for the matching mpatches legend entry.

First, load the Python packages and the CSV file:
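Something along these lines, assuming the area averages were saved to a CSV with one row per year and one column per country (the filename and column layout are placeholders):

```python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import pandas as pd
from scipy.ndimage import gaussian_filter

# Placeholder filename: one row per year, one column per country
df = pd.read_csv("west_africa_t2m_yearly.csv", index_col="year")
```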

Matplotlib plotting code:
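A sketch of the plotting code, with placeholder country names and hex values; the important parts are the style context, the mpatches legend entries matched to each line colour, and the gaussian_filter smoothing:

```python
# Placeholder country names and hex values
colours = {"Ghana": "#268bd2", "Nigeria": "#dc322f", "Senegal": "#859900"}

with plt.style.context("Solarize_Light2"):
    fig, ax = plt.subplots(figsize=(10, 6))
    handles = []
    for country, hex_colour in colours.items():
        series = df[country]
        # Raw yearly values as a faint solid line
        ax.plot(series.index, series.values, color=hex_colour, alpha=0.4)
        # Smoothed values to highlight the underlying trend (dotted line)
        smoothed = gaussian_filter(series.values.astype(float), sigma=3)
        ax.plot(series.index, smoothed, color=hex_colour, linestyle=":")
        # Legend patch matched to the same hex value as the plot lines
        handles.append(mpatches.Patch(color=hex_colour, label=country))
    ax.set_xlabel("Year")
    ax.set_ylabel("Mean 2m temperature (°C)")
    ax.legend(handles=handles)
    plt.show()
```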

 

METADATA GENERATION FOR CSV FILES

For any data produced with the intention of being downloaded and used by other users, it is important to include information about the dataset. For example, details of the data's origins should be provided, such as who produced the dataset, who can be contacted about it, and when and where it was produced. In addition, properties of the dataset itself, such as variable names and units of measurement, all help the end user to understand the data.

For our current project C3S Energy (part of the C3S operational services), we are producing CSV files as various outputs from computations on climate and energy data. The raw data originates from a variety of sources, covers a range of units and time scales, and varies in resolution between datasets. It is therefore critical to produce accurate metadata to represent these differences in our CSV files. There is no 'standard' format for including metadata in CSV files, so we followed the common approach (in climate science) of placing the metadata in the first column of the CSV file, one row per line of information. This makes it very easy for non-technical users to open and understand the CSV data in Excel, while users taking a more programmatic approach can simply skip these lines when opening the file with languages like Python and R.
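As a rough illustration of the idea (with placeholder data and metadata lines), writing the comment rows first and the data afterwards might look like this:

```python
import pandas as pd

# Placeholder data and metadata lines
df = pd.DataFrame({"time": ["2000-01-01", "2000-01-02"], "value": [1.2, 3.4]})
metadata = ["# Title: example dataset", "# Units: unknown"]

with open("output.csv", "w") as f:
    for line in metadata:          # metadata first, one line per row
        f.write(line + "\n")
    df.to_csv(f, index=False)      # then the data itself

# Programmatic users can simply skip the metadata when reading it back
df_back = pd.read_csv("output.csv", comment="#")
```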

This blog serves as a 'follow-on' from Tech Blog #5 and outlines how to construct a Python function that references the same JSON lookup table.

First, define the function and split the filename by its delimiters into a list of items:
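A sketch of this first step. The function name generate_metadata, like the rest of the snippets below, is a placeholder rather than the exact C3S Energy code:

```python
import json
import os

def generate_metadata(fname):
    # Strip any directory and the extension, then split on the underscore delimiter
    base, extension = os.path.splitext(os.path.basename(fname))
    items = base.split("_")
```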


 

We are only dealing with CSV and NetCDF files in C3S Energy, so detect which one the file is and save it as a variable:
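Continuing the function, something like:

```python
    # Only CSV and NetCDF outputs are produced, so infer the type from the extension
    if extension == ".csv":
        file_type = "CSV"
    elif extension == ".nc":
        file_type = "NetCDF"
    else:
        raise ValueError(f"Unexpected file extension: {extension}")
```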


 

Loop through the filename items list and the JSON lookup table in parallel:
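Starting by loading the lookup table (filename_elements.json is a placeholder path) and preparing a list to collect matches; the loop itself follows in the next step:

```python
    # Load the JSON lookup table; the long names of matching items will be
    # collected into this list as we loop through the filename elements
    with open("filename_elements.json") as f:
        lookup = json.load(f)
    long_names = []
```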


 

If items in the list match keys in the JSON, add their long names for use in the metadata:
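For example:

```python
    # Keep the long name of every filename item that appears in the lookup table
    for item in items:
        if item in lookup:
            long_names.append(lookup[item]["long_name"])

    title = " ".join(long_names)
```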


 

Check the title and add the appropriate unit of measurement as a variable:
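With placeholder variable names and units:

```python
    # Pick the unit of measurement from the variable named in the title
    # (the variables and units here are examples only)
    if "Irradiance" in title:
        units = "W m-2"
    elif "Temperature" in title:
        units = "degrees Celsius"
    elif "Wind" in title:
        units = "m s-1"
    else:
        units = "unknown"
```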


 

Create a variable with metadata included:
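For instance, as a list of comment strings ready to be written to the top of the CSV (the producer and contact fields are placeholders):

```python
    # Gather everything into the metadata lines that will head the CSV file
    metadata = [
        f"# File type: {file_type}",
        f"# Title: {title}",
        f"# Units: {units}",
        "# Produced by: <data producer>",
        "# Contact: <contact address>",
    ]
    return metadata
```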


 

Here is the function in full:
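Pulling the sketches above together, a self-contained version might look like this (again, names, paths and the exact metadata fields are illustrative):

```python
import json
import os


def generate_metadata(fname, lookup_path="filename_elements.json"):
    """Build a list of metadata comment lines for a given output filename."""
    # Split the filename by its delimiters into a list of items
    base, extension = os.path.splitext(os.path.basename(fname))
    items = base.split("_")

    # Detect whether this is a CSV or NetCDF file
    if extension == ".csv":
        file_type = "CSV"
    elif extension == ".nc":
        file_type = "NetCDF"
    else:
        raise ValueError(f"Unexpected file extension: {extension}")

    # Loop through the filename items and keep the long name of every match
    with open(lookup_path) as f:
        lookup = json.load(f)
    long_names = [lookup[item]["long_name"] for item in items if item in lookup]
    title = " ".join(long_names)

    # Choose the unit of measurement from the variable named in the title
    if "Irradiance" in title:
        units = "W m-2"
    elif "Temperature" in title:
        units = "degrees Celsius"
    elif "Wind" in title:
        units = "m s-1"
    else:
        units = "unknown"

    # Assemble the metadata lines that will head the CSV file
    return [
        f"# File type: {file_type}",
        f"# Title: {title}",
        f"# Units: {units}",
        "# Produced by: <data producer>",
        "# Contact: <contact address>",
    ]
```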


 

This code can also be found on my GitHub page.

FILENAME VERIFICATION WITH PYTHON AND JSON

When building and maintaining a robust operational system that relies on many datasets, file naming becomes an important factor in identifying and describing the data contained within.

In the case of climate science, this could involve descriptors such as data origin, bias adjustment, variable type, start/end dates, accumulated or instantaneous measurements, grid resolution, etc.

This example uses a JSON (JavaScript Object Notation) file as a lookup table. JSON files are extremely lightweight and can be imported easily by most modern programming languages. In Python, they are loaded as a dictionary data type, which behaves like a collection of key-value pairs. In this example, it will technically be a 'nested' dictionary, as it has multiple levels.

The JSON lookup table shown below contains all the allowed elements for the project's filename structure, including the associated 'long names' for each item (more on this later). It also contains the position of the filename element and its character length.


 

JSON Lookup Table
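An illustrative, truncated sketch of its structure once loaded into Python (the keys, long names, positions and lengths below are examples rather than the project's actual table):

```python
lookup = {
    "H":    {"long_name": "historical",                   "position": 1, "length": 1},
    "ERA5": {"long_name": "ERA5 reanalysis",              "position": 2, "length": 4},
    "ECMW": {"long_name": "ECMWF",                        "position": 3, "length": 4},
    "GHI":  {"long_name": "Global Horizontal Irradiance", "position": 5, "length": 3},
    "Euro": {"long_name": "Europe",                       "position": 7, "length": 4},
}
```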

As you can see, each dictionary key has a nested dictionary within it. The key corresponds to the filename element. Here is a test example filename relevant to this project:

H_ERA5_ECMW_T639_GHI_0000m_Euro_025d_S200001010000_E200001012300_ACC_MAP_01h_NA-_noc_org_NA_NA—_NA—_NA—.nc

Referencing the JSON table, it is possible to identify by eye that the file contains historical Global Horizontal Irradiance ERA5 data originating from ECMWF, amongst other things. This is all well and good, but what if you need to check hundreds of similar files automatically? This is where Python can be used to create some functions that call the JSON table and check the integrity of the filename string.

 

Import packages:
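At minimum, the standard library's JSON module:

```python
import json
```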


 

Set some font colours for printing to terminal:
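A common approach is a small class of ANSI escape codes (the class and attribute names here are my own convention):

```python
# ANSI escape sequences for coloured terminal output
class bcolors:
    OKGREEN = "\033[92m"   # green for passes
    FAIL = "\033[91m"      # red for failures
    ENDC = "\033[0m"       # reset back to the default colour
```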


 

Load JSON file as a dictionary:
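Assuming the lookup table is saved as filename_elements.json next to the script (a placeholder path):

```python
with open("filename_elements.json") as f:
    lookup = json.load(f)
```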

 

A simple function to print the file name structure in the correct order, for reference purposes:
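A sketch of such a function, using the position and length fields from the lookup table:

```python
def print_filename_structure():
    """Print the filename elements ordered by their position in the name."""
    for key, info in sorted(lookup.items(), key=lambda kv: kv[1]["position"]):
        print(f"{info['position']:>2}  {key:<6}  ({info['length']} characters)")
```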


 

Calling this function prints the filename structure, in order, to the terminal.

 

Another simple function to print all the possible filename elements, for reference:
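Again as a sketch:

```python
def print_filename_elements():
    """Print every allowed filename element alongside its long name."""
    for key, info in lookup.items():
        print(f"{key:<6} : {info['long_name']}")
```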


 

Calling this function prints every allowed filename element to the terminal.


 

 

Finally, this function will check the filename string against the JSON table and report whether the correct number of elements is present in the string. It takes one argument, your filename as a string (fname).
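A minimal version of this check, comparing the number of underscore-separated items in the name against the highest position defined in the lookup table:

```python
def check_filename(fname):
    """Check that fname contains the expected number of underscore-separated elements."""
    expected = max(info["position"] for info in lookup.values())
    items = fname.rsplit(".", 1)[0].split("_")
    if len(items) == expected:
        print(f"{bcolors.OKGREEN}{fname} : correct number of elements ({expected}){bcolors.ENDC}")
    else:
        print(f"{bcolors.FAIL}{fname} : found {len(items)} elements, expected {expected}{bcolors.ENDC}")
```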


 

Taking the example filename as a string and passing it to the function prints whether the expected number of elements was found.


 

The motive behind creating these functions is that they can then be called from another Python script running on a system, be that a local machine, virtual machine, server, HPC, etc.

This can be achieved simply by adding the following to the top of your script (the functions are saved in a Python file called 'filename_utilities.py'):
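Assuming the function names used in the sketches above:

```python
from filename_utilities import (check_filename, print_filename_elements,
                                print_filename_structure)
```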


 

So what's the use in having all the long names in the JSON? This comes in handy when we need to produce data in a human-readable format, such as writing metadata and comments. The next blog will focus on parsing the JSON file to automatically generate metadata as comments in an outputted CSV file.

CALCULATING NUTS2 REGIONAL AVERAGES WITH LAND SEA MASK

This post serves as a continuation of the techniques described in my last blog post, so please familiarise yourself with those steps beforehand.

NUTS2

As before, load in the NUTS shapefiles, this time selecting NUTS2.

NUTS2 is higher resolution and, as a result, there are many more shapes: NUTS2 contains polygons at a regional (sub-country) level, and in total there are 332 shapes for the Eurostat EU region.
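Loading and filtering them with geopandas might look something like this (the shapefile name and the LEVL_CODE column follow the Eurostat NUTS distribution, but check the details of your own download):

```python
import geopandas as gpd

nuts = gpd.read_file("NUTS_RG_01M_2016_4326.shp")
nuts2 = nuts[nuts["LEVL_CODE"] == 2]   # keep only the regional (NUTS2) polygons
print(len(nuts2))                      # expect a few hundred regional shapes
```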


ERA5 10M/100M ANALYSIS

I wanted to see if I could improve on the standard power-law coefficient (1.389495) for calculating 100m wind speed from 10m data. The currently available ten years of ERA5 U and V NetCDF wind components from CDS were concatenated, and wind speed was then calculated using CDO, a handy collection of command-line operators for manipulating and analysing climate and numerical weather prediction data:
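An equivalent calculation sketched in Python with xarray rather than CDO (assuming the ERA5 variable names u10, v10, u100 and v100, and a placeholder file pattern) would be roughly:

```python
import numpy as np
import xarray as xr

ds = xr.open_mfdataset("era5_wind_*.nc", combine="by_coords")

# Wind speed from the U and V components at both heights
ws10 = np.sqrt(ds["u10"] ** 2 + ds["v10"] ** 2)
ws100 = np.sqrt(ds["u100"] ** 2 + ds["v100"] ** 2)

# Ratio between the two heights, to compare against the standard
# power-law factor (100 / 10) ** (1 / 7), roughly 1.389495
ratio = (ws100 / ws10).mean(dim="time")
```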


Examining NetCDF Files in R

I am currently working for a non-profit organisation, helping to enhance the interaction between the energy industry and the weather, climate and broader environmental sciences community.

Not coming from a climate science background, I had to become quickly accustomed to the terminology and technologies associated with this field of research. A commonly used file type for representing weather and climate data is the Network Common Data Form file (aka NetCDF).
