Reference

class cds_downloader.Downloader(cds_product, cds_filter, **kwargs)

The Downloader class provides common functionality for automated climate data download from the Copernicus Climate Data Store (cds.climate.copernicus.eu).

To use the downloader, create a file with your user credentials as described in the CDS api-how-to. Alternatively, define the two environment variables ‘CDSAPI_URL’ and ‘CDSAPI_KEY’ with your CDS user credentials.
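
As a minimal sketch, the two environment variables can be set from Python before a Downloader is created. The URL and key below are placeholders chosen for illustration, not real credentials; substitute the values from your own CDS account.

```python
import os

# Placeholder values (assumptions): replace with the URL and key of your CDS account.
os.environ["CDSAPI_URL"] = "https://cds.climate.copernicus.eu/api"
os.environ["CDSAPI_KEY"] = "<your-api-key>"

# Check that both variables are defined and non-empty before any request is made.
missing = [name for name in ("CDSAPI_URL", "CDSAPI_KEY") if not os.environ.get(name)]
print("missing credentials:", missing)
```

Setting the variables in the shell or in a service configuration works the same way; the only requirement stated above is that both names are defined.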

classmethod from_cds(cds_product, cds_filter, **kwargs)

Create a Downloader from a CDS example request.

Parameters:
  • cds_product (string) – the cds product string
  • cds_filter (dict) – the cds filter dictionary
classmethod from_dict(dct_config)

Create Downloader from dictionary

Parameters:
  • dct_config (dict) – a dictionary with keys ‘cds_product’ and ‘cds_filter’
classmethod from_json(json_config_path)

Create Downloader from json file

Parameters:
  • json_config_path (string) – path to json config file
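
The following sketch builds a configuration with the two keys named for from_dict and writes it to a JSON file for from_json. It assumes that the JSON file uses the same two keys as the dictionary, which the reference does not state explicitly; the file name is hypothetical. The import is guarded so that the configuration part runs even when the package is not installed.

```python
import json
import os
import tempfile

# Configuration with the two keys expected by from_dict.
config = {
    "cds_product": "reanalysis-era5-single-levels",
    "cds_filter": {
        "product_type": "reanalysis",
        "format": "grib",
        "variable": ["total_precipitation"],
        "year": ["2020"],
        "month": ["09"],
        "day": ["01", "02", "03"],
        "area": [50.7, 3.6, 42.9, 17.2],
    },
}

# Write the same configuration to a JSON file (hypothetical file name).
config_path = os.path.join(tempfile.mkdtemp(), "era5_precipitation.json")
with open(config_path, "w") as fh:
    json.dump(config, fh, indent=2)

try:
    from cds_downloader import Downloader
except ImportError:
    Downloader = None  # package not installed; the config above is still usable

if Downloader is not None:
    from_dict = Downloader.from_dict(config)
    # Assumption: the JSON layout mirrors the dictionary layout.
    from_json = Downloader.from_json(config_path)
```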
get_data(storage_path, split_keys=None, overwrite=False)

This method downloads the requested data from the Climate Data Store.

Parameters:
  • storage_path (string) – target storage path as string
  • split_keys (list-like, optional) –

    The maximum size of a single data request is set by the Copernicus Climate Data Store and is read automatically from its metadata web API. If split_keys=None, the method automatically chunks the CDS request into multiple smaller requests and spawns one process for each of them. To do so, it extracts all list-like objects from the cds_filter (e.g. “year”, “month”, …) and splits the data into single requests/files.

    By setting split_keys to a list of keys from the cds_filter, one can control the splitting manually (e.g. split_keys=[“year”, “month”, “day”]).

  • overwrite (boolean) – Default is False. Set to True to overwrite existing files. This triggers new requests to the Climate Data Store.
Returns:

processes – List of download process objects

Return type:

list of multiprocessing.Process

Examples

Download small data collection with manual split_keys

>>> from cds_downloader import Downloader
>>> x = Downloader.from_cds(
...         "reanalysis-era5-single-levels",
...         {
...             "product_type": "reanalysis",
...             "format": "grib",
...             "variable": ["total_precipitation"],
...             "year": ["2020"],
...             "month": ["09"],
...             "day": ["01", "02", "03"],
...             "area": [50.7, 3.6, 42.9, 17.2]
...         },
...     )
...
>>> x.get_data("/tmp", ["year","month","day"])
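
The effect of split_keys can be pictured with a small sketch that expands a filter into one sub-request per combination of the selected keys. The helper split_request is hypothetical and is not part of cds_downloader; the library's actual splitting logic may differ in detail.

```python
from itertools import product


def split_request(cds_filter, split_keys):
    """Hypothetical helper: one sub-request per combination of split_keys values."""
    # Keys that are not split are copied unchanged into every sub-request.
    fixed = {k: v for k, v in cds_filter.items() if k not in split_keys}
    value_lists = [cds_filter[k] for k in split_keys]
    requests = []
    for combo in product(*value_lists):
        sub = dict(fixed)
        # Each split key keeps a single value, wrapped in a list as in the filter.
        sub.update({k: [v] for k, v in zip(split_keys, combo)})
        requests.append(sub)
    return requests


cds_filter = {
    "product_type": "reanalysis",
    "format": "grib",
    "variable": ["total_precipitation"],
    "year": ["2020"],
    "month": ["09"],
    "day": ["01", "02", "03"],
    "area": [50.7, 3.6, 42.9, 17.2],
}

chunks = split_request(cds_filter, ["year", "month", "day"])
print(len(chunks))  # 3: one year x one month x three days
```

With split_keys=["year", "month"] the same filter would yield a single sub-request that still contains all three days.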
get_data_for_date(storage_path, eval_date=datetime.datetime(2022, 1, 24, 11, 16, 30, 118379), **kwargs)

This method uses temporal information from the webapi and downloads data for a specified date.

Parameters:
  • storage_path (string) – storage path of data collection as string
  • eval_date (datetime.datetime or str) – date of the data fields
get_latest_daily_data(storage_path, date_latency=None, **kwargs)

This method uses temporal information from the web API and downloads only the latest day of the data. A latency in days relative to the current datetime can be specified.

Parameters:
  • storage_path (string) – storage path of data collection as string
  • date_latency (datetime.timedelta or str or int, optional) – Latency with respect to the current UTC date and time. If an integer is passed, the latency is interpreted as days.
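
How a latency maps to a target day can be sketched as follows. The helper latest_day is hypothetical and only illustrates the integer-as-days rule described above; string latencies are left out because the reference does not specify their format.

```python
from datetime import datetime, timedelta, timezone


def latest_day(date_latency=None, now=None):
    """Hypothetical helper: the calendar day lying date_latency before now (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    if date_latency is None:
        date_latency = timedelta(0)
    elif isinstance(date_latency, int):
        # Integers are interpreted as a number of days.
        date_latency = timedelta(days=date_latency)
    elif not isinstance(date_latency, timedelta):
        raise TypeError("this sketch handles only int, timedelta, or None")
    return (now - date_latency).date()


reference = datetime(2022, 1, 24, 11, 16, 30, tzinfo=timezone.utc)
print(latest_day(5, now=reference))                    # 2022-01-19
print(latest_day(timedelta(hours=36), now=reference))  # 2022-01-22
```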
update_data(storage_path, split_keys, date_until=datetime.datetime(2022, 1, 24, 11, 16, 30, 118384), date_latency=None, start_from_files=False)

This method provides update functionality for climate data collections retrieved with cds_downloader.Downloader.get_data().

It uses temporal information from the CDS metadata web API to determine which data are missing. The latest existing file is downloaded again so that no data are left out.

Under development: only temporal split_keys are currently allowed, i.e. split_keys must be chosen from [“year”, “month”, “day”, “time”].

Parameters:
  • storage_path (string) – storage path of data collection as string
  • split_keys (list of strings) – list of keys in cds_filter
  • date_until (datetime.datetime, optional) – update data collection until this date
  • date_latency (datetime.timedelta or str, optional) – temporal latency in relation to date_until
  • start_from_files (boolean, optional) – use first file of sorted file list as start reference date
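
The idea behind the update can be sketched as a gap search over calendar days. The helper missing_days is hypothetical; it assumes one file per day and that the day covered by each existing file is already known, and it is not the library's implementation. Dropping the latest existing day from the input mimics the re-download of the latest file described above.

```python
from datetime import date, timedelta


def missing_days(existing_days, start, date_until, date_latency=timedelta(0)):
    """Hypothetical helper: days in [start, date_until - date_latency] without a file."""
    end = date_until - date_latency
    have = set(existing_days)
    days = []
    current = start
    while current <= end:
        if current not in have:
            days.append(current)
        current += timedelta(days=1)
    return days


# Days already present in the storage path (assumed to be derived from the file list).
existing = [date(2022, 1, 18), date(2022, 1, 19), date(2022, 1, 20)]

# Use the first file as start reference and treat the latest file as missing.
todo = missing_days(
    existing[:-1],
    start=existing[0],
    date_until=date(2022, 1, 24),
    date_latency=timedelta(days=2),
)
print(todo)
```

Here the days from 2022-01-20 to 2022-01-22 are reported: the latest existing day is requested again, and the two days of latency before date_until are excluded.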