Dataset Loader Module
- Classes:
- DatasetLoader - Generate OCW Dataset objects from a variety of sources.
class dataset_loader.DatasetLoader(*loader_opts)
Generate a list of OCW Dataset objects from a variety of sources.
Each argument is a dictionary holding the information for one dataset. For example:

    >>> loader_opt1 = {'loader_name': 'rcmed', 'name': 'cru',
    ...                'dataset_id': 10, 'parameter_id': 34}
    >>> loader_opt2 = {'path': './data/TRMM_v7_3B43_1980-2010.nc',
    ...                'variable': 'pcp'}
    >>> loader = DatasetLoader(loader_opt1, loader_opt2)
Or, more conveniently, if the loader configuration is defined in a YAML file named config_file (see RCMES examples):

    >>> import yaml
    >>> with open(config_file) as f:
    ...     config = yaml.safe_load(f)
    >>> obs_loader_config = config['datasets']['reference']
    >>> loader = DatasetLoader(*obs_loader_config)
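To make the unpacking step concrete, the sketch below shows one plausible shape for the parsed configuration. The nested layout is an assumption inferred from the `config['datasets']['reference']` lookup above, and `show_args` is a hypothetical stand-in for the constructor so the snippet runs without OCW installed.

```python
# Hypothetical result of parsing a config file with a datasets/reference
# section. The keys mirror the dictionaries in the first example; the exact
# file layout is an assumption, not taken from the OCW documentation.
config = {
    'datasets': {
        'reference': [
            {'loader_name': 'rcmed', 'name': 'cru',
             'dataset_id': 10, 'parameter_id': 34},
            {'path': './data/TRMM_v7_3B43_1980-2010.nc',
             'variable': 'pcp'},
        ]
    }
}

obs_loader_config = config['datasets']['reference']


def show_args(*loader_opts):
    """Stand-in for DatasetLoader(*loader_opts): report each loader name."""
    # A dictionary without 'loader_name' falls back to 'local'.
    return [opts.get('loader_name', 'local') for opts in loader_opts]


# The * operator passes each dictionary as its own positional argument.
loader_names = show_args(*obs_loader_config)
```

Because the reference section is a list, `DatasetLoader(*obs_loader_config)` receives one dictionary per dataset, exactly as if they had been written out by hand.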
As shown in the first example, the dictionary for each argument should contain a loader name and parameters specific to the particular loader. Once the configuration is entered, the datasets may be loaded using:

    >>> loader.load_datasets()
    >>> obs_datasets = loader.datasets
Additionally, each dataset should have a loader_name keyword; if it is omitted, 'local' is assumed. It may be one of the following:
- 'local' - One or multiple dataset files in a local directory
- 'local_split' - A single dataset split across multiple files in a local directory
- 'esgf' - Download the dataset from the Earth System Grid Federation
- 'rcmed' - Download the dataset from the Regional Climate Model Evaluation System Database
- 'dap' - Download the dataset from an OPeNDAP URL
- 'podaac' - Download the dataset from the Physical Oceanography Distributed Active Archive Center
Users who wish to load datasets from loaders not described above may define their own custom dataset loader function and incorporate it as follows:

    >>> loader.add_source_loader('my_loader_name', my_loader_func)
Parameters: loader_opts (dict) – Dictionaries containing each dataset's loader configuration. Each dictionary holds the keyword arguments of a loader function, plus an additional key, 'loader_name', that selects that function. If the user does not specify 'loader_name', it defaults to 'local'.
Raises: KeyError – If an invalid argument is passed to a data source loader function.
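The relationship between 'loader_name' and the remaining keys can be sketched as follows. `split_loader_opts` is a hypothetical helper written for illustration only; it is not part of the OCW API and does not claim to reproduce how DatasetLoader handles its options internally.

```python
def split_loader_opts(loader_opts):
    """Return (loader_name, kwargs) for one dataset dictionary.

    Hypothetical helper for illustration; not part of the OCW API.
    """
    # Copy first so the caller's dictionary is left untouched.
    kwargs = dict(loader_opts)
    # 'loader_name' selects the loader; the documented default is 'local'.
    loader_name = kwargs.pop('loader_name', 'local')
    # Whatever remains is what the loader function would receive as kwargs.
    return loader_name, kwargs


rcmed_name, rcmed_kwargs = split_loader_opts(
    {'loader_name': 'rcmed', 'name': 'cru',
     'dataset_id': 10, 'parameter_id': 34})

local_name, local_kwargs = split_loader_opts(
    {'path': './data/TRMM_v7_3B43_1980-2010.nc', 'variable': 'pcp'})
```

The second call shows the default in action: no 'loader_name' key is present, so the dictionary is treated as a 'local' dataset and every key is forwarded as a keyword argument.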
add_loader_opts(*loader_opts)
A convenient means of adding loader options for each dataset to the loader.
Parameters: loader_opts (dict) – Dictionaries containing each dataset's loader configuration. Each dictionary holds the keyword arguments of a loader function, plus an additional key, 'loader_name', that selects that function. If the user does not specify 'loader_name', it defaults to 'local'.
add_source_loader(loader_name, loader_func)
Add a custom source loader.
Parameters:
- loader_name (string) – The name of the data source.
- loader_func (callable) – Reference to a custom-defined function. This should return an OCW Dataset object whose origin satisfies origin['source'] == loader_name.
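The contract on loader_func can be illustrated with a minimal sketch. Everything here is hypothetical: `my_loader_func`, `satisfies_origin_contract`, and the path './data/example.nc' are invented for the example, and a `SimpleNamespace` stands in for an OCW Dataset so the snippet runs without OCW installed. A real loader would read the file and return an actual Dataset object.

```python
from types import SimpleNamespace

LOADER_NAME = 'my_loader_name'


def my_loader_func(path, variable):
    """Hypothetical custom loader.

    A real implementation would read `path` and build an OCW Dataset;
    SimpleNamespace is only a stand-in carrying the same `origin` attribute.
    """
    return SimpleNamespace(
        name=variable,
        # The documented requirement: origin['source'] matches the loader name.
        origin={'source': LOADER_NAME, 'path': path},
    )


def satisfies_origin_contract(loader_name, dataset):
    """Check the documented requirement origin['source'] == loader_name."""
    return dataset.origin.get('source') == loader_name


dataset = my_loader_func('./data/example.nc', 'pcp')
ok = satisfies_origin_contract(LOADER_NAME, dataset)
```

With OCW available, a function like this would be registered through `loader.add_source_loader('my_loader_name', my_loader_func)`, after which dataset dictionaries could refer to it via their 'loader_name' key.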
load_datasets()
Loads the datasets from the given loader configurations.
set_loader_opts(*loader_opts)
Reset the dataset loader config.
Parameters: loader_opts (dict) – Dictionaries containing each dataset's loader configuration. Each dictionary holds the keyword arguments of a loader function, plus an additional key, 'loader_name', that selects that function. If the user does not specify 'loader_name', it defaults to 'local'.