Dataset Processor Module
-
dataset_processor.deseasonalize_dataset(dataset)
Calculate the daily climatology and subtract it from the input dataset.
Parameters: dataset (dataset.Dataset) – The dataset to deseasonalize.
Returns: A Dataset with the daily climatology subtracted from its values.
Return type: dataset.Dataset
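The idea of subtracting a daily climatology can be sketched in plain Python. This is only an illustration on a flat time series, not the module's implementation: the helper name `deseasonalize` and the use of plain lists in place of a Dataset are assumptions.

```python
from collections import defaultdict
from datetime import date


def deseasonalize(times, values):
    """Subtract the multi-year mean of each calendar day from the values.

    Hypothetical helper: `times` is a list of dates, `values` a list of floats.
    """
    # Group the values by calendar day (month, day) across all years.
    groups = defaultdict(list)
    for t, v in zip(times, values):
        groups[(t.month, t.day)].append(v)

    # The daily climatology is the mean of each group.
    climatology = {key: sum(vs) / len(vs) for key, vs in groups.items()}

    # The anomaly is the value minus the climatology for its calendar day.
    return [v - climatology[(t.month, t.day)] for t, v in zip(times, values)]


times = [date(2000, 1, 1), date(2000, 1, 2), date(2001, 1, 1), date(2001, 1, 2)]
values = [10.0, 20.0, 14.0, 26.0]
anomalies = deseasonalize(times, values)
```

Each calendar day is averaged over all available years, so the anomalies for any one calendar day sum to zero.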
-
dataset_processor.ensemble(datasets)
Generate a single dataset which is the mean of the input datasets.
An ensemble dataset combines the input datasets, assuming they all have similar shape, dimensions, and units.
Parameters: datasets – Datasets from which to compose the ensemble dataset. All Datasets must be the same shape.
Return type: dataset.Dataset
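As a rough illustration of an ensemble mean, the sketch below averages several equally shaped value arrays element by element. The function name `ensemble_mean` and the flat lists standing in for Dataset values are assumptions made for the example.

```python
def ensemble_mean(arrays):
    """Return the element-wise mean of equally sized lists of floats.

    Hypothetical helper: each list stands in for one dataset's values.
    """
    expected = len(arrays[0])
    # Mirror the documented requirement that all inputs share one shape.
    if any(len(a) != expected for a in arrays):
        raise ValueError("all inputs must have the same shape")
    # zip(*arrays) walks the inputs position by position.
    return [sum(column) / len(arrays) for column in zip(*arrays)]


mean_values = ensemble_mean([[1.0, 2.0], [3.0, 4.0]])
```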
-
dataset_processor.mask_missing_data(dataset_array)
Check for missing values in observation and model datasets. If any dataset in dataset_array has a missing value at a grid point, the values at that grid point are masked in all other datasets.
Parameters: dataset_array – An array of OCW datasets.
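The masking rule described here amounts to taking the union of the missing-value masks. A minimal sketch, assuming missing values are represented by `None` in flat lists (the real datasets presumably use masked arrays; the name `mask_common_missing` is hypothetical):

```python
def mask_common_missing(datasets):
    """Mask a grid point in every dataset if it is missing in any of them.

    Hypothetical helper: each dataset is a flat list where None means missing.
    """
    n_points = len(datasets[0])
    # A point is missing overall if at least one dataset lacks a value there.
    missing = [any(d[i] is None for d in datasets) for i in range(n_points)]
    return [
        [None if missing[i] else d[i] for i in range(n_points)]
        for d in datasets
    ]


obs = [1.0, None, 3.0]
model = [4.0, 5.0, None]
masked_obs, masked_model = mask_common_missing([obs, model])
```

After masking, every dataset has valid values at exactly the same grid points, which keeps later point-by-point comparisons consistent.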
-
dataset_processor.normalize_dataset_datetimes(dataset, timestep)
Normalize Dataset datetime values.
Force daily data to a time value of 00:00:00. Force monthly data to the first of the month at midnight.
Parameters:
- dataset (dataset.Dataset) – The Dataset which will have its time values normalized.
- timestep (string) – The timestep of the Dataset’s values. Either ‘daily’ or ‘monthly’.
Returns: A new Dataset with normalized datetime values.
Return type: dataset.Dataset
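The two normalization rules can be illustrated with the standard library's `datetime.replace`. This sketch works on a plain list of datetimes; the function name `normalize_datetimes` is an assumption and not part of the module.

```python
from datetime import datetime


def normalize_datetimes(times, timestep):
    """Return new datetimes normalized for a 'daily' or 'monthly' timestep."""
    if timestep == 'daily':
        # Daily data: keep the date, force the time to 00:00:00.
        return [t.replace(hour=0, minute=0, second=0, microsecond=0)
                for t in times]
    if timestep == 'monthly':
        # Monthly data: force the first of the month at midnight.
        return [t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
                for t in times]
    raise ValueError("timestep must be 'daily' or 'monthly'")


raw = [datetime(2001, 3, 15, 12, 30), datetime(2001, 4, 16, 6, 0)]
daily = normalize_datetimes(raw, 'daily')
monthly = normalize_datetimes(raw, 'monthly')
```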
-
dataset_processor.safe_subset(target_dataset, subregion, subregion_name=None)
Safely subset the given dataset with subregion information.
A standard subset requires that the provided subregion be entirely contained within the dataset's bounds. safe_subset returns the overlap of the subregion and the dataset without raising an error.
Parameters:
- subregion (dataset.Bounds) – The Bounds with which to subset the target Dataset.
- target_dataset (dataset.Dataset) – The Dataset object to subset.
- subregion_name (string) – The name of the subsetted Dataset.
Returns: The subsetted Dataset object.
Return type: dataset.Dataset
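The "overlap of the subregion and dataset" can be pictured as clipping one bounding box to another. The sketch below does this for latitude and longitude only, using plain tuples in place of Bounds objects; the name `overlap_bounds`, the tuple layout, and the `None` result for disjoint boxes are assumptions, and time bounds and longitude wrap-around are not handled.

```python
def overlap_bounds(subregion, dataset_bounds):
    """Clip one (lat_min, lat_max, lon_min, lon_max) box to another.

    Hypothetical helper; returns None when the boxes do not overlap.
    """
    lat_min = max(subregion[0], dataset_bounds[0])
    lat_max = min(subregion[1], dataset_bounds[1])
    lon_min = max(subregion[2], dataset_bounds[2])
    lon_max = min(subregion[3], dataset_bounds[3])
    if lat_min > lat_max or lon_min > lon_max:
        return None
    return (lat_min, lat_max, lon_min, lon_max)


# The subregion extends beyond the dataset on every side.
clipped = overlap_bounds((-10.0, 60.0, 0.0, 50.0), (0.0, 45.0, 10.0, 40.0))
```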
-
dataset_processor.spatial_regrid(target_dataset, new_latitudes, new_longitudes, boundary_check=True)
Regrid a Dataset using the new latitudes and longitudes.
Parameters:
- target_dataset (dataset.Dataset) – Dataset object that needs to be spatially regridded.
- new_latitudes (numpy.ndarray) – Array of latitudes.
- new_longitudes (numpy.ndarray) – Array of longitudes.
- boundary_check (bool) – Check whether the regridding domain’s boundaries are outside target_dataset’s domain.
Returns: A new spatially regridded Dataset.
Return type: dataset.Dataset
-
dataset_processor.subset(target_dataset, subregion, subregion_name=None, extract=True, user_mask_values=[1])
Subset given dataset(s) with subregion information.
Parameters:
- subregion (dataset.Bounds) – The Bounds with which to subset the target Dataset.
- target_dataset (dataset.Dataset) – The Dataset object to subset.
- subregion_name (string) – The name of the subsetted Dataset.
- extract (boolean) – If False, the data inside the subregion will be masked.
- user_mask_values (int) – Grid points where mask_variable == user_mask_values will be extracted or masked.
Returns: The subsetted Dataset object.
Return type: dataset.Dataset
Raises: ValueError
-
dataset_processor.temperature_unit_conversion(dataset)
Convert temperature units as necessary.
Automatically convert Celsius to Kelvin in the given dataset.
Parameters: dataset (dataset.Dataset) – The dataset for which units should be updated.
Returns: The dataset with (potentially) updated units.
Return type: dataset.Dataset
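The conversion itself adds 273.15 to each Celsius value and leaves other data unchanged. A minimal sketch on a list of values with a units string; the function name `celsius_to_kelvin` and the set of unit spellings treated as Celsius are assumptions, not the strings the module actually checks.

```python
# Assumed spellings of Celsius; the module may recognize a different set.
CELSIUS_UNITS = {'c', 'celsius', 'degc', 'deg_c'}


def celsius_to_kelvin(values, units):
    """Return (values, units), converted to Kelvin if the units are Celsius."""
    if units.strip().lower() in CELSIUS_UNITS:
        return [v + 273.15 for v in values], 'K'
    # Anything else is passed through untouched.
    return list(values), units


kelvin_values, kelvin_units = celsius_to_kelvin([0.0, 25.0], 'C')
same_values, same_units = celsius_to_kelvin([280.0], 'K')
```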
-
dataset_processor.temporal_rebin(target_dataset, temporal_resolution)
Rebin a Dataset to a new temporal resolution.
Parameters:
- target_dataset (dataset.Dataset) – Dataset object that needs to be temporally rebinned.
- temporal_resolution (string) – The new temporal resolution.
Returns: A new temporally rebinned Dataset.
Return type: dataset.Dataset
-
dataset_processor.temporal_rebin_with_time_index(target_dataset, nt_average)
Rebin a Dataset to a new temporal resolution.
Parameters:
- target_dataset (dataset.Dataset) – Dataset object that needs to be temporally rebinned.
- nt_average – Time resolution for the output datasets. It is the same as the number of time indices to be averaged: (length of time dimension in the rebinned dataset) = (original time dimension length / nt_average).
Returns: A new temporally rebinned Dataset.
Return type: dataset.Dataset
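The relationship between nt_average and the output length can be seen by averaging consecutive blocks of a time series. This sketch uses a flat list of values; the name `rebin_by_index` is hypothetical, and dropping any incomplete trailing block is an assumption about how a remainder would be handled.

```python
def rebin_by_index(values, nt_average):
    """Average consecutive blocks of nt_average time steps.

    Hypothetical helper: the output has len(values) // nt_average entries;
    a trailing block shorter than nt_average is dropped (an assumption).
    """
    n_out = len(values) // nt_average
    return [
        sum(values[i * nt_average:(i + 1) * nt_average]) / nt_average
        for i in range(n_out)
    ]


# Six time steps averaged three at a time give two output steps.
rebinned = rebin_by_index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3)
```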
-
dataset_processor.temporal_slice(target_dataset, start_time, end_time)
Temporally slice given dataset(s) with subregion information. This does not spatially subset the target_dataset.
Parameters:
- start_time (int) – Start time.
- end_time (datetime.datetime) – End time.
- target_dataset (dataset.Dataset) – The Dataset object to subset.
Returns: The subsetted Dataset object.
Return type: dataset.Dataset
Raises: ValueError
-
dataset_processor.temporal_subset(target_dataset, month_start, month_end, average_each_year=False)
Temporally subset data given month_index.
Parameters:
- month_start (int) – An integer for the beginning month (Jan=1).
- month_end (int) – An integer for the ending month (Jan=1).
- target_dataset (Open Climate Workbench Dataset Object) – Dataset object that needs temporal subsetting.
- average_each_year (boolean) – If True, the output dataset is averaged for each year.
Returns: A temporal subset OCW Dataset.
Return type: Open Climate Workbench Dataset Object
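Selecting a range of months and optionally averaging per year can be sketched as follows. The example works on parallel lists of datetimes and values; the name `select_months` is hypothetical, and it assumes month_start <= month_end (a season that wraps across the new year, such as December to February, is not handled here).

```python
from datetime import datetime


def select_months(times, values, month_start, month_end,
                  average_each_year=False):
    """Keep values whose month lies in [month_start, month_end] (Jan=1)."""
    selected = [(t, v) for t, v in zip(times, values)
                if month_start <= t.month <= month_end]
    if not average_each_year:
        return selected

    # Collapse the selected months of each year into a single mean value.
    by_year = {}
    for t, v in selected:
        by_year.setdefault(t.year, []).append(v)
    return [(year, sum(vs) / len(vs)) for year, vs in sorted(by_year.items())]


# Two years of monthly data; values are chosen so the means are easy to read.
times = [datetime(y, m, 1) for y in (2000, 2001) for m in range(1, 13)]
values = [float(t.month + 10 * (t.year - 2000)) for t in times]
summer = select_months(times, values, 6, 8)
summer_means = select_months(times, values, 6, 8, average_each_year=True)
```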
-
dataset_processor.variable_unit_conversion(dataset)
Convert water flux or temperature variable units as necessary.
For water flux variables, convert full SI water flux units to more common units. For temperature, convert Celsius to Kelvin.
Parameters: dataset (dataset.Dataset) – The dataset to convert.
Returns: A Dataset with values converted to new units.
Return type: dataset.Dataset
-
dataset_processor.water_flux_unit_conversion(dataset)
Convert water flux variable units as necessary.
Convert full SI water flux units to more common units.
Parameters: dataset (dataset.Dataset) – The dataset to convert.
Returns: A Dataset with values converted to new units.
Return type: dataset.Dataset
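The documentation does not say which "more common units" are produced. A typical choice, assumed here purely for illustration, is converting kg m-2 s-1 to mm/day: one kilogram of liquid water spread over one square metre is one millimetre deep, so only the time unit changes and the factor is the number of seconds in a day. The name `flux_to_mm_per_day` is hypothetical.

```python
SECONDS_PER_DAY = 86400


def flux_to_mm_per_day(values):
    """Convert water flux from kg m-2 s-1 to mm/day.

    Assumes liquid water at 1000 kg/m^3, so 1 kg m-2 equals 1 mm of depth.
    """
    return [v * SECONDS_PER_DAY for v in values]


converted = flux_to_mm_per_day([0.5, 0.0])
```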
-
dataset_processor.write_netcdf(dataset, path, compress=True)
Write a dataset to a NetCDF file.
Parameters:
- dataset (dataset.Dataset) – The dataset to write.
- path (string) – The output file path.