Subsampler

class PlasmaCalcs.dimensions.subsampling.Subsampler(target, info)

Bases: object

interface that does the actual subsampling of a Subsamplable object.

target: Subsamplable
    the Subsamplable object which self will subsample when called.
info: SubsamplingInfo, str, or json-like dict
    str --> path to load subsampling info from.
        can be filename or directory. see SubsamplingInfo.load for details.
    dict --> subsampling info in json-like format, see SubsamplingInfo for details.

Below, "snap src" refers to the hashable object that identifies a snapshot.
Its exact behavior is not defined here and may vary across different Subsamplable subclasses.
E.g. it may be a filepath, or a Snap object.
The only expectation here is that each src must be hashable.

Expects Subsamplable target object to have:
    - rawvars_loadable(src): list of all directly loadable vars within snap src.
    - rawvar_load(var, src): load a single raw var from snap src.
        OR override rawvar_load_and_subsample instead.
    - rawvar_save(var, data, dst): save a single raw var to snap dst.
    - target.snapdir: should tell abspath of folder with all snapshots.
    - target.snaps.file_path(self): should tell array of snapshot filepaths.
    - target.subsampling_info_cls: SubsamplingInfo subclass to use for this target.

Results will be saved to subsampling result paths; see SubsamplingResultPathManager for help.
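
As a rough illustration of the method part of the interface listed above, the sketch below defines a toy in-memory target. ToyTarget, its store argument, and its saved attribute are hypothetical names used only for this example and are not part of PlasmaCalcs. target.snaps.file_path and target.subsampling_info_cls are left out, so this object would not be enough to construct a real Subsampler.

```python
import os


class ToyTarget:
    """Hypothetical stand-in for a Subsamplable target (not a PlasmaCalcs class)."""

    def __init__(self, snapdir, store):
        self.snapdir = os.path.abspath(snapdir)  # abspath of folder with all snapshots
        self._store = store  # assumed layout for this sketch: {src: {var: data}}
        self.saved = {}      # records what rawvar_save received: {dst: {var: data}}

    def rawvars_loadable(self, src):
        """list of all directly loadable vars within snap src."""
        return list(self._store[src])

    def rawvar_load(self, var, src):
        """load a single raw var from snap src."""
        return self._store[src][var]

    def rawvar_save(self, var, data, dst):
        """save a single raw var to snap dst (here: into an in-memory dict)."""
        self.saved.setdefault(dst, {})[var] = data


# srcs here are plain strings; any hashable object would do.
store = {
    "snap0": {"n": [1.0, 2.0], "T": [10.0, 20.0]},
    "snap1": {"n": [3.0, 4.0]},
}
target = ToyTarget("snaps", store)
loadable = target.rawvars_loadable("snap0")
target.rawvar_save("n", target.rawvar_load("n", "snap1"), "out/snap1")
```

Keeping the load and save calls per-var and per-src, as in the list above, means a subsampler never needs to know how a given target stores its snapshots.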

Methods

dst_from_src(src)

return filepath for where to save subsampled data for this src.

path_manager(*[, make, exist_ok])

returns instance of path_manager_cls, based on self.target.snapdir.

srcs()

returns list of all snap srcs in self.target (before applying any subsampling)

srcs_and_src2vars()

returns (srcs, src2vars): the snap srcs to keep, and the vars to keep for each src.

srcs_byvars()

dict of {var: [srcs in which to keep data for var]}

srcs_forall()

list of all srcs, subsampled by self.info['forall']['snaps'].

subsample(*[, subsampled_data_exist_ok])

apply subsampling to all data as appropriate (determined by self.info)

subsample_snap(src, *[, dst, dst_exists_ok, ...])

apply subsampling to all appropriate data vars in this snap.

dst_from_src(src)

return filepath for where to save subsampled data for this src.

Details determined by self.path_manager() and self.target.snap_src_to_filepath().

path_manager(*, make=True, exist_ok=True)

returns instance of path_manager_cls, based on self.target.snapdir.

Use this when creating subsampled data.

make: whether to create the 'subsampling_result' folder if it doesn't exist yet.
exist_ok: if False and any folders to make already exist, crash instead.

path_manager_cls

alias of SubsamplingResultPathManager

srcs()

returns list of all snap srcs in self.target (before applying any subsampling)

srcs_and_src2vars()

returns (srcs, src2vars), where:

srcs = [list of all snap srcs to keep]
src2vars = dict of {src: [vars to keep for src]} if relevant.
if keeping all vars from src, src2vars might exclude src key.
(if keeping all vars from all srcs, src2vars may be an empty dict.)
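
A small sketch of how the (srcs, src2vars) pair described above might be consumed. The helper vars_to_keep and the sample values are hypothetical and only illustrate the documented convention that a src missing from src2vars keeps all of its vars.

```python
def vars_to_keep(src, src2vars, all_vars):
    """vars to keep for src; a src absent from src2vars keeps all of its vars."""
    return list(src2vars.get(src, all_vars))


# Example values (made up for illustration).
srcs = ["snap0", "snap2", "snap4"]   # all snap srcs to keep
src2vars = {"snap2": ["n"]}          # only snap2 drops some vars
all_vars = {src: ["n", "T"] for src in srcs}  # vars available in each src

kept = {src: vars_to_keep(src, src2vars, all_vars[src]) for src in srcs}
```

With an empty src2vars, the same helper keeps every var for every src, matching the "may be an empty dict" case.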

srcs_byvars()

dict of {var: [srcs in which to keep data for var]}

Subsampling will be relative to the full list of snaps;
raise SubsamplingError if it would need to write var at a snap where it was previously missing.
E.g. crash if var5 appears only every 5 snaps but info says slice to every 7 snaps for var5.
Any var with no snap subsampling indicated will be kept in all srcs where it was already.
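
The "every 5 snaps versus every 7 snaps" case above can be worked through with a short sketch. It assumes that "slice to every N snaps" means a plain stride over the full list of snaps; the actual info format is defined by SubsamplingInfo and is not reproduced here. SubsamplingError is redefined locally and srcs_for_var is a hypothetical helper, so that the example runs without PlasmaCalcs.

```python
class SubsamplingError(Exception):
    """Local stand-in for the library's error, so this sketch is self-contained."""


def srcs_for_var(all_srcs, srcs_with_var, step):
    """srcs in which to keep var, striding over the FULL list of snaps."""
    wanted = all_srcs[::step]
    have = set(srcs_with_var)
    missing = [src for src in wanted if src not in have]
    if missing:
        # Keeping var here would mean writing it at a snap where it never existed.
        raise SubsamplingError(f"var is missing at srcs {missing}")
    return wanted


all_srcs = list(range(36))    # snaps 0..35 (integers stand in for srcs)
var5_srcs = all_srcs[::5]     # var5 exists only at snaps 0, 5, 10, ..., 35

every_10 = srcs_for_var(all_srcs, var5_srcs, 10)  # 0, 10, 20, 30 all contain var5

try:
    srcs_for_var(all_srcs, var5_srcs, 7)  # snap 7 does not contain var5
    every_7_failed = False
except SubsamplingError:
    every_7_failed = True
```

A stride of 10 works because every tenth snap is also a fifth snap, whereas a stride of 7 already fails at snap 7.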

srcs_forall()

list of all srcs, subsampled by self.info['forall']['snaps'].

subsample(*, subsampled_data_exist_ok=False)

apply subsampling to all data as appropriate (determined by self.info)

returns self.path_manager() instance with relevant paths.
(results will be saved to paths determined by SubsamplingResultPathManager;
probably 'subsampling_result' at the same directory-level as target.snapdir.)
Never overwrites any of the pre-subsampling data.
By default, refuses to overwrite any existing data at all.

subsampled_data_exist_ok: bool
    whether it is okay for subsampled data file(s) to already exist before saving any data. Default False.
    (Doesn't interact with any subsampling_info files.)

subsample_snap(src, *, dst=None, dst_exists_ok=False, include_vars=None)

apply subsampling to all appropriate data vars in this snap.

returns abspath to dst where data was saved (or None if no data was saved).

src: any hashable object
    snap src to subsample.
dst: None or str
    filepath for where to save subsampled data.
    None --> self.dst_from_src(src)
include_vars: None or list of strs
    list of vars to include in result.
    None --> use self.target.rawvars_loadable(src)
        (and do not apply any 'snaps' subsampling from subsampling_info)
    if empty list, do not save any vars, and return None instead of path.
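
The defaults described above for dst and include_vars can be mimicked with a toy function. This is only a sketch of the documented contract, not the library's implementation: toy_subsample_snap, the snap and saved dicts, and the placeholder destination path are all hypothetical, the data is copied unchanged, and the returned path is not made absolute.

```python
def toy_subsample_snap(snap, src, saved, *, dst=None, include_vars=None):
    """snap: {var: data} for this src. saved: collects results as {dst: {var: data}}."""
    if dst is None:
        dst = f"subsampling_result/{src}"  # placeholder for self.dst_from_src(src)
    if include_vars is None:
        include_vars = list(snap)          # placeholder for target.rawvars_loadable(src)
    if len(include_vars) == 0:
        return None                        # nothing saved, so no path is returned
    for var in include_vars:
        saved.setdefault(dst, {})[var] = snap[var]  # copied as-is in this sketch
    return dst


snap = {"n": [1.0, 2.0], "T": [10.0, 20.0]}
saved = {}
path_all = toy_subsample_snap(snap, "snap0", saved)
path_none = toy_subsample_snap(snap, "snap1", saved, include_vars=[])
path_some = toy_subsample_snap(snap, "snap2", saved, dst="out/snap2", include_vars=["T"])
```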