skeltorch.Data

class skeltorch.Data

Skeltorch data class.

Class used to store data-related information such as path references, data features or even raw data. Used to provide a transparent bridge between the file system and the pipelines.

You must extend this class and implement its abstract methods. See the examples for real implementations of skeltorch.Data subclasses.

experiment

Experiment object.

Type: skeltorch.Experiment
logger

Logger object.

Type: logging.Logger
datasets

Dictionary containing the datasets of the train, validation and test splits. To be loaded using load_datasets(). Default: self.datasets = {'train': None, 'validation': None, 'test': None}.

Type: dict
loaders

Dictionary containing the loaders of the train, validation and test splits. To be loaded using load_loaders(). Default: self.loaders = {'train': None, 'validation': None, 'test': None}.

Type: dict
create(self, data_path)

Initializes data-related attributes required in the experiment.

The purpose of this method is to create all data-related parameters that are expensive to compute or that should be computed only once per experiment. It is called during the creation of a new experiment.

Some examples of this type of task are:

  • Given a set of data samples, create appropriate splits.
  • Compute the mean and standard deviation of a set of data to normalize it.
  • Compute features of the data that would be too expensive to recompute on every iteration.

To preserve data, you must store it as a class attribute. It will be saved automatically by the save() method during the execution of the init pipeline.

Parameters: data_path (str) – --data-path command argument.
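
The tasks listed for create() can be sketched as follows. This is only an illustration under stated assumptions: ExampleData is a hypothetical stand-in that does not inherit from skeltorch.Data (so it runs without the library installed), the samples are synthetic rather than read from data_path, and the attribute names splits, mean and std are invented for the example.

```python
import random
import statistics


class ExampleData:
    """Hypothetical stand-in mirroring the create() hook of skeltorch.Data."""

    def __init__(self):
        self.splits = None
        self.mean = None
        self.std = None

    def create(self, data_path):
        # A real implementation would read samples from data_path; synthetic
        # values are used here so the sketch runs without any files.
        samples = [float(i) for i in range(100)]

        # Shuffle indices with a fixed seed so the splits are reproducible.
        indices = list(range(len(samples)))
        random.Random(0).shuffle(indices)

        # 80/10/10 split, stored as lists of sample indices.
        self.splits = {
            'train': indices[:80],
            'validation': indices[80:90],
            'test': indices[90:],
        }

        # Normalization statistics computed on the training split only.
        train_values = [samples[i] for i in self.splits['train']]
        self.mean = statistics.mean(train_values)
        self.std = statistics.pstdev(train_values)


data = ExampleData()
data.create(data_path='unused-in-this-sketch')
```

Because the results are stored as instance attributes, they are the kind of state that the documentation says is persisted by save() after the experiment is created.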
init(self, experiment, logger)

Lazy-loading of skeltorch.Data attributes.

Parameters:
  • experiment (skeltorch.Experiment) – Experiment object.
  • logger (logging.Logger) – Logger object.
load(self, data_path, data_file_path, num_workers)

Loads class attributes from the binary file stored in data_file_path.

Parameters:
  • data_path (str) – --data-path command argument.
  • data_file_path (str) – Path where the binary file is stored.
  • num_workers (int) – Number of workers to use in the loaders.
load_datasets(self, data_path)

Loads the attribute self.datasets.

Creates and stores inside self.datasets the torch.utils.data.Dataset objects of the project.

Parameters: data_path (str) – --data-path command argument.
load_loaders(self, data_path, num_workers)

Loads the attribute self.loaders.

Creates and stores inside self.loaders the torch.utils.data.DataLoader objects of the project.

Parameters:
  • data_path (str) – --data-path command argument.
  • num_workers (int) – Number of workers to use in the loaders.
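
A minimal sketch of how load_datasets() and load_loaders() might populate the two dictionaries, assuming PyTorch is installed. ExampleData is again a hypothetical stand-in rather than a real skeltorch.Data subclass, the tensors are random placeholders instead of data read from data_path, and the batch size of 8 is an arbitrary choice for the example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


class ExampleData:
    """Hypothetical stand-in mirroring two hooks of skeltorch.Data."""

    def __init__(self):
        # Same defaults as documented for the datasets and loaders attributes.
        self.datasets = {'train': None, 'validation': None, 'test': None}
        self.loaders = {'train': None, 'validation': None, 'test': None}

    def load_datasets(self, data_path):
        # Placeholder data: a real implementation would read from data_path.
        sizes = {'train': 32, 'validation': 8, 'test': 8}
        for split, size in sizes.items():
            features = torch.randn(size, 4)
            labels = torch.zeros(size, dtype=torch.long)
            self.datasets[split] = TensorDataset(features, labels)

    def load_loaders(self, data_path, num_workers):
        # One DataLoader per split; only the training split is shuffled.
        for split, dataset in self.datasets.items():
            self.loaders[split] = DataLoader(
                dataset,
                batch_size=8,
                shuffle=(split == 'train'),
                num_workers=num_workers,
            )


data = ExampleData()
data.load_datasets(data_path='unused-in-this-sketch')
data.load_loaders(data_path='unused-in-this-sketch', num_workers=0)
features, labels = next(iter(data.loaders['train']))
```

The loaders are built from self.datasets, so in this sketch load_datasets() has to run before load_loaders().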
save(self, data_file_path: str)

Saves class attributes inside a binary file stored in data_file_path.

Parameters:data_file_path (str) – Path where the binary file will be stored.
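
The save() and load() round trip can be illustrated with a small sketch. It assumes pickle as the binary format and assumes that experiment, logger, datasets and loaders are left out of the file; the documentation above states neither, so treat both as guesses. ExampleData is a hypothetical stand-in, and its load() ignores data_path and num_workers, which a real implementation would presumably use to rebuild the datasets and loaders.

```python
import os
import pickle
import tempfile


class ExampleData:
    """Hypothetical stand-in mirroring save() and load() of skeltorch.Data."""

    # Attributes assumed (not confirmed by the docs) to be skipped on save.
    _transient = ('experiment', 'logger', 'datasets', 'loaders')

    def __init__(self):
        self.experiment = None
        self.logger = None
        self.datasets = {'train': None, 'validation': None, 'test': None}
        self.loaders = {'train': None, 'validation': None, 'test': None}

    def save(self, data_file_path):
        # Collect every instance attribute except the transient ones.
        state = {
            name: value
            for name, value in self.__dict__.items()
            if name not in self._transient
        }
        with open(data_file_path, 'wb') as data_file:
            pickle.dump(state, data_file)

    def load(self, data_path, data_file_path, num_workers):
        # Restore the saved attributes; data_path and num_workers are unused
        # in this sketch.
        with open(data_file_path, 'rb') as data_file:
            self.__dict__.update(pickle.load(data_file))


original = ExampleData()
original.mean = 0.5
original.splits = {'train': [0, 1], 'validation': [2], 'test': [3]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.pkl')
    original.save(path)
    restored = ExampleData()
    restored.load(data_path='unused', data_file_path=path, num_workers=0)
```

After the round trip, restored carries the same mean and splits values as original, while its datasets and loaders keep their default None entries.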