matchzoo.engine.base_preprocessor

BasePreprocessor defines input and output for processors.

Module Contents

matchzoo.engine.base_preprocessor.validate_context(func)

Validate context in the preprocessor.
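As a rough illustration of what such a decorator might look like, the sketch below assumes that "validating" means refusing to run a method while the preprocessor's context is still empty. The class ToyPreprocessor and the error message are hypothetical stand-ins and are not taken from the library.

```python
import functools


def validate_context(func):
    """Refuse to run `func` until the preprocessor has a non-empty context."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Assumption: an empty context means `fit` has not been called yet.
        if not self.context:
            raise ValueError("Please call `fit` before calling `transform`.")
        return func(self, *args, **kwargs)

    return wrapper


class ToyPreprocessor:
    """Hypothetical stand-in, not a MatchZoo class."""

    def __init__(self):
        self.context = {}

    def fit(self, texts):
        # Record a simple statistic so that the context is no longer empty.
        self.context["vocab_size"] = len({w for t in texts for w in t.split()})
        return self

    @validate_context
    def transform(self, texts):
        return [t.lower() for t in texts]
```

With this shape, calling transform on an unfitted ToyPreprocessor raises ValueError, while the same call after fit goes through.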

class matchzoo.engine.base_preprocessor.BasePreprocessor

BasePreprocessor to handle input data.

A preprocessor should be used in two steps: first fit, then transform. fit collects information into context, which includes everything the preprocessor needs to transform, together with other useful information for later use. fit only changes the preprocessor’s inner state and leaves the input data untouched. In contrast, transform returns a modified copy of the input data without changing the preprocessor’s inner state.
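The two-step contract can be sketched with a toy class. This is only an illustration under stated assumptions: ToyIndexer is hypothetical, plain lists of strings stand in for a DataPack, and the "term_index" key is an invented example of what a context might hold.

```python
class ToyIndexer:
    """Hypothetical preprocessor; lists of strings stand in for a DataPack."""

    def __init__(self):
        self._context = {}

    @property
    def context(self):
        return self._context

    def fit(self, texts, verbose=1):
        # fit only updates the inner state; the input is left untouched.
        vocab = sorted({w.lower() for t in texts for w in t.split()})
        self._context["term_index"] = {w: i for i, w in enumerate(vocab)}
        return self  # returning self allows fit(...).transform(...)

    def transform(self, texts, verbose=1):
        # transform builds a new list and does not modify the context.
        index = self._context["term_index"]
        return [
            [index[w.lower()] for w in t.split() if w.lower() in index]
            for t in texts
        ]

    def fit_transform(self, texts, verbose=1):
        return self.fit(texts, verbose=verbose).transform(texts, verbose=verbose)


train = ["Hello world", "hello there"]
pre = ToyIndexer()
encoded = pre.fit(train).transform(train)
```

Here fit fills the context, transform reads it, and the original train list is never modified.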

DATA_FILENAME = preprocessor.dill
context

Return context.

fit(self, data_pack:'mz.DataPack', verbose:int=1)

Fit parameters on input data.

This is an abstract method and must be implemented in the child class.

This method is expected to return the preprocessor itself, so that calls can be chained.

Parameters:
  • data_pack – DataPack object to be fitted.
  • verbose – Verbosity.
transform(self, data_pack:'mz.DataPack', verbose:int=1)

Transform input data into the expected format.

This is an abstract method and must be implemented in the child class.

Parameters:
  • data_pack – DataPack object to be transformed.
  • verbose – Verbosity.
fit_transform(self, data_pack:'mz.DataPack', verbose:int=1)

Call fit, then transform, on the same data.

Parameters:
  • data_pack – DataPack object to be processed.
  • verbose – Verbosity.
save(self, dirpath:typing.Union[str, Path])

Save the preprocessor object.

A saved preprocessor is represented as a directory containing the context object (the parameters fitted on training data), which is serialized by pickling.

Parameters: dirpath – directory path of the saved preprocessor.
classmethod _default_units(cls)

Prepare needed process units.

matchzoo.engine.base_preprocessor.load_preprocessor(dirpath:typing.Union[str, Path]) → 'mz.DataPack'

Load the fitted context. The reverse function of save().

Parameters: dirpath – directory path of the saved preprocessor.
Returns: a preprocessor instance.
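A save and load round trip might look roughly like the following. This is a dependency-free sketch, not the library's code: it uses the standard pickle module, stores only the context dictionary, and rebuilds a fresh object on load. ToyPreprocessor and the directory name my_preprocessor are hypothetical; only the file name comes from DATA_FILENAME above.

```python
import pickle
import tempfile
from pathlib import Path

DATA_FILENAME = "preprocessor.dill"  # file name taken from the class attribute


class ToyPreprocessor:
    """Hypothetical stand-in for a fitted preprocessor."""

    def __init__(self):
        self.context = {}


def save(preprocessor, dirpath):
    # Assumption: one directory per saved preprocessor, holding a single file.
    dirpath = Path(dirpath)
    dirpath.mkdir(parents=True, exist_ok=True)
    with open(dirpath / DATA_FILENAME, "wb") as f:
        pickle.dump(preprocessor.context, f)


def load_preprocessor(dirpath):
    # Reverse of save(): read the context back and attach it to a new object.
    with open(Path(dirpath) / DATA_FILENAME, "rb") as f:
        context = pickle.load(f)
    restored = ToyPreprocessor()
    restored.context = context
    return restored


pre = ToyPreprocessor()
pre.context["vocab_size"] = 3

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "my_preprocessor"
    save(pre, target)
    saved_file_exists = (target / DATA_FILENAME).exists()
    restored = load_preprocessor(target)
```

After the round trip, restored carries the same fitted context as pre, so it can be used to transform new data without fitting again.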