matchzoo.auto.preparer
Package Contents
Functions
- class matchzoo.auto.preparer.Preparer(task: BaseTask, config: typing.Optional[dict] = None)
Bases: object
Unified setup processes of all MatchZoo models.
config controls specific behaviors. If a config dictionary is passed, the default config is updated with its entries. For example, to override the default bin_size, pass config={'bin_size': 15}.
See tutorials/automation.ipynb for a detailed walkthrough on usage.
Default config:
{
    # pair generator builder kwargs
    'num_dup': 1,
    # histogram unit of DRMM
    'bin_size': 30,
    'hist_mode': 'LCH',
    # dynamic pooling of MatchPyramid
    'compress_ratio_left': 1.0,
    'compress_ratio_right': 1.0,
    # if no matchzoo.Embedding is passed to tune
    'embedding_output_dim': 50
}
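As a minimal sketch of the override behavior described above, the snippet below merges a user config into a copy of the defaults. It assumes the override is a shallow dictionary update, which is what "updated" suggests. `DEFAULT_CONFIG` and `merge_config` are hypothetical names used for illustration and are not part of MatchZoo.

```python
import copy
from typing import Optional

# Values copied from the default config listed above.
DEFAULT_CONFIG = {
    'num_dup': 1,
    'bin_size': 30,
    'hist_mode': 'LCH',
    'compress_ratio_left': 1.0,
    'compress_ratio_right': 1.0,
    'embedding_output_dim': 50,
}


def merge_config(config: Optional[dict] = None) -> dict:
    """Return the defaults with any user-supplied keys overriding them.

    Hypothetical helper: assumes the override is a shallow dict update.
    """
    merged = copy.deepcopy(DEFAULT_CONFIG)  # never mutate the shared defaults
    if config:
        merged.update(config)
    return merged


merged = merge_config({'bin_size': 15})
print(merged['bin_size'])   # value supplied by the caller
print(merged['hist_mode'])  # untouched default
```

Keys that are not passed keep their default values, so a caller only needs to list the settings that differ.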
- Parameters
task – Task.
config – Configuration of specific behaviors.
Example
>>> import matchzoo as mz
>>> task = mz.tasks.Ranking(losses=mz.losses.RankCrossEntropyLoss())
>>> preparer = mz.auto.Preparer(task)
>>> model_class = mz.models.DenseBaseline
>>> train_raw = mz.datasets.toy.load_data('train', 'ranking')
>>> model, prpr, dsb, dlb = preparer.prepare(model_class,
...                                          train_raw)
>>> model.params.completed(exclude=['out_activation_func'])
True
- prepare(self, model_class: typing.Type[BaseModel], data_pack: mz.DataPack, callback: typing.Optional[BaseCallback] = None, preprocessor: typing.Optional[BasePreprocessor] = None, embedding: typing.Optional['mz.Embedding'] = None) → typing.Tuple[BaseModel, BasePreprocessor, DatasetBuilder, DataLoaderBuilder]
Prepare a model together with its preprocessor, dataset builder, and dataloader builder.
- Parameters
model_class – Model class.
data_pack – DataPack used to fit the preprocessor.
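callback – Callback used to pad a batch. (default: the default callback of model_class)
preprocessor – Preprocessor to fit on the data_pack. (default: the default preprocessor of model_class)
embedding – Embedding used to build an embedding matrix. If not set, a correctly shaped randomized matrix is built.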
- Returns
A tuple of (model, preprocessor, dataset_builder, dataloader_builder).
- _build_model(self, model_class, preprocessor, embedding) → typing.Tuple[BaseModel, np.ndarray]
- _build_matrix(self, preprocessor, embedding)
- _build_dataset_builder(self, model, embedding_matrix, preprocessor)
- _build_dataloader_builder(self, model, callback)
- _infer_num_neg(self)
- classmethod get_default_config(cls) → dict
Default config getter.
- matchzoo.auto.preparer.prepare(task: BaseTask, model_class: typing.Type[BaseModel], data_pack: mz.DataPack, callback: typing.Optional[BaseCallback] = None, preprocessor: typing.Optional[BasePreprocessor] = None, embedding: typing.Optional['mz.Embedding'] = None, config: typing.Optional[dict] = None)
A simple shorthand for using matchzoo.Preparer.
config controls specific behaviors. If a config dictionary is passed, the default config is updated with its entries. For example, to override the default bin_size, pass config={'bin_size': 15}.
- Parameters
task – Task.
model_class – Model class.
data_pack – DataPack used to fit the preprocessor.
callback – Callback used to pad a batch. (default: the default callback of model_class)
preprocessor – Preprocessor to fit on the data_pack. (default: the default preprocessor of model_class)
embedding – Embedding used to build an embedding matrix. If not set, a correctly shaped randomized matrix is built.
config – Configuration of specific behaviors. (default: return value of mz.Preparer.get_default_config())
- Returns
A tuple of (model, preprocessor, dataset_builder, dataloader_builder), matching the return value of Preparer.prepare.
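The embedding fallback described in the parameter list above can be pictured with the following sketch. It assumes the matrix has one row per vocabulary term and embedding_output_dim columns (50 by default), and that values are drawn uniformly from a small symmetric range. The function name, the vocab_size argument, and the range are illustrative assumptions, not MatchZoo's actual implementation.

```python
import numpy as np


def random_embedding_matrix(vocab_size: int, output_dim: int = 50,
                            scale: float = 0.2, seed: int = 0) -> np.ndarray:
    """Build a (vocab_size, output_dim) matrix of uniform random values.

    Hypothetical helper: the distribution and its range are assumptions.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(-scale, scale, size=(vocab_size, output_dim))


matrix = random_embedding_matrix(vocab_size=1000)
print(matrix.shape)  # one row per term, one column per embedding dimension
```

Changing embedding_output_dim in the config would, under the same assumption, change only the number of columns.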