matchzoo.auto.preparer
Package Contents
- class matchzoo.auto.preparer.Preparer(task: BaseTask, config: typing.Optional[dict] = None)
Bases: object
Unified setup processes of all MatchZoo models.
config controls specific behaviors. If a config dictionary is passed, its entries override the matching entries of the default config. For example, to override the default bin_size, pass config={'bin_size': 15}.
See tutorials/automation.ipynb for a detailed walkthrough on usage.
Default config:
{
    # pair generator builder kwargs
    'num_dup': 1,
    # histogram unit of DRMM
    'bin_size': 30,
    'hist_mode': 'LCH',
    # dynamic Pooling of MatchPyramid
    'compress_ratio_left': 1.0,
    'compress_ratio_right': 1.0,
    # if no matchzoo.Embedding is passed to tune
    'embedding_output_dim': 50
}
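The override behavior described above can be sketched in plain Python. This is a minimal illustration, assuming the user-supplied dictionary is applied as a shallow, key-by-key update on a copy of the defaults; `merge_config` and `DEFAULT_CONFIG` are hypothetical names used only for this sketch and are not part of MatchZoo.

```python
from typing import Optional

# Defaults copied from the "Default config" listing above.
DEFAULT_CONFIG = {
    'num_dup': 1,
    'bin_size': 30,
    'hist_mode': 'LCH',
    'compress_ratio_left': 1.0,
    'compress_ratio_right': 1.0,
    'embedding_output_dim': 50,
}


def merge_config(override: Optional[dict] = None) -> dict:
    """Hypothetical helper: return the defaults with `override` applied on top."""
    merged = dict(DEFAULT_CONFIG)  # copy so the defaults stay untouched
    if override:
        merged.update(override)  # assumption: a shallow, key-by-key update
    return merged


config = merge_config({'bin_size': 15})
print(config['bin_size'])   # 15
print(config['hist_mode'])  # LCH
```

Under this assumption, keys that are not overridden keep their default values, so only the settings that differ need to be passed.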
Parameters: - task – Task.
- config – Configuration of specific behaviors.
Example
>>> import matchzoo as mz
>>> task = mz.tasks.Ranking(losses=mz.losses.RankCrossEntropyLoss())
>>> preparer = mz.auto.Preparer(task)
>>> model_class = mz.models.DenseBaseline
>>> train_raw = mz.datasets.toy.load_data('train', 'ranking')
>>> model, prpr, dsb, dlb = preparer.prepare(model_class,
...                                          train_raw)
>>> model.params.completed()
True
- prepare(self, model_class: typing.Type[BaseModel], data_pack: mz.DataPack, callback: typing.Optional[BaseCallback] = None, preprocessor: typing.Optional[BasePreprocessor] = None, embedding: typing.Optional['mz.Embedding'] = None)
Prepare the model, preprocessor, dataset builder, and dataloader builder for model_class.
Parameters: - model_class – Model class.
- data_pack – DataPack used to fit the preprocessor.
- callback – Callback used to pad a batch. (default: the default callback of model_class)
- preprocessor – Preprocessor used to fit the data_pack. (default: the default preprocessor of model_class)
- embedding – Embedding used to build an embedding matrix. If not set, a correctly shaped randomized matrix will be built.
Returns: A tuple of (model, preprocessor, dataset_builder, dataloader_builder).
- _build_model(self, model_class, preprocessor, embedding)
- _build_matrix(self, preprocessor, embedding)
- _build_dataset_builder(self, model, embedding_matrix, preprocessor)
- _build_dataloader_builder(self, model, callback)
- _infer_num_neg(self)
- classmethod get_default_config(cls)
Default config getter.
- matchzoo.auto.preparer.prepare(task: BaseTask, model_class: typing.Type[BaseModel], data_pack: mz.DataPack, callback: typing.Optional[BaseCallback] = None, preprocessor: typing.Optional[BasePreprocessor] = None, embedding: typing.Optional['mz.Embedding'] = None, config: typing.Optional[dict] = None)
A simple shorthand for using matchzoo.Preparer.
config controls specific behaviors. If a config dictionary is passed, its entries override the matching entries of the default config. For example, to override the default bin_size, pass config={'bin_size': 15}.
Parameters: - task – Task.
- model_class – Model class.
- data_pack – DataPack used to fit the preprocessor.
- callback – Callback used to pad a batch. (default: the default callback of model_class)
- preprocessor – Preprocessor used to fit the data_pack. (default: the default preprocessor of model_class)
- embedding – Embedding used to build an embedding matrix. If not set, a correctly shaped randomized matrix will be built.
- config – Configuration of specific behaviors. (default: return value of mz.Preparer.get_default_config())
Returns: A tuple of (model, preprocessor, dataset_builder, dataloader_builder), the same tuple that Preparer.prepare returns.
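As a rough illustration of what "a simple shorthand" means here, the sketch below shows a module-level function that constructs a preparer from task and config and then delegates the remaining arguments to its prepare method. This is an assumption about the general shape, not the library's actual source; `StubPreparer` is a hypothetical stand-in that only echoes its inputs instead of building real MatchZoo objects.

```python
from typing import Any, Optional


class StubPreparer:
    """Hypothetical stand-in for the Preparer class; it only records inputs."""

    def __init__(self, task: Any, config: Optional[dict] = None):
        self.task = task
        self.config = {'bin_size': 30}  # trimmed-down defaults for the sketch
        if config:
            self.config.update(config)

    def prepare(self, model_class, data_pack, callback=None,
                preprocessor=None, embedding=None) -> dict:
        # The real method builds and returns the prepared objects;
        # this stub just echoes what it received.
        return {
            'task': self.task,
            'config': self.config,
            'model_class': model_class,
            'data_pack': data_pack,
            'callback': callback,
            'preprocessor': preprocessor,
            'embedding': embedding,
        }


def prepare(task, model_class, data_pack, callback=None,
            preprocessor=None, embedding=None, config=None):
    """Assumed shape of the shorthand: build a preparer, then delegate."""
    preparer = StubPreparer(task=task, config=config)
    return preparer.prepare(model_class, data_pack, callback=callback,
                            preprocessor=preprocessor, embedding=embedding)


result = prepare('ranking-task', 'DenseBaseline', 'train-pack',
                 config={'bin_size': 15})
print(result['config'])  # {'bin_size': 15}
```

The practical difference is only where task and config are supplied: the class takes them at construction time and can be reused for several prepare calls, while the function takes everything in a single call.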