`matchzoo.dataloader.callbacks`

Package Contents

Classes

`LambdaCallback`	Shorthand for creating a callback class.
`Histogram`	Generate data with a matching histogram.
`Ngram`	Generate the character n-grams for data.
`BasicPadding`	Pad data for the basic preprocessor.
`DRMMPadding`	Pad data for the DRMM model.
`BertPadding`	Pad data for the BERT preprocessor.

class matchzoo.dataloader.callbacks.LambdaCallback(on_batch_data_pack=None, on_batch_unpacked=None)

Bases: matchzoo.engine.base_callback.BaseCallback

A shorthand for creating a callback class from plain functions.

See matchzoo.engine.base_callback.BaseCallback for more details.

Example

>>> import matchzoo as mz
>>> from matchzoo.dataloader.callbacks import LambdaCallback
>>> data = mz.datasets.toy.load_data()
>>> batch_func = lambda x: print(type(x))
>>> unpack_func = lambda x, y: print(type(x), type(y))
>>> callback = LambdaCallback(on_batch_data_pack=batch_func,
...                           on_batch_unpacked=unpack_func)
>>> dataset = mz.dataloader.Dataset(
...     data, callbacks=[callback])
>>> _ = dataset[0]
<class 'matchzoo.data_pack.data_pack.DataPack'>
<class 'dict'> <class 'numpy.ndarray'>

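on_batch_data_pack(self, data_pack): Call the on_batch_data_pack function passed to the constructor, if one was given, with the batch DataPack.

on_batch_unpacked(self, x, y): Call the on_batch_unpacked function passed to the constructor, if one was given, with the unpacked inputs x and labels y.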
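The forwarding idea behind such a shorthand can be sketched without MatchZoo installed. The class below, `SketchLambdaCallback`, is a hypothetical stand-in written for illustration, not the library's implementation; it only assumes that each hook is optional and is skipped when no function was supplied.

```python
class SketchLambdaCallback:
    """Hypothetical stand-in (not MatchZoo's class): wraps two optional functions."""

    def __init__(self, on_batch_data_pack=None, on_batch_unpacked=None):
        self._on_batch_data_pack = on_batch_data_pack
        self._on_batch_unpacked = on_batch_unpacked

    def on_batch_data_pack(self, data_pack):
        # Forward the hook only if a function was supplied.
        if self._on_batch_data_pack is not None:
            self._on_batch_data_pack(data_pack)

    def on_batch_unpacked(self, x, y):
        # Same pattern for the unpacked (x, y) hook.
        if self._on_batch_unpacked is not None:
            self._on_batch_unpacked(x, y)


seen = []
callback = SketchLambdaCallback(
    on_batch_unpacked=lambda x, y: seen.append((sorted(x), len(y)))
)
callback.on_batch_data_pack("ignored")  # no function supplied, so nothing happens
callback.on_batch_unpacked({"text_left": [1, 2], "text_right": [3]}, [0.0])
```

Supplying only one of the two functions is enough; the other hook becomes a no-op.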

class matchzoo.dataloader.callbacks.Histogram(embedding_matrix: np.ndarray, bin_size: int = 30, hist_mode: str = 'CH')

Bases: matchzoo.engine.base_callback.BaseCallback

Generate data with matching histogram.

Parameters

embedding_matrix – The embedding matrix used to generate the matching histogram.
bin_size – The number of bins in the histogram.
hist_mode – The mode of the MatchingHistogramUnit, one of CH (count-based), NH (normalized), or LCH (log-count-based).

on_batch_unpacked(self, x, y): Insert match_histogram into x.
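To make the three modes concrete, here is a minimal pure-Python sketch of a matching histogram. It is an illustration under stated assumptions, not the MatchingHistogramUnit code: the function name `matching_histogram` is hypothetical, term vectors are assumed to be L2-normalized so dot products fall in [-1, 1], the bin index formula is one plausible choice, and LCH is computed with `log1p` so that empty bins stay at zero.

```python
import math


def matching_histogram(left_vecs, right_vecs, bin_size=30, hist_mode="CH"):
    """Return one histogram (a list of bin_size floats) per left-side term."""
    if hist_mode not in ("CH", "NH", "LCH"):
        raise ValueError(f"unknown hist_mode: {hist_mode}")
    histograms = []
    for left in left_vecs:
        counts = [0.0] * bin_size
        for right in right_vecs:
            # Dot product of (assumed) unit vectors, clamped for float safety.
            sim = sum(a * b for a, b in zip(left, right))
            sim = max(-1.0, min(1.0, sim))
            # Map [-1, 1] onto bin indices 0 .. bin_size - 1 (assumed formula).
            index = int((sim + 1.0) / 2.0 * (bin_size - 1))
            counts[index] += 1.0
        if hist_mode == "NH":
            total = sum(counts)
            counts = [c / total for c in counts] if total else counts
        elif hist_mode == "LCH":
            counts = [math.log1p(c) for c in counts]  # assumption: log(1 + count)
        histograms.append(counts)
    return histograms


left = [[1.0, 0.0]]
right = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
ch = matching_histogram(left, right, bin_size=5, hist_mode="CH")
nh = matching_histogram(left, right, bin_size=5, hist_mode="NH")
lch = matching_histogram(left, right, bin_size=5, hist_mode="LCH")
```

In this toy input the three similarities are 1, 0 and -1, so the count-based histogram has one hit each in the last, middle and first bins.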

class matchzoo.dataloader.callbacks.Ngram(preprocessor: mz.preprocessors.BasicPreprocessor, mode: str = 'index')

Bases: matchzoo.engine.base_callback.BaseCallback

Generate the character n-grams for data.

Parameters

preprocessor – The fitted BasicPreprocessor object, which contains the n-gram unit information.
mode – One of 'index', 'onehot', 'sum' or 'aggregate'.

Example

>>> import matchzoo as mz
>>> from matchzoo.dataloader.callbacks import Ngram
>>> data = mz.datasets.toy.load_data()
>>> preprocessor = mz.preprocessors.BasicPreprocessor(ngram_size=3)
>>> data = preprocessor.fit_transform(data)
>>> callback = Ngram(preprocessor=preprocessor, mode='index')
>>> dataset = mz.dataloader.Dataset(
...     data, callbacks=[callback])
>>> _ = dataset[0]

on_batch_unpacked(self, x, y): Insert ngram_left and ngram_right into x.
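As a rough illustration of what character n-grams in 'index' mode look like, the sketch below splits each word into overlapping trigrams and maps them to integer ids. The helper names `char_ngrams` and `index_ngrams`, the `#` boundary marker, and the on-the-fly vocabulary (with id 0 left free for padding) are all assumptions made for this example; in the callback itself the n-gram vocabulary comes from the fitted preprocessor.

```python
def char_ngrams(word, n=3, boundary="#"):
    """Split a word into overlapping character n-grams (boundary marker assumed)."""
    marked = f"{boundary}{word}{boundary}"
    if len(marked) <= n:
        return [marked]
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


def index_ngrams(words, n=3):
    """Map every n-gram of every word to an integer id (hypothetical vocabulary)."""
    vocab = {}
    indexed = []
    for word in words:
        ids = []
        for gram in char_ngrams(word, n):
            if gram not in vocab:
                vocab[gram] = len(vocab) + 1  # id 0 is kept free for padding
            ids.append(vocab[gram])
        indexed.append(ids)
    return indexed, vocab


trigrams = char_ngrams("zoo")
ngram_left, vocab = index_ngrams(["match", "zoo"])
```

Each word yields a list of ids of varying length, which is why n-gram data usually needs a second padding step per word.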

class matchzoo.dataloader.callbacks.BasicPadding(fixed_length_left: int = None, fixed_length_right: int = None, pad_word_value: typing.Union[int, str] = 0, pad_word_mode: str = 'pre', with_ngram: bool = False, fixed_ngram_length: int = None, pad_ngram_value: typing.Union[int, str] = 0, pad_ngram_mode: str = 'pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for the basic preprocessor.

Parameters

fixed_length_left – Integer. If set, text_left will be padded to this length.
fixed_length_right – Integer. If set, text_right will be padded to this length.
pad_word_value – The value used to pad the text.
pad_word_mode – String, pre or post: pad either before or after each sequence.
with_ngram – Boolean. Whether to pad the n-grams.
fixed_ngram_length – Integer. If set, each word will be padded to this length; otherwise the maximum word length in the current batch is used.
pad_ngram_value – The value used to fill empty n-grams.
pad_ngram_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x: dict, y: np.ndarray): Pad x['text_left'] and x['text_right'].
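The difference between 'pre' and 'post' padding is easiest to see on a small batch. The helpers below, `pad_sequence` and `pad_batch`, are hypothetical and written only for this illustration. Two behaviors are assumptions rather than documented facts: sequences longer than the target are truncated (keeping the tail for 'pre' and the head for 'post'), and when no fixed length is given the longest sequence in the batch sets the target.

```python
def pad_sequence(seq, length, pad_value=0, mode="pre"):
    """Pad (or truncate, by assumption) one sequence to exactly `length` items."""
    if mode not in ("pre", "post"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "post":
        kept = seq[:length]
        return kept + [pad_value] * (length - len(kept))
    kept = seq[-length:] if length > 0 else []
    return [pad_value] * (length - len(kept)) + kept


def pad_batch(batch, fixed_length=None, pad_value=0, mode="pre"):
    """Pad every sequence in a batch to a common length."""
    # Assumption: fall back to the longest sequence when no length is fixed.
    length = fixed_length if fixed_length is not None else max(len(s) for s in batch)
    return [pad_sequence(s, length, pad_value, mode) for s in batch]


batch = [[5, 6, 7], [8]]
pre_padded = pad_batch(batch, fixed_length=4, mode="pre")
post_padded = pad_batch(batch, fixed_length=4, mode="post")
batch_max = pad_batch(batch, mode="pre")
```

With 'pre' the pad values precede the tokens, so the real tokens sit at the end of each row; with 'post' they sit at the start.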

class matchzoo.dataloader.callbacks.DRMMPadding(fixed_length_left: int = None, fixed_length_right: int = None, pad_value: typing.Union[int, str] = 0, pad_mode: str = 'pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for the DRMM model.

Parameters

fixed_length_left – Integer. If set, text_left and match_histogram will be padded to this length.
fixed_length_right – Integer. If set, text_right will be padded to this length.
pad_value – The value used to pad the text.
pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x: dict, y: np.ndarray): Pad x['text_left'], x['text_right'] and x['match_histogram'].
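Because match_histogram holds one histogram row per left-side term, padding it to fixed_length_left means adding whole rows rather than single values. The sketch below shows that idea with a hypothetical helper, `pad_histogram`; it assumes the histogram is a list of equal-length rows, that padding rows are filled with the pad value, and that extra rows are truncated, none of which is confirmed by the text above.

```python
def pad_histogram(hist, fixed_length_left, bin_size, pad_value=0.0, mode="pre"):
    """Pad a (terms x bins) histogram to fixed_length_left rows (illustrative only)."""
    if fixed_length_left <= 0:
        raise ValueError("fixed_length_left must be positive")
    if mode not in ("pre", "post"):
        raise ValueError(f"unknown mode: {mode}")
    # Assumption: rows beyond the target length are dropped.
    kept = hist[-fixed_length_left:] if mode == "pre" else hist[:fixed_length_left]
    filler = [[pad_value] * bin_size for _ in range(fixed_length_left - len(kept))]
    return filler + kept if mode == "pre" else kept + filler


hist = [[1.0, 0.0], [0.0, 2.0]]
padded_pre = pad_histogram(hist, fixed_length_left=3, bin_size=2, mode="pre")
padded_post = pad_histogram(hist, fixed_length_left=3, bin_size=2, mode="post")
```

Keeping the number of histogram rows equal to the padded length of text_left keeps the two inputs aligned term by term.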

class matchzoo.dataloader.callbacks.BertPadding(fixed_length_left: int = None, fixed_length_right: int = None, pad_value: typing.Union[int, str] = 0, pad_mode: str = 'pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for the BERT preprocessor.

Parameters

fixed_length_left – Integer. If set, text_left will be padded to this length.
fixed_length_right – Integer. If set, text_right will be padded to this length.
pad_value – The value used to pad the text.
pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x: dict, y: np.ndarray): Pad x['text_left'] and x['text_right'].
