matchzoo.dataloader.callbacks

Package Contents

class matchzoo.dataloader.callbacks.LambdaCallback(on_batch_data_pack=None, on_batch_unpacked=None)

Bases: matchzoo.engine.base_callback.BaseCallback

LambdaCallback. A shorthand for creating a callback from plain functions instead of writing a callback class.

See matchzoo.engine.base_callback.BaseCallback for more details.

Example

>>> import matchzoo as mz
>>> from matchzoo.dataloader.callbacks import LambdaCallback
>>> data = mz.datasets.toy.load_data()
>>> batch_func = lambda x: print(type(x))
>>> unpack_func = lambda x, y: print(type(x), type(y))
>>> callback = LambdaCallback(on_batch_data_pack=batch_func,
...                           on_batch_unpacked=unpack_func)
>>> dataset = mz.dataloader.Dataset(
...     data, callbacks=[callback])
>>> _ = dataset[0]
<class 'matchzoo.data_pack.data_pack.DataPack'>
<class 'dict'> <class 'numpy.ndarray'>

on_batch_data_pack(self, data_pack)

Call the user-supplied on_batch_data_pack function, if any, with the batch's DataPack.

on_batch_unpacked(self, x, y)

Call the user-supplied on_batch_unpacked function, if any, with the unpacked batch (x, y).
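
The shorthand can be pictured with a minimal sketch. `SketchLambdaCallback` below is a hypothetical stand-in written for illustration, not MatchZoo's implementation; it assumes the callback simply stores the two functions and forwards each hook to them when they are set.

```python
class SketchLambdaCallback:
    """Hypothetical stand-in: forwards each hook to an optional function."""

    def __init__(self, on_batch_data_pack=None, on_batch_unpacked=None):
        self._on_batch_data_pack = on_batch_data_pack
        self._on_batch_unpacked = on_batch_unpacked

    def on_batch_data_pack(self, data_pack):
        # Do nothing when no function was supplied for this hook.
        if self._on_batch_data_pack is not None:
            self._on_batch_data_pack(data_pack)

    def on_batch_unpacked(self, x, y):
        if self._on_batch_unpacked is not None:
            self._on_batch_unpacked(x, y)


calls = []
callback = SketchLambdaCallback(
    on_batch_unpacked=lambda x, y: calls.append(sorted(x)))
callback.on_batch_data_pack(object())  # no function supplied: nothing happens
callback.on_batch_unpacked({"text_left": [1], "text_right": [2]}, [0])
print(calls)  # [['text_left', 'text_right']]
```

Leaving a hook as None makes it a no-op, which is why only the hooks of interest need to be passed.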

class matchzoo.dataloader.callbacks.DynamicPooling(fixed_length_left:int, fixed_length_right:int, compress_ratio_left:float=1, compress_ratio_right:float=1)

Bases: matchzoo.engine.base_callback.BaseCallback

DynamicPooling constructor.

Parameters:
  • fixed_length_left – max length of left text.
  • fixed_length_right – max length of right text.
  • compress_ratio_left – the length change ratio of the left text, especially after normal pooling layers.
  • compress_ratio_right – the length change ratio of the right text, especially after normal pooling layers.

on_batch_unpacked(self, x, y)

Insert dpool_index into x.

Parameters:
  • x – unpacked x.
  • y – unpacked y.
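
As a rough illustration of what a dynamic pooling index can look like along one axis, the sketch below maps each slot of a fixed-length grid back to a position in the actual text. The function name `dpool_positions` is hypothetical, and both the stride formula and the way the compress ratio is applied are assumptions made for this sketch rather than a description of the library's code.

```python
def dpool_positions(actual_length, fixed_length, compress_ratio=1):
    """Hypothetical helper: for each target slot, pick a source position.

    Assumption: both lengths are first divided by the compress ratio, and
    source positions are repeated evenly to fill the fixed-length grid.
    """
    target = int(fixed_length // compress_ratio)
    source = int(actual_length // compress_ratio)
    # With an empty source, every slot falls back to position 0.
    stride = target / source if source > 0 else float(target)
    return [int(i / stride) for i in range(target)]


print(dpool_positions(3, 6))  # [0, 0, 1, 1, 2, 2]
```

A two-dimensional index for a left/right matching matrix could be formed by combining the positions computed for each side.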

class matchzoo.dataloader.callbacks.Histogram(embedding_matrix:np.ndarray, bin_size:int=30, hist_mode:str='CH')

Bases: matchzoo.engine.base_callback.BaseCallback

Generate data with matching histogram.

Parameters:
  • embedding_matrix – The embedding matrix used to generate the matching histogram.
  • bin_size – The number of bins in the histogram.
  • hist_mode – The mode of the MatchingHistogramUnit, one of CH, NH, or LCH.

on_batch_unpacked(self, x, y)

Insert match_histogram into x.
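
To make the three modes concrete, here is a minimal NumPy sketch of a matching histogram. It is not the MatchingHistogramUnit itself: the function name `matching_histogram` is hypothetical, and the sketch assumes cosine similarity between term embeddings, equal-width bins over [-1, 1], CH as raw counts, NH as counts normalized per left term, and LCH as log(1 + count). The library may differ in any of these details.

```python
import numpy as np


def matching_histogram(embedding_matrix, left_ids, right_ids,
                       bin_size=30, hist_mode="CH"):
    """Hypothetical sketch: one similarity histogram per left term."""
    if hist_mode not in ("CH", "NH", "LCH"):
        raise ValueError("hist_mode must be one of CH, NH, LCH")
    left = embedding_matrix[left_ids]
    right = embedding_matrix[right_ids]
    # Cosine similarity; assumes no all-zero embedding rows.
    left = left / np.linalg.norm(left, axis=1, keepdims=True)
    right = right / np.linalg.norm(right, axis=1, keepdims=True)
    sims = np.clip(left @ right.T, -1.0, 1.0)
    # Map similarities in [-1, 1] to bin indices in [0, bin_size - 1].
    bins = ((sims + 1.0) / 2.0 * (bin_size - 1)).astype(int)
    hist = np.zeros((len(left_ids), bin_size))
    for i in range(len(left_ids)):
        hist[i] = np.bincount(bins[i], minlength=bin_size)
    if hist_mode == "NH":
        hist = hist / hist.sum(axis=1, keepdims=True)
    elif hist_mode == "LCH":
        hist = np.log1p(hist)
    return hist


# Toy embeddings: term 0 matches itself, is orthogonal to 1, opposite to 2.
toy = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
print(matching_histogram(toy, [0], [0, 1, 2], bin_size=5))  # [[1. 0. 1. 0. 1.]]
```

Each row of the result has bin_size entries, so the histogram for a whole left text has one row per left term.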

class matchzoo.dataloader.callbacks.BasicPadding(fixed_length_left:int=None, fixed_length_right:int=None, pad_value:typing.Union[int, str]=0, pad_mode:str='pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for basic preprocessor.

Parameters:
  • fixed_length_left – Integer. If set, text_left will be padded to this length.
  • fixed_length_right – Integer. If set, text_right will be padded to this length.
  • pad_value – the value to fill text.
  • pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x:dict, y:np.ndarray)

Pad x['text_left'] and x['text_right'].
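
The difference between the pre and post modes is easiest to see on a small batch. The sketch below is a standalone NumPy illustration, not the callback's code; the helper name `pad_batch` is hypothetical. It assumes that an unset fixed length falls back to the longest sequence in the batch, and that over-long sequences are truncated (keeping the last tokens for pre and the first tokens for post).

```python
import numpy as np


def pad_batch(sequences, fixed_length=None, pad_value=0, pad_mode="pre"):
    """Hypothetical helper: pad a batch of id lists to one common length."""
    if pad_mode not in ("pre", "post"):
        raise ValueError("pad_mode must be 'pre' or 'post'")
    if fixed_length is None:
        # Assumption: with no fixed length, use the longest sequence.
        fixed_length = max(len(seq) for seq in sequences)
    out = np.full((len(sequences), fixed_length), pad_value)
    for i, seq in enumerate(sequences):
        n = min(len(seq), fixed_length)
        if n == 0:
            continue
        if pad_mode == "pre":
            out[i, -n:] = seq[-n:]  # pad values go in front
        else:
            out[i, :n] = seq[:n]    # pad values go behind
    return out


batch = [[1, 2, 3], [4]]
print(pad_batch(batch, fixed_length=4, pad_mode="pre").tolist())
# [[0, 1, 2, 3], [0, 0, 0, 4]]
print(pad_batch(batch, fixed_length=4, pad_mode="post").tolist())
# [[1, 2, 3, 0], [4, 0, 0, 0]]
```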

class matchzoo.dataloader.callbacks.DRMMPadding(fixed_length_left:int=None, fixed_length_right:int=None, pad_value:typing.Union[int, str]=0, pad_mode:str='pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for the DRMM model.

Parameters:
  • fixed_length_left – Integer. If set, text_left and match_histogram will be padded to this length.
  • fixed_length_right – Integer. If set, text_right will be padded to this length.
  • pad_value – the value to fill text.
  • pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x:dict, y:np.ndarray)

Padding.

Pad x['text_left'], x['text_right'] and x['match_histogram'].
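
Padding match_histogram differs from padding text because each left term carries a whole histogram row. The sketch below illustrates that idea only; `pad_histograms` is a hypothetical helper, and it assumes each histogram has shape (left length, bin size) and is padded along the left-length axis.

```python
import numpy as np


def pad_histograms(histograms, fixed_length_left, pad_value=0.0,
                   pad_mode="pre"):
    """Hypothetical helper: pad per-example histograms along the term axis."""
    if pad_mode not in ("pre", "post"):
        raise ValueError("pad_mode must be 'pre' or 'post'")
    # Assumption: the first example is non-empty and fixes the bin size.
    bin_size = len(histograms[0][0])
    out = np.full((len(histograms), fixed_length_left, bin_size),
                  pad_value, dtype=float)
    for i, hist in enumerate(histograms):
        hist = np.asarray(hist, dtype=float)
        n = min(len(hist), fixed_length_left)
        if n == 0:
            continue
        if pad_mode == "pre":
            out[i, -n:] = hist[-n:]
        else:
            out[i, :n] = hist[:n]
    return out


hists = [[[1, 2]], [[3, 4], [5, 6]]]  # 1 and 2 left terms, bin size 2
print(pad_histograms(hists, fixed_length_left=2).shape)  # (2, 2, 2)
```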

class matchzoo.dataloader.callbacks.CDSSMPadding(fixed_length_left:int=None, fixed_length_right:int=None, pad_value:typing.Union[int, str]=0, pad_mode:str='pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for the CDSSM preprocessor.

Parameters:
  • fixed_length_left – Integer. If set, text_left will be padded to this length.
  • fixed_length_right – Integer. If set, text_right will be padded to this length.
  • pad_value – the value to fill text.
  • pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x:dict, y:np.ndarray)

Pad x['text_left'] and x['text_right'].

class matchzoo.dataloader.callbacks.DIINPadding(fixed_length_left:int=None, fixed_length_right:int=None, fixed_length_word:int=None, pad_value:typing.Union[int, str]=0, pad_mode:str='pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for the DIIN preprocessor.

Parameters:
  • fixed_length_left – Integer. If set, text_left, char_left and match_left will be padded to this length.
  • fixed_length_right – Integer. If set, text_right, char_right and match_right will be padded to this length.
  • fixed_length_word – Integer. If set, words in char_left and char_right will be padded to this length.
  • pad_value – the value to fill text.
  • pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x:dict, y:np.ndarray)

Padding.

Pad x['text_left'], x['text_right'],
x['char_left'], x['char_right'], x['match_left'] and x['match_right'].
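
The role of fixed_length_word can be shown with a two-level padding sketch: words are padded to the text length, and the characters of each word are padded to the word length. `pad_chars` is a hypothetical helper for a single text, and applying the same pad mode at both levels is an assumption of this sketch.

```python
import numpy as np


def pad_chars(words, fixed_length, fixed_length_word, pad_value=0,
              pad_mode="pre"):
    """Hypothetical helper: pad a list of per-word character id lists."""
    if pad_mode not in ("pre", "post"):
        raise ValueError("pad_mode must be 'pre' or 'post'")
    out = np.full((fixed_length, fixed_length_word), pad_value)
    n_words = min(len(words), fixed_length)
    if n_words == 0:
        return out
    # Word level: choose which words survive and where the first one lands.
    kept = words[-n_words:] if pad_mode == "pre" else words[:n_words]
    offset = fixed_length - n_words if pad_mode == "pre" else 0
    for row, chars in enumerate(kept, start=offset):
        # Character level: pad or truncate each word the same way.
        n = min(len(chars), fixed_length_word)
        if n == 0:
            continue
        if pad_mode == "pre":
            out[row, -n:] = chars[-n:]
        else:
            out[row, :n] = chars[:n]
    return out


words = [[1, 2, 3], [4]]  # two words, as character ids
print(pad_chars(words, 3, 2, pad_mode="post").tolist())
# [[1, 2], [4, 0], [0, 0]]
```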

class matchzoo.dataloader.callbacks.BertPadding(fixed_length_left:int=None, fixed_length_right:int=None, pad_value:typing.Union[int, str]=0, pad_mode:str='pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for the BERT preprocessor.

Parameters:
  • fixed_length_left – Integer. If set, text_left will be padded to this length.
  • fixed_length_right – Integer. If set, text_right will be padded to this length.
  • pad_value – the value to fill text.
  • pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x:dict, y:np.ndarray)

Pad x['text_left'] and x['text_right'].