matchzoo.dataloader.callbacks.padding

Module Contents

Classes

BasicPadding

Pad data for basic preprocessor.

DRMMPadding

Pad data for DRMM Model.

BertPadding

Pad data for bert preprocessor.

Functions

_infer_dtype(value)

Infer the dtype for the features.

_padding_2D(input, output, mode: str = 'pre')

Pad the input 2D-tensor to the output 2D-tensor.

_padding_3D(input, output, mode: str = 'pre')

Pad the input 3D-tensor to the output 3D-tensor.

matchzoo.dataloader.callbacks.padding._infer_dtype(value)

Infer the dtype for the features.

This is required because the input is usually an array of objects before padding.
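To illustrate why this inference is needed: a batch of variable-length sequences is typically held in an object array, whose own dtype says nothing about the elements inside. The sketch below uses a hypothetical helper, `infer_dtype_sketch`, which is not part of matchzoo. It descends to the first scalar and reports its NumPy dtype; the library's actual rules may differ.

```python
import numpy as np


def infer_dtype_sketch(value):
    """Hypothetical helper: find the dtype of the first scalar in nested data."""
    while True:
        if isinstance(value, np.ndarray):
            if value.ndim == 0 or value.size == 0:
                break
        elif isinstance(value, (list, tuple)):
            if len(value) == 0:
                break
        else:
            break  # reached a scalar
        value = value[0]
    return np.array(value).dtype


# A ragged batch has to be stored with dtype=object.
batch = np.array([[1, 2, 3], [4, 5]], dtype=object)
print(batch.dtype)                # object
print(infer_dtype_sketch(batch))  # an integer dtype (width is platform dependent)
```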

matchzoo.dataloader.callbacks.padding._padding_2D(input, output, mode: str = 'pre')

Pad the input 2D-tensor to the output 2D-tensor.

Parameters
  • input – The input 2D tensor containing the original values.

  • output – The output 2D tensor, already shaped and filled with the pad value.

  • mode – The padding mode, which can be 'pre' or 'post'.
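The parameters above can be illustrated with a minimal NumPy sketch. The function `pad_2d_sketch` is hypothetical and is not the library's implementation. It assumes that `output` is pre-filled with the pad value and written in place, that 'pre' puts padding before the values and 'post' after them, and that over-long rows are truncated (keeping the last values for 'pre' and the first values for 'post'). The truncation rule in particular is an assumption.

```python
import numpy as np


def pad_2d_sketch(rows, output, mode="pre"):
    """Copy ragged rows into a pre-filled 2D array, in place."""
    if mode not in ("pre", "post"):
        raise ValueError(f"mode must be 'pre' or 'post', got {mode!r}")
    width = output.shape[1]
    for i, row in enumerate(rows[: output.shape[0]]):
        n = min(len(row), width)
        if n == 0:
            continue  # nothing to copy; the row stays all padding
        if mode == "post":
            output[i, :n] = row[:n]    # values first, padding after
        else:
            output[i, -n:] = row[-n:]  # padding first, values last
    return output


rows = [[1, 2, 3], [4]]
pre = pad_2d_sketch(rows, np.full((2, 4), 0), mode="pre")
post = pad_2d_sketch(rows, np.full((2, 4), 0), mode="post")
print(pre)   # [[0 1 2 3], [0 0 0 4]]
print(post)  # [[1 2 3 0], [4 0 0 0]]
```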

matchzoo.dataloader.callbacks.padding._padding_3D(input, output, mode: str = 'pre')

Pad the input 3D-tensor to the output 3D-tensor.

Parameters
  • input – The input 3D tensor containing the original values.

  • output – The output 3D tensor, already shaped and filled with the pad value.

  • mode – The padding mode, which can be 'pre' or 'post'.
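A 3D case arises, for example, when each word is itself a variable-length list of n-gram ids. The sketch below is a hypothetical illustration (`pad_3d_sketch` is not the library's code). It assumes that `mode` controls only the innermost axis and that words are written from index 0 along the middle axis; this section does not state how the library aligns the middle axis.

```python
import numpy as np


def pad_3d_sketch(batch, output, mode="pre"):
    """Copy ragged [sample][word][ngram] data into a pre-filled 3D array."""
    if mode not in ("pre", "post"):
        raise ValueError(f"mode must be 'pre' or 'post', got {mode!r}")
    n_words, width = output.shape[1], output.shape[2]
    for i, words in enumerate(batch[: output.shape[0]]):
        for j, ngrams in enumerate(words[:n_words]):
            n = min(len(ngrams), width)
            if n == 0:
                continue  # empty word: leave the slot as padding
            if mode == "post":
                output[i, j, :n] = ngrams[:n]
            else:
                output[i, j, -n:] = ngrams[-n:]
    return output


batch = [[[1, 2], [3]], [[4]]]  # 2 samples, up to 2 words, up to 2 n-grams
padded = pad_3d_sketch(batch, np.zeros((2, 2, 3), dtype=int), mode="pre")
print(padded[0])  # [[0 1 2], [0 0 3]]
print(padded[1])  # [[0 0 4], [0 0 0]]
```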

class matchzoo.dataloader.callbacks.padding.BasicPadding(fixed_length_left: int = None, fixed_length_right: int = None, pad_word_value: typing.Union[int, str] = 0, pad_word_mode: str = 'pre', with_ngram: bool = False, fixed_ngram_length: int = None, pad_ngram_value: typing.Union[int, str] = 0, pad_ngram_mode: str = 'pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for basic preprocessor.

Parameters
  • fixed_length_left – Integer. If set, text_left will be padded to this length.

  • fixed_length_right – Integer. If set, text_right will be padded to this length.

  • pad_word_value – The value used to fill padded positions in the text.

  • pad_word_mode – String, pre or post: pad either before or after each sequence.

  • with_ngram – Boolean. Whether to pad the n-grams.

  • fixed_ngram_length – Integer. If set, each word will be padded to this length; otherwise, the maximum word length in the current batch is used.

  • pad_ngram_value – The value used to fill empty n-grams.

  • pad_ngram_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x: dict, y: np.ndarray)

Pad x['text_left'] and x['text_right'].
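The effect of this callback on a batch can be sketched without importing matchzoo. Both functions below are hypothetical stand-ins, not the library's code. The sketch assumes that when a fixed length is not given, the longest sequence in the batch sets the width; this section states that fallback only for fixed_ngram_length, so treat it as an assumption for the text fields.

```python
import numpy as np


def pad_field(rows, fixed_length=None, pad_value=0, mode="pre"):
    """Hypothetical helper: pad a list of sequences to one common length."""
    if fixed_length is None:
        # Assumed fallback: use the longest sequence in the batch.
        fixed_length = max((len(r) for r in rows), default=0)
    out = np.full((len(rows), fixed_length), pad_value)
    for i, row in enumerate(rows):
        n = min(len(row), fixed_length)
        if n:
            if mode == "post":
                out[i, :n] = row[:n]
            else:
                out[i, -n:] = row[-n:]
    return out


def on_batch_unpacked_sketch(x, fixed_length_left=None, fixed_length_right=None,
                             pad_word_value=0, pad_word_mode="pre"):
    """Pad x['text_left'] and x['text_right'] in place and return x."""
    x["text_left"] = pad_field(
        x["text_left"], fixed_length_left, pad_word_value, pad_word_mode)
    x["text_right"] = pad_field(
        x["text_right"], fixed_length_right, pad_word_value, pad_word_mode)
    return x


x = {"text_left": [[1, 2], [3]], "text_right": [[4, 5, 6], [7]]}
x = on_batch_unpacked_sketch(x, fixed_length_left=3)
print(x["text_left"])   # [[0 1 2], [0 0 3]]
print(x["text_right"])  # [[4 5 6], [0 0 7]]
```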

class matchzoo.dataloader.callbacks.padding.DRMMPadding(fixed_length_left: int = None, fixed_length_right: int = None, pad_value: typing.Union[int, str] = 0, pad_mode: str = 'pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for DRMM Model.

Parameters
  • fixed_length_left – Integer. If set, text_left and match_histogram will be padded to this length.

  • fixed_length_right – Integer. If set, text_right will be padded to this length.

  • pad_value – The value used to fill padded positions in the text.

  • pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x: dict, y: np.ndarray)

Padding.

Pad x['text_left'], x['text_right'] and x['match_histogram'].
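Unlike the text fields, a match histogram has one row of bins per left-side term, so padding it to fixed_length_left means adding whole rows. The following sketch is hypothetical (`pad_histograms_sketch` is not a matchzoo function). It assumes that every sample has at least one term, that all histograms share the same number of bins, and that existing rows are placed at the start of the term axis; this section does not state how the library aligns them.

```python
import numpy as np


def pad_histograms_sketch(histograms, fixed_length_left, pad_value=0.0):
    """Stack per-sample (n_terms, n_bins) histograms into one 3D array."""
    n_bins = np.asarray(histograms[0]).shape[1]
    out = np.full((len(histograms), fixed_length_left, n_bins),
                  pad_value, dtype=np.float32)
    for i, hist in enumerate(histograms):
        hist = np.asarray(hist, dtype=np.float32)
        n = min(hist.shape[0], fixed_length_left)
        out[i, :n] = hist[:n]  # remaining rows keep the pad value
    return out


# Two samples with 2 and 1 left-side terms, 3 bins each.
hists = [np.ones((2, 3)), np.ones((1, 3))]
padded_hist = pad_histograms_sketch(hists, fixed_length_left=4)
print(padded_hist.shape)  # (2, 4, 3)
```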

class matchzoo.dataloader.callbacks.padding.BertPadding(fixed_length_left: int = None, fixed_length_right: int = None, pad_value: typing.Union[int, str] = 0, pad_mode: str = 'pre')

Bases: matchzoo.engine.base_callback.BaseCallback

Pad data for bert preprocessor.

Parameters
  • fixed_length_left – Integer. If set, text_left will be padded to this length.

  • fixed_length_right – Integer. If set, text_right will be padded to this length.

  • pad_value – The value used to fill padded positions in the text.

  • pad_mode – String, pre or post: pad either before or after each sequence.

on_batch_unpacked(self, x: dict, y: np.ndarray)

Pad x['text_left'] and x['text_right'].