matchzoo.preprocessors.build_vocab_unit

Module Contents

matchzoo.preprocessors.build_vocab_unit.build_vocab_unit(data_pack:DataPack, mode:str='both', verbose:int=1) → Vocabulary

Build a preprocessors.units.Vocabulary given data_pack.

The data_pack should be preprocessed beforehand, and each item in the text_left and text_right columns of the data_pack should be a list of tokens.

Parameters:
  • data_pack – The DataPack to build vocabulary upon.
  • mode – One of ‘left’, ‘right’, and ‘both’, to determine the source data for building the VocabularyUnit.
  • verbose – Verbosity.

Returns:
  A built vocabulary unit.
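The effect of the mode parameter can be illustrated with a minimal pure-Python sketch. This is not MatchZoo's implementation: the helper name sketch_build_vocab, the plain-list inputs standing in for the text_left and text_right columns, and the zero-based index assignment are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def sketch_build_vocab(
    left_texts: Sequence[List[str]],
    right_texts: Sequence[List[str]],
    mode: str = 'both',
) -> Dict[str, int]:
    """Hypothetical stand-in: map each distinct token to an integer index.

    `left_texts` and `right_texts` play the role of the already-tokenized
    text_left and text_right columns of a data pack.
    """
    if mode not in ('left', 'right', 'both'):
        raise ValueError(f"mode must be 'left', 'right' or 'both', got {mode!r}")

    # Select which column(s) contribute tokens, depending on mode.
    sources: List[List[str]] = []
    if mode in ('left', 'both'):
        sources.extend(left_texts)
    if mode in ('right', 'both'):
        sources.extend(right_texts)

    # Assign indices in order of first appearance (an assumption of this sketch).
    term_index: Dict[str, int] = {}
    for tokens in sources:
        for token in tokens:
            if token not in term_index:
                term_index[token] = len(term_index)
    return term_index


# Each item is a list of tokens, as the function description requires.
left = [['how', 'are', 'you'], ['what', 'time']]
right = [['fine', 'thank', 'you'], ['noon']]

vocab_both = sketch_build_vocab(left, right, mode='both')
vocab_left = sketch_build_vocab(left, right, mode='left')
```

With mode='left', only tokens from the left column end up in the mapping, so 'fine' and 'noon' are absent. With MatchZoo installed, the corresponding call would follow the signature above, for example build_vocab_unit(data_pack, mode='left'), where data_pack is an already-preprocessed DataPack.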