matchzoo.preprocessors.units.tokenize

Module Contents

class matchzoo.preprocessors.units.tokenize.Tokenize

Bases: matchzoo.preprocessors.units.unit.Unit

Process unit for text tokenization.

transform(self, input_: str)

Convert raw input text into a list of tokens.

Parameters: input_ – raw textual input.
Return tokens: the tokens extracted from the input, as a list.
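
The contract described above (a string in, a list of tokens out) can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not MatchZoo's implementation: the name SimpleTokenize and the regular-expression tokenization rule are assumptions made for illustration only, and the real unit may delegate to a dedicated tokenizer and split text differently.

```python
import re
from typing import List


class SimpleTokenize:
    """Hypothetical stand-in for a tokenize unit (not MatchZoo's code)."""

    # Assumption: a token is either a run of word characters or a single
    # character that is neither a word character nor whitespace
    # (for example, a punctuation mark).
    _pattern = re.compile(r"\w+|[^\w\s]")

    def transform(self, input_: str) -> List[str]:
        """Split raw text into a list of tokens."""
        return self._pattern.findall(input_)


tokens = SimpleTokenize().transform("Hello, world! It works.")
print(tokens)  # ['Hello', ',', 'world', '!', 'It', 'works', '.']
```

The sketch keeps the same method name and parameter name as the documented unit, so it shows the expected call shape without depending on any external tokenizer.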