matchzoo.preprocessors.units.stop_removal

Module Contents

Classes

StopRemoval

Process unit to remove stop words.

class matchzoo.preprocessors.units.stop_removal.StopRemoval(lang: str = 'english')

Bases: matchzoo.preprocessors.units.unit.Unit

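Process unit that removes stop words from a list of tokens. The lang argument selects the language of the stopword list and defaults to 'english'.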

Example

>>> unit = StopRemoval()
>>> unit.transform(['a', 'the', 'test'])
['test']
>>> type(unit.stopwords)
<class 'list'>

transform(self, input_: list) → list

Remove stopwords from a list of tokens.

Parameters
  • input_ – list of tokens to filter.

The stopword language is not an argument of this method; it is set by the lang argument of the constructor.

Returns

list of tokens with the stopwords removed.
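
The filtering that transform describes can be sketched without the library as follows. This is a minimal illustration, not the matchzoo implementation: StopRemovalSketch and SAMPLE_STOPWORDS are hypothetical names, and the five-word list is a stand-in for the real per-language stopword list.

```python
# Hypothetical stand-in for illustration; not the matchzoo implementation.
# A tiny hard-coded list replaces the library's real per-language stopwords.
SAMPLE_STOPWORDS = {
    'english': ['a', 'an', 'the', 'is', 'of'],
}


class StopRemovalSketch:
    """Drop tokens that appear in the stopword list for one language."""

    def __init__(self, lang: str = 'english'):
        # The language is fixed at construction time, not per transform call.
        self._stop = list(SAMPLE_STOPWORDS[lang])

    def transform(self, input_: list) -> list:
        # Keep every token that is not a stopword, preserving input order.
        return [token for token in input_ if token not in self._stop]

    @property
    def stopwords(self) -> list:
        # Expose the stopword list that transform filters against.
        return self._stop


unit = StopRemovalSketch()
print(unit.transform(['a', 'the', 'test']))  # ['test']
```

Matching in this sketch is exact, so a token such as 'The' would pass through unless the tokens are lowercased first.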

property stopwords(self) → list

Get the stopwords for the language passed to the constructor as lang. The property takes no arguments.

Returns

list of stop words.
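
Because the property returns a plain list, code that reuses it for its own filtering may prefer to convert it to a set first, since membership tests on a set do not scan the whole list. The sketch below assumes only that the stopwords are available as a list of strings; remove_stopwords is a hypothetical helper, and the two-word list stands in for the value of unit.stopwords.

```python
def remove_stopwords(tokens: list, stopwords: list) -> list:
    """Filter tokens against a stopword list using a set for fast lookups."""
    stop_set = set(stopwords)  # built once, reused for every token
    return [token for token in tokens if token not in stop_set]


# Stand-in for the list returned by the stopwords property (assumption).
stopwords = ['a', 'the']
print(remove_stopwords(['a', 'the', 'test'], stopwords))  # ['test']
```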