matchzoo.models.cdssm
An implementation of CDSSM (CLSM) model.
Module Contents
Classes
- CDSSM Model implementation.
- Squeeze.
-
class matchzoo.models.cdssm.CDSSM(params: typing.Optional[ParamTable] = None)
Bases: matchzoo.engine.base_model.BaseModel
CDSSM Model implementation.
References:
Learning Semantic Representations Using Convolutional Neural Networks for Web Search. (2014a)
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval. (2014b)
Examples
>>> import matchzoo as mz
>>> from matchzoo.models.cdssm import CDSSM
>>> model = CDSSM()
>>> model.params['task'] = mz.tasks.Ranking()
>>> model.params['vocab_size'] = 4
>>> model.params['filters'] = 32
>>> model.params['kernel_size'] = 3
>>> model.params['conv_activation_func'] = 'relu'
>>> model.build()
-
classmethod get_default_params(cls) → ParamTable
- Returns
Model default parameters.
-
classmethod get_default_preprocessor(cls, truncated_mode: str = 'pre', truncated_length_left: typing.Optional[int] = None, truncated_length_right: typing.Optional[int] = None, filter_mode: str = 'df', filter_low_freq: float = 1, filter_high_freq: float = float('inf'), remove_stop_words: bool = False, ngram_size: typing.Optional[int] = 3) → BasePreprocessor
Model default preprocessor.
The preprocessor’s transform should produce a correctly shaped data pack that can be used for training.
- Returns
Default preprocessor.
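The default `ngram_size` of 3 corresponds to the letter-trigram word hashing that CDSSM relies on. As a rough, standalone illustration (this is not the preprocessor's own code; the helper name `letter_ngrams` and the `#` boundary marker are assumptions that follow the convention described in the paper), a word could be split into letter n-grams like this:

```python
def letter_ngrams(word: str, n: int = 3, boundary: str = "#") -> list:
    """Split a word into overlapping letter n-grams.

    The word is wrapped in boundary markers first, so 'good' becomes
    '#good#' before the sliding window is applied.
    """
    marked = f"{boundary}{word}{boundary}"
    if len(marked) < n:
        # Too short for a full window: return the marked word as one unit.
        return [marked]
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


trigrams = letter_ngrams("good")
print(trigrams)  # ['#go', 'goo', 'ood', 'od#']
```

Each word then becomes a bag of such n-grams, which keeps the input vocabulary bounded even when the word vocabulary is very large.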
-
classmethod get_default_padding_callback(cls, fixed_length_left: int = None, fixed_length_right: int = None, pad_word_value: typing.Union[int, str] = 0, pad_word_mode: str = 'pre', with_ngram: bool = True, fixed_ngram_length: int = None, pad_ngram_value: typing.Union[int, str] = 0, pad_ngram_mode: str = 'pre') → BaseCallback
Model default padding callback.
The padding callback’s on_batch_unpacked pads a batch of data to a fixed length.
- Returns
Default padding callback.
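To make the `'pre'` padding mode concrete, here is a minimal sketch of padding a single sequence to a fixed length. It is not the callback's implementation: `pad_sequence` is a hypothetical helper, and it assumes that `'pre'` places the pad values before the tokens and that an over-long sequence keeps its last tokens in that mode.

```python
def pad_sequence(seq, fixed_length, pad_value=0, mode="pre"):
    """Pad (or cut) a list of token ids to exactly fixed_length items."""
    if mode not in ("pre", "post"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "pre":
        # Assumption: 'pre' keeps the tail of a long sequence and
        # puts the padding in front of the kept tokens.
        kept = seq[-fixed_length:] if fixed_length > 0 else []
        return [pad_value] * (fixed_length - len(kept)) + kept
    # 'post' keeps the head and appends the padding.
    kept = seq[:fixed_length]
    return kept + [pad_value] * (fixed_length - len(kept))


batch = [[5, 8, 2], [7], [1, 2, 3, 4, 5, 6]]
padded = [pad_sequence(s, fixed_length=4) for s in batch]
print(padded)  # [[0, 5, 8, 2], [0, 0, 0, 7], [3, 4, 5, 6]]
```

After padding, every row in the batch has the same length, which is what allows the batch to be stacked into a single tensor.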
-
_create_base_network(self) → nn.Module
Apply convolution and max-pooling operations to each letter n-gram.
The input shape is fixed_text_length * number of letter n-grams. As described in the paper, n is 3, and the authors observed about 30,000 distinct letter-trigrams.
- Returns
An nn.Module of the CDSSM network, tensor in, tensor out.
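The convolution-then-max-pooling step can be illustrated without any deep-learning library. The sketch below is a toy under stated assumptions, not the `nn.Module` this method builds: `conv1d` and `max_over_time` are hypothetical helpers, the filters have no bias or activation, and the input is a short list of 2-dimensional word vectors rather than roughly 30,000-dimensional letter-trigram vectors.

```python
def conv1d(sequence, filters, kernel_size):
    """Slide a window over the word axis and apply each filter.

    sequence: list of word vectors, shape (length, dim)
    filters:  list of flat weight lists, each of size kernel_size * dim
    returns:  feature vectors, shape (length - kernel_size + 1, n_filters)
    """
    outputs = []
    for start in range(len(sequence) - kernel_size + 1):
        # Concatenate the word vectors inside the current window.
        window = [x for vec in sequence[start:start + kernel_size] for x in vec]
        outputs.append([sum(w * x for w, x in zip(f, window)) for f in filters])
    return outputs


def max_over_time(features):
    """Keep, for each filter, its largest response across all positions."""
    return [max(column) for column in zip(*features)]


# Four words, each a 2-dimensional vector (toy values).
sequence = [[1, 0], [0, 2], [3, 1], [0, 0]]
# Two filters over windows of 2 words (2 * 2 = 4 weights each).
filters = [[1, 0, 0, 1], [0, 1, 1, 0]]

features = conv1d(sequence, filters, kernel_size=2)
pooled = max_over_time(features)
print(features)  # [[3, 0], [1, 5], [3, 1]]
print(pooled)    # [3, 5]
```

The pooled vector has one entry per filter regardless of how many words the input contains, which is why pooling yields a fixed-size representation for variable-length text.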
-
build(self)
Build model structure.
CDSSM uses a Siamese architecture.
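In a Siamese setup, one encoder with a single set of weights processes both inputs, and the two resulting vectors are then compared. A minimal sketch of that idea follows. The `encode` function is a stand-in (a fixed linear projection with made-up weights), and the use of cosine similarity for the comparison follows the CDSSM paper; neither has been checked against what this module's `build` and `forward` actually do.

```python
import math

# Shared weights: the same projection is used for both sides.
WEIGHTS = [[1.0, 0.0, 1.0],
           [0.0, 1.0, 1.0]]


def encode(vector):
    """Project a 3-d input to a 2-d representation with the shared weights."""
    return [sum(w * x for w, x in zip(row, vector)) for row in WEIGHTS]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def siamese_score(left, right):
    # Both inputs pass through the *same* encoder before comparison.
    return cosine(encode(left), encode(right))


score_same = siamese_score([1, 2, 3], [2, 4, 6])
score_diff = siamese_score([1, 0, 0], [0, 1, 0])
print(round(score_same, 6), round(score_diff, 6))
```

Because the weights are shared, inputs that are proportional to each other map to parallel vectors and score close to 1, while the second pair maps to orthogonal vectors and scores 0.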
-
forward(self, inputs)
Forward.
-
guess_and_fill_missing_params(self, verbose: int = 1)
Guess and fill missing parameters in params.
Use this method to automatically fill in hyperparameters. This involves some guessing, so the parameters it fills in could be wrong. For example, the default task is Ranking; if it is not manually set to Classification for data packs prepared for classification, the shape of the model output will not match the data.
- Parameters
verbose – Verbosity.
-
classmethod