matchzoo.data_pack.pack

Convert input data into the format expected by DataPack.

Module Contents

matchzoo.data_pack.pack.pack(df: pd.DataFrame, task: typing.Union[str, BaseTask] = 'ranking') → 'matchzoo.DataPack'

Pack a DataPack using df.

The df must have text_left and text_right columns. Optionally, it can also have id_left and id_right columns, which index text_left and text_right respectively. If id_left or id_right is not specified, it is generated automatically.

Parameters:
  • df – Input pandas.DataFrame to use.
  • task – Either the string 'ranking' or 'classification', or a matchzoo.engine.BaseTask instance. Defaults to 'ranking'.
Examples:
>>> import matchzoo as mz
>>> import pandas as pd
>>> df = pd.DataFrame(data={'text_left': list('AABC'),
...                         'text_right': list('abbc'),
...                         'label': [0, 1, 1, 0]})
>>> mz.pack(df, task='classification').frame()
  id_left text_left id_right text_right  label
0     L-0         A      R-0          a      0
1     L-0         A      R-1          b      1
2     L-1         B      R-1          b      1
3     L-2         C      R-2          c      0
>>> mz.pack(df, task='ranking').frame()
  id_left text_left id_right text_right  label
0     L-0         A      R-0          a    0.0
1     L-0         A      R-1          b    1.0
2     L-1         B      R-1          b    1.0
3     L-2         C      R-2          c    0.0
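
The automatically generated IDs in the output above are consistent with a scheme that assigns one ID per distinct text, in order of first appearance, so repeated texts share an ID. The following is a minimal sketch of such a scheme in plain pandas. Note that gen_ids_sketch is a hypothetical helper written for illustration; it is an assumption about the behaviour, not the library's own _gen_ids, whose body is not shown on this page.

```python
import pandas as pd


def gen_ids_sketch(data: pd.DataFrame, col: str, prefix: str) -> pd.Series:
    """Assign one ID per distinct value of `col`, in order of first appearance.

    Hypothetical helper for illustration only (not part of matchzoo).
    """
    lookup = {}
    for text in data[col].unique():
        # Assumed scheme: prefix plus a running counter, e.g. 'L-0', 'L-1'.
        lookup[text] = prefix + str(len(lookup))
    # Map every row's text to the ID of its distinct value.
    return data[col].map(lookup)


df = pd.DataFrame(data={'text_left': list('AABC'),
                        'text_right': list('abbc')})
id_left = gen_ids_sketch(df, 'text_left', 'L-')
id_right = gen_ids_sketch(df, 'text_right', 'R-')
print(id_left.tolist())   # ['L-0', 'L-0', 'L-1', 'L-2']
print(id_right.tolist())  # ['R-0', 'R-1', 'R-1', 'R-2']
```

Under this assumption, the two rows with text_left 'A' both receive L-0, and the two rows with text_right 'b' both receive R-1, which matches the id_left and id_right columns shown in the frames above.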
matchzoo.data_pack.pack._merge(data: pd.DataFrame, ids: typing.Union[list, np.array], text_label: str, id_label: str)
matchzoo.data_pack.pack._gen_ids(data: pd.DataFrame, col: str, prefix: str)