segram.nlp.tokens.doc module

class segram.nlp.tokens.doc.Doc(*args: Any, **kwds: Any)

Bases: NLP

Enhanced document class.

property id: int

Hash id of the document tokenization.

static clear_user_data(user_data: dict)

Clear user data from cached segram objects.

to_data() → dict[str, Any]

Dump the document to a data dictionary sufficient to recreate a simple document without any language model data.

classmethod from_data(data: dict[str, Any]) → Self

Construct from data dictionary produced by to_data().
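
The to_data() / from_data() pair describes a round trip through a plain dictionary. The sketch below illustrates that pattern only. SimpleDoc and its words and spaces fields are hypothetical stand-ins chosen for this example; they are not segram's Doc class, and the keys segram actually writes to the dictionary are not documented here.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a document; NOT segram's Doc class."""

    words: list[str]
    spaces: list[bool]

    def to_data(self) -> dict[str, Any]:
        # Keep only plain Python values so the result is easy to serialize.
        return {"words": list(self.words), "spaces": list(self.spaces)}

    @classmethod
    def from_data(cls, data: dict[str, Any]) -> SimpleDoc:
        # Rebuild the object from a dictionary produced by to_data().
        return cls(words=list(data["words"]), spaces=list(data["spaces"]))


doc = SimpleDoc(words=["Hello", "world"], spaces=[True, False])

# Round trip through JSON to show that no model data is needed to restore it.
payload = json.dumps(doc.to_data())
restored = SimpleDoc.from_data(json.loads(payload))
```

In this sketch the restored object compares equal to the original but is a separate instance, which is the property a to_data() / from_data() pair is generally expected to provide.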