segram.nlp.pipeline.coref module

Segram coreference pipeline component.

class segram.nlp.pipeline.coref.Coref(nlp: Language, name: str, model: Language, components: Sequence[str] | None = None)

Bases: object

Coreference resolution pipeline component based on the spaCy coref component.

name

Pipe name.

model

Language model for coreference resolution.

__init__(nlp: Language, name: str, model: Language, components: Sequence[str] | None = None) → None

Initialization method.

Parameters:
  • nlp – Main language model.

  • name – Pipe name.

  • model – Language model for coreference resolution.

  • components – Names of pipeline components to include. All components are used if None.

Raises:

ValueError – If components is empty but not None.
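
The handling of components described above can be sketched as follows. This is a minimal illustration, not the library's implementation: select_components is a hypothetical helper, and the way segram actually filters the coreference pipeline is an assumption here. It only shows the documented contract, namely that None selects everything and an empty sequence raises ValueError.

```python
from __future__ import annotations

from collections.abc import Sequence


def select_components(
    available: Sequence[str],
    components: Sequence[str] | None = None,
) -> list[str]:
    """Hypothetical helper: choose which pipeline components to keep."""
    if components is None:
        # None means "use all available components".
        return list(available)
    if len(components) == 0:
        # Empty but not None is rejected, mirroring the documented ValueError.
        raise ValueError("'components' must be None or a non-empty sequence")
    # Keep the pipeline order of `available`, not the order of the request.
    return [name for name in available if name in components]
```

Raising on an empty sequence, rather than treating it like None, makes an accidentally empty configuration fail loudly instead of silently enabling every component.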

set_corefs(doc: Doc, cluster: Sequence[int]) → None

Set proper coreferences from pronoun tokens to their closest non-pronoun neighbors within the cluster.

Notes

Coreferences are stored as token indexes (integers) in the _ref custom attribute on tokens.
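
The idea of linking each pronoun to its closest non-pronoun neighbor can be sketched without spaCy. The function below is illustrative only: resolve_cluster, the is_pronoun flags, and the tie-breaking rule (prefer the earlier token) are assumptions, not segram's actual logic. It returns a plain mapping instead of writing to the _ref attribute.

```python
from __future__ import annotations

from collections.abc import Sequence


def resolve_cluster(
    is_pronoun: Sequence[bool],
    cluster: Sequence[int],
) -> dict[int, int]:
    """Map each pronoun index in `cluster` to its nearest non-pronoun index.

    `is_pronoun[i]` says whether token i is a pronoun (a stand-in for a
    part-of-speech check on a real token).
    """
    pronouns = [i for i in cluster if is_pronoun[i]]
    anchors = [i for i in cluster if not is_pronoun[i]]
    refs: dict[int, int] = {}
    if not anchors:
        # A cluster made only of pronouns has nothing to point to.
        return refs
    for p in pronouns:
        # Closest by token distance; on a tie, prefer the earlier token
        # (an assumption made for this sketch).
        refs[p] = min(anchors, key=lambda a: (abs(a - p), a))
    return refs


# Tokens: Anna(0) said(1) she(2) met(3) Tom(4) before(5) he(6) left(7)
flags = [False, False, True, False, False, False, True, False]
she_ref = resolve_cluster(flags, [0, 2])  # {2: 0}
he_ref = resolve_cluster(flags, [4, 6])   # {6: 4}
```

In the real component the resulting index would be stored on the pronoun token, so downstream code can look up the referent by position in the same document.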

classmethod from_model(nlp: Language, name: str, model: str, components: Sequence[str] | None = None, **kwds: Any) → Self

Initialize from model name.

**kwds are passed to spacy.load().
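
The alternate-constructor pattern behind from_model can be sketched as below. ToyCoref and fake_load are hypothetical stand-ins so that the example runs without spaCy or an installed model; the injectable loader argument is also an assumption of this sketch, since the real method calls spacy.load() directly. The exclude keyword used here is one of the options spacy.load() accepts.

```python
from __future__ import annotations

from typing import Any, Callable


class ToyCoref:
    """Hypothetical stand-in for a component that wraps a loaded model."""

    def __init__(self, name: str, model: Any) -> None:
        self.name = name
        self.model = model

    @classmethod
    def from_model(
        cls,
        name: str,
        model: str,
        loader: Callable[..., Any],
        **kwds: Any,
    ) -> "ToyCoref":
        # Resolve the model name into a model object, forwarding any
        # extra keyword arguments to the loader unchanged.
        return cls(name, loader(model, **kwds))


def fake_load(model_name: str, **kwds: Any) -> dict[str, Any]:
    """Stand-in for spacy.load(); records what it was called with."""
    return {"model": model_name, "options": kwds}


coref = ToyCoref.from_model("coref", "some_coref_model", fake_load, exclude=["ner"])
```

Keeping the name-based constructor separate from __init__ lets __init__ accept an already loaded Language object, which is convenient when the model is shared or constructed elsewhere.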

to_disk(path: str | bytes | PathLike, **kwds: Any) → None

Serialize the coreference model to disk.

from_disk(path: str | bytes | PathLike, **kwds: Any) → Self

Load from disk.