segram.utils.misc module
- segram.utils.misc.cosine_similarity(X: ndarray[tuple[int] | tuple[int, int], floating], Y: ndarray[tuple[int] | tuple[int, int], floating], *, aligned: bool = False, nans_as_zeros: bool = True) -> float | ndarray[tuple[int, ...], floating]
Cosine similarity between two vectors.
When 2D arrays are passed, the vectors to compare are assumed to be arranged in rows.
- Parameters:
X – Vectors or arrays of vectors.
Y – Vectors or arrays of vectors.
aligned – If True then X and Y have to be 2D and of the same shape, and row-by-row similarities are calculated.
nans_as_zeros – Should NaN values arising from zero vector norm be interpreted as zero similarities.
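The parameters above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the name cosine_similarity_sketch is hypothetical, and the behaviour for aligned=False (all pairwise similarities between the rows of X and the rows of Y) is an assumption, since the documentation only specifies the aligned=True case.

```python
import numpy as np


def cosine_similarity_sketch(X, Y, *, aligned=False, nans_as_zeros=True):
    """Illustrative cosine similarity; vectors are arranged in rows."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if aligned:
        # Row-by-row mode: both inputs must be 2D and have the same shape.
        if X.ndim != 2 or X.shape != Y.shape:
            raise ValueError("aligned=True requires 2D arrays of equal shape")
        dots = np.einsum("ij,ij->i", X, Y)
        norms = np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1)
    else:
        # Assumption: compute every row of X against every row of Y.
        X2, Y2 = np.atleast_2d(X), np.atleast_2d(Y)
        dots = X2 @ Y2.T
        norms = np.outer(np.linalg.norm(X2, axis=1), np.linalg.norm(Y2, axis=1))
    # A zero vector gives 0/0, which NumPy evaluates to NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        sim = dots / norms
    if nans_as_zeros:
        sim = np.where(np.isnan(sim), 0.0, sim)
    if not aligned:
        # Drop the axes that were added for 1D inputs.
        if X.ndim == 1:
            sim = sim[0]
        if Y.ndim == 1:
            sim = sim[..., 0]
    return float(sim) if sim.ndim == 0 else sim
```

With two 1D inputs the sketch returns a plain float; with aligned=True it returns one similarity per row.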
- segram.utils.misc.stringify(obj: Any, **kwds: Any) -> str
Convert obj to string.
If obj exposes to_str() then it is used with keyword arguments passed in **kwds. Otherwise the plain __repr__() is used.
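The dispatch rule described above can be sketched as follows. The names stringify_sketch and Token are hypothetical and exist only for illustration; the actual function may differ in details such as how it detects to_str().

```python
from typing import Any


def stringify_sketch(obj: Any, **kwds: Any) -> str:
    """Use obj.to_str(**kwds) when available, otherwise fall back to repr()."""
    to_str = getattr(obj, "to_str", None)
    if callable(to_str):
        return to_str(**kwds)
    return repr(obj)


class Token:
    """Hypothetical object exposing to_str() with a keyword option."""

    def __init__(self, text: str) -> None:
        self.text = text

    def to_str(self, *, upper: bool = False) -> str:
        return self.text.upper() if upper else self.text


# Keyword arguments are forwarded only to objects that define to_str().
print(stringify_sketch(Token("cat"), upper=True))
print(stringify_sketch(42))
```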
- segram.utils.misc.ensure_cpu_vectors(vocab: Vocab | Any) -> None
Ensure that word vectors are stored on CPU.
- Parameters:
vocab – Vocabulary object. If an arbitrary object is passed then an attempt at retrieving its .vocab attribute is made.
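The fallback to the .vocab attribute can be pictured with the sketch below. It deliberately avoids any real vocabulary class: FakeVocab, FakePipeline and resolve_vocab are hypothetical stand-ins, and the way the library actually distinguishes a vocabulary from other objects is an assumption here.

```python
from typing import Any


class FakeVocab:
    """Hypothetical stand-in for a vocabulary class."""


class FakePipeline:
    """Hypothetical object that carries a vocabulary in its .vocab attribute."""

    def __init__(self) -> None:
        self.vocab = FakeVocab()


def resolve_vocab(obj: Any) -> FakeVocab:
    """Return obj if it is a vocabulary, otherwise its .vocab attribute."""
    if isinstance(obj, FakeVocab):
        return obj
    # Raises AttributeError when the object has no .vocab attribute.
    return obj.vocab


pipeline = FakePipeline()
vocab = resolve_vocab(pipeline)
```

Accepting either form lets a caller pass whichever object is at hand, as long as it is a vocabulary or exposes one.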
- segram.utils.misc.prefer_gpu_vectors(vocab: Vocab | Any, device_id: int | None = None) -> bool
Store word vectors on GPU if possible.
- Parameters:
vocab – Vocabulary object. If an arbitrary object is passed then an attempt at retrieving its .vocab attribute is made.
device_id – GPU device id. If None then the default device is used (typically the one with id 0).
- Returns:
Specifies whether the vectors were successfully moved to GPU.
- Return type:
bool