`nessie.detectors.projection_ensemble`

Module Contents

Classes

MaxEntProjectionEnsemble

Identifying Incorrect Labels in the CoNLL-2003 Corpus

Functions

tqdm_joblib(tqdm_object)

Context manager that patches joblib to report progress into the tqdm progress bar passed as argument.

class nessie.detectors.projection_ensemble.MaxEntProjectionEnsemble(n_components: List[int] = None, seeds: List[int] = None, num_jobs: int = 4, max_iter: int = 10000)

Bases: nessie.detectors.error_detector.Detector

Identifying Incorrect Labels in the CoNLL-2003 Corpus. Frederick Reiss, Hong Xu, Bryan Cutler, Karthik Muthuraman, Zachary Eichenberger. Proceedings of the 24th Conference on Computational Natural Language Learning, 2020. https://aclanthology.org/2020.conll-1.16/

property ensemble_size(self) → int

error_detector_kind(self) → nessie.detectors.error_detector.DetectorKind

score(self, X_train_embedded: nessie.types.FloatArray2D, y_train_encoded: numpy.typing.NDArray[int], X_eval_embedded: nessie.types.FloatArray2D, y_eval_encoded: numpy.typing.NDArray[int], **kwargs) → Tuple[List[str], List[List[str]], numpy.typing.NDArray[bool]]

Uses an ensemble of logistic regression models, each trained on a different Gaussian projection of the dense embeddings. Their predictions are aggregated by majority vote, and instances whose given label disagrees with the majority prediction are flagged.

Parameters

X_train_embedded – shape (n_instances, encoding_dim)
y_train_encoded – shape (n_instances)
X_eval_embedded – shape (n_instances, encoding_dim)
y_eval_encoded – shape (n_instances)

Returns

predictions – A string list of the predictions for every instance after majority vote
ensemble predictions – A list of string lists containing the predictions for every instance before majority vote
flags – A boolean sequence containing the flags

Return type

Tuple[List[str], List[List[str]], numpy.typing.NDArray[bool]]
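The aggregation step described for score can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: the helper name majority_vote_flags is hypothetical, and it assumes that the per-member predictions are given as one list of labels per ensemble member and that ties are broken in favour of the label seen first.

```python
from collections import Counter
from typing import List, Tuple


def majority_vote_flags(
    ensemble_predictions: List[List[str]], given_labels: List[str]
) -> Tuple[List[str], List[bool]]:
    """Hypothetical helper: majority-vote the ensemble and flag disagreements.

    Assumes ensemble_predictions[m][i] is the label that ensemble member m
    predicts for instance i.
    """
    predictions = []
    # zip(*...) regroups the predictions so that each tuple holds all votes
    # for a single instance.
    for votes in zip(*ensemble_predictions):
        # most_common(1) returns the most frequent label; among equally
        # frequent labels, the one encountered first wins.
        predictions.append(Counter(votes).most_common(1)[0][0])

    # An instance is flagged when its given label differs from the vote.
    flags = [pred != gold for pred, gold in zip(predictions, given_labels)]
    return predictions, flags


ensemble = [
    ["PER", "LOC", "O"],  # member 1
    ["PER", "ORG", "O"],  # member 2
    ["ORG", "LOC", "O"],  # member 3
]
given = ["PER", "ORG", "O"]
voted, flagged = majority_vote_flags(ensemble, given)
```

Here the second instance is flagged, because two of the three members predict LOC while the given label is ORG.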

nessie.detectors.projection_ensemble.tqdm_joblib(tqdm_object)

Context manager that patches joblib to report progress into the tqdm progress bar passed as argument.
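A context manager of this kind is commonly written by temporarily swapping joblib's batch completion callback for a subclass that also advances the progress bar. The sketch below follows that widely used recipe. It is an assumption about how such a patch can work, not a copy of nessie's code, and the names tqdm_joblib_sketch, square and run_squares are hypothetical.

```python
import contextlib

import joblib
from tqdm import tqdm


@contextlib.contextmanager
def tqdm_joblib_sketch(tqdm_object):
    """Temporarily make joblib report finished batches to tqdm_object."""

    class TqdmBatchCompletionCallback(joblib.parallel.BatchCompletionCallBack):
        def __call__(self, *args, **kwargs):
            # Advance the bar by the number of tasks in the finished batch.
            tqdm_object.update(n=self.batch_size)
            return super().__call__(*args, **kwargs)

    original_callback = joblib.parallel.BatchCompletionCallBack
    joblib.parallel.BatchCompletionCallBack = TqdmBatchCompletionCallback
    try:
        yield tqdm_object
    finally:
        # Always restore joblib's own callback and close the bar.
        joblib.parallel.BatchCompletionCallBack = original_callback
        tqdm_object.close()


def square(x):
    return x * x


def run_squares(n):
    """Square 0..n-1 in two worker threads behind a progress bar."""
    with tqdm_joblib_sketch(tqdm(total=n, desc="squares")):
        return joblib.Parallel(n_jobs=2, prefer="threads")(
            joblib.delayed(square)(i) for i in range(n)
        )
```

The try/finally block matters here: because the patch replaces a module-level name in joblib, it has to be undone even when the wrapped computation raises.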
