nessie.detectors.projection_ensemble
Module Contents
Classes
MaxEntProjectionEnsemble | Identifying Incorrect Labels in the CoNLL-2003 Corpus
Functions
tqdm_joblib | Context manager that patches joblib to report progress into the tqdm progress bar passed as an argument
- class nessie.detectors.projection_ensemble.MaxEntProjectionEnsemble(n_components: List[int] = None, seeds: List[int] = None, num_jobs: int = 4, max_iter: int = 10000)
Bases: nessie.detectors.error_detector.Detector
Identifying Incorrect Labels in the CoNLL-2003 Corpus. Frederick Reiss, Hong Xu, Bryan Cutler, Karthik Muthuraman, Zachary Eichenberger. Proceedings of the 24th Conference on Computational Natural Language Learning, 2020. https://aclanthology.org/2020.conll-1.16/
- property ensemble_size(self) -> int
- error_detector_kind(self) -> nessie.detectors.error_detector.DetectorKind
- score(self, X_train_embedded: nessie.types.FloatArray2D, y_train_encoded: numpy.typing.NDArray[int], X_eval_embedded: nessie.types.FloatArray2D, y_eval_encoded: numpy.typing.NDArray[int], **kwargs) -> Tuple[List[str], List[List[str]], numpy.typing.NDArray[bool]]
Uses an ensemble of logistic regression models, each trained on a different Gaussian random projection of the dense embeddings. The models' predictions are aggregated by majority vote, and instances whose given label disagrees with the voted label are flagged.
- Parameters
X_train_embedded – shape (n_instances, encoding_dim)
y_train_encoded – shape (n_instances)
X_eval_embedded – shape (n_instances, encoding_dim)
y_eval_encoded – shape (n_instances)
- Returns
predictions: A string list of the predictions for every instance after majority vote
ensemble_predictions: A list of string lists containing the predictions for every instance before majority vote
flags: A boolean array marking the flagged instances
- Return type
Tuple[List[str], List[List[str]], numpy.typing.NDArray[bool]]
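The aggregation step described for score can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: the helper name majority_vote_flags is hypothetical, the ensemble predictions are assumed to be given as one list of labels per ensemble member, and ties are broken in favour of the label that appears first among the votes.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def majority_vote_flags(
    ensemble_predictions: Sequence[Sequence[str]],
    given_labels: Sequence[str],
) -> Tuple[List[str], List[bool]]:
    """Hypothetical helper: vote per instance and flag disagreements."""
    voted: List[str] = []
    flags: List[bool] = []
    # zip(*...) regroups the per-model lists into per-instance vote tuples.
    for votes, label in zip(zip(*ensemble_predictions), given_labels):
        # most_common(1) returns the label with the highest vote count;
        # among equal counts, the label encountered first wins.
        winner, _ = Counter(votes).most_common(1)[0]
        voted.append(winner)
        # An instance is flagged when its given label differs from the vote.
        flags.append(winner != label)
    return voted, flags


# Toy data (assumed layout): three ensemble members, three instances.
ensemble_predictions = [
    ["PER", "LOC", "ORG"],  # model 1
    ["PER", "LOC", "LOC"],  # model 2
    ["PER", "ORG", "LOC"],  # model 3
]
given_labels = ["PER", "LOC", "ORG"]

predictions, flags = majority_vote_flags(ensemble_predictions, given_labels)
print(predictions)  # ['PER', 'LOC', 'LOC']
print(flags)        # [False, False, True]
```

Only the third instance is flagged here, because two of the three models vote LOC while its given label is ORG.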
- nessie.detectors.projection_ensemble.tqdm_joblib(tqdm_object)
Context manager that patches joblib to report progress into the tqdm progress bar passed as an argument.