nessie.detectors.confident_learning

Module Contents

Classes

ConfidentLearning

Confident Learning estimates the joint distribution of noisy and true labels.

class nessie.detectors.confident_learning.ConfidentLearning

Bases: nessie.detectors.error_detector.ModelBasedDetector

Confident Learning estimates the joint distribution of noisy and true labels.

Per-class thresholds are then learnt (the average self-confidence of the instances carrying each label); instances whose predicted probability of having the correct label falls below the respective threshold are flagged as erroneous.

Curtis G. Northcutt, Lu Jiang, & Isaac L. Chuang (2021). Confident Learning: Estimating Uncertainty in Dataset Labels. Journal of Artificial Intelligence Research (JAIR), 70, 1373–1411. https://github.com/cgnorthcutt/cleanlab
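
The thresholding step can be sketched as follows. This is an illustrative reimplementation assuming integer-encoded labels and that every class occurs at least once; it is not necessarily how this detector computes its flags internally (the reference implementation is the cleanlab library linked above):

   import numpy as np

   def confident_learning_flags(labels: np.ndarray, probabilities: np.ndarray) -> np.ndarray:
       # labels: (num_instances,) integer-encoded noisy labels
       # probabilities: (num_instances, num_classes) model probabilities
       num_classes = probabilities.shape[1]

       # Self-confidence: the predicted probability of the label each instance carries.
       self_confidence = probabilities[np.arange(len(labels)), labels]

       # Per-class threshold: the average self-confidence of instances with that label.
       thresholds = np.array(
           [self_confidence[labels == c].mean() for c in range(num_classes)]
       )

       # Flag instances whose self-confidence falls below their class threshold.
       return self_confidence < thresholds[labels]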

error_detector_kind(self)
score(self, labels: nessie.types.StringArray, probabilities: numpy.typing.NDArray[float], le: sklearn.preprocessing.LabelEncoder, **kwargs) → numpy.typing.NDArray[bool]

Flags the input via confident learning.

Parameters
  • labels – a (num_instances,) string sequence containing the noisy label for each instance

  • probabilities – a (num_instances, num_classes) numpy array of predicted class probabilities obtained from a machine learning model

  • le – the label encoder that maps the columns of probabilities back to string labels

Returns

a (num_instances,) numpy array of bools containing the Confident Learning flags; True marks an instance flagged as likely mislabeled
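
A hypothetical usage sketch; the toy labels and probabilities below are invented, and the probability columns are assumed to follow the order of le.classes_ (in practice the probabilities would come from cross-validated model predictions):

   import numpy as np
   from sklearn.preprocessing import LabelEncoder

   from nessie.detectors.confident_learning import ConfidentLearning

   labels = np.array(["pos", "neg", "pos", "neg"])
   le = LabelEncoder().fit(labels)  # le.classes_ == ["neg", "pos"]

   # Toy probabilities with columns ordered as le.classes_: [P(neg), P(pos)]
   probabilities = np.array([
       [0.1, 0.9],  # labelled "pos", model agrees
       [0.8, 0.2],  # labelled "neg", model agrees
       [0.7, 0.3],  # labelled "pos", model prefers "neg" -- a likely flag
       [0.6, 0.4],  # labelled "neg", moderate confidence
   ])

   detector = ConfidentLearning()
   flags = detector.score(labels=labels, probabilities=probabilities, le=le)
   print(flags)  # (num_instances,) boolean array; True marks suspect instances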

uses_probabilities(self) → bool