PlantNet

class PlantNet(answers, n_classes, AI='ignored', parrots='ignored', alpha=1, beta=1, AIweight=1, authors=None, scores=None, threshold_scores=None, **kwargs)

PlantNet aggregation strategy

Weighted majority vote based on the number of identified classes (species) per worker. Each task is either valid (\(s_i=1\)) or not, depending on whether the confidence and accuracy of the estimated label are both above the set thresholds.

__init__(answers, n_classes, AI='ignored', parrots='ignored', alpha=1, beta=1, AIweight=1, authors=None, scores=None, threshold_scores=None, **kwargs)

Compute a weighted majority vote based on the number of identified classes (species) per worker

Parameters:
  • answers (dict) –

    Dictionary of worker answers, with the format

    {
        task0: {worker0: label, worker1: label},
        task1: {worker1: label}
    }
    

  • n_classes (int) – Number of possible classes (should be high)

  • AI (str, optional) –

    How to consider entries with worker=AI in the dictionary of answers, defaults to “ignored”. Several options are available:

    • ignored: ignore the AI labels

    • worker: consider the AI as a worker

    • fixed: consider the AI as a worker with a fixed weight=`AIweight`

    • invalidating: consider the AI as a worker with a weight=`AIweight` that can only invalidate the tasks

    • confident: consider the AI as a worker with a weight=`AIweight` if the predicted score is above the threshold `threshold_scores`

  • parrots (str, optional) – How to deal with parrot answers, defaults to “ignored” (not implemented yet)

  • alpha (float, optional) – Value of \(\alpha\) parameter in weight function, defaults to 1

  • beta (float, optional) – Value of \(\beta\) parameter in weight function, defaults to 1

  • AIweight (float, optional) – Weight of the AI if not ignored, defaults to 1

  • authors (str, optional) – Path to txt file containing authors id for each task

  • scores (str, optional) – Path to json file containing AI prediction scores for each task

  • threshold_scores (float between 0 and 1, optional) – Threshold for AI prediction scores if AI strategy is set to confident
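
To make the `answers` format and the `AI='ignored'` option concrete, here is a minimal sketch. It assumes that AI votes are stored under the literal worker key `"AI"` and that labels are integer class indices; `drop_ai_votes` is a hypothetical helper written for illustration, not a function of the library.

```python
# Hypothetical helper, not part of the library: illustrates AI="ignored".
AI_KEY = "AI"  # assumption: AI votes are stored under this worker key


def drop_ai_votes(answers):
    """Return a copy of `answers` without the entries whose worker is AI_KEY."""
    return {
        task: {worker: label for worker, label in votes.items() if worker != AI_KEY}
        for task, votes in answers.items()
    }


# Same layout as the documented format: {task: {worker: label}}
answers = {
    0: {0: 2, 1: 2, "AI": 1},
    1: {1: 0},
}
human_only = drop_ai_votes(answers)
print(human_only)
```

The original dictionary is left untouched, so the same input could still be used with another `AI` option.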

get_wmv(weights)

Compute weighted majority vote

Parameters:

weights (np.ndarray of size n_workers) – Weights of each worker

Returns:

Label with the highest total weight for each task

Return type:

np.ndarray of size n_task
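
A weighted majority vote can be sketched as follows. This is an illustration rather than the library's implementation: it assumes that tasks and workers are integer indices, that labels are integers in `[0, n_classes)`, and that ties go to the smallest label. The function name `weighted_majority_vote` is hypothetical.

```python
import numpy as np


def weighted_majority_vote(answers, weights, n_classes):
    """Return, for each task, the label with the largest total worker weight."""
    tasks = sorted(answers)
    scores = np.zeros((len(tasks), n_classes))
    for row, task in enumerate(tasks):
        for worker, label in answers[task].items():
            # Each vote adds the weight of the worker who cast it.
            scores[row, label] += weights[worker]
    # argmax returns the first maximum, so ties resolve to the smallest label.
    return scores.argmax(axis=1)


answers = {0: {0: 1, 1: 0, 2: 0}, 1: {1: 2}}
weights = np.array([3.0, 1.0, 1.0])
yhat = weighted_majority_vote(answers, weights, n_classes=3)
print(yhat)
```

On task 0, the single vote of worker 0 (weight 3) outweighs the two votes for label 0 (total weight 2), which is the behavior a plain majority vote would not give.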

get_conf_acc(yhat, weights)

Compute confidence and accuracy scores for each task

Parameters:
  • yhat (np.ndarray of size n_task) – Estimated labels

  • weights (np.ndarray of size n_workers) – Weights of each worker
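
The two scores are not defined in this section. One plausible reading, assumed here and not confirmed by the text above, is that the confidence of a task is the total weight of the workers who voted for the estimated label, and the accuracy is that confidence divided by the total weight of all workers who answered the task. The helper `conf_acc` below is hypothetical.

```python
import numpy as np


def conf_acc(answers, yhat, weights):
    """Assumed definitions: conf = weight agreeing with yhat, acc = conf / total weight."""
    tasks = sorted(answers)
    conf = np.zeros(len(tasks))
    total = np.zeros(len(tasks))
    for row, task in enumerate(tasks):
        for worker, label in answers[task].items():
            total[row] += weights[worker]
            if label == yhat[row]:
                conf[row] += weights[worker]
    # Tasks without any vote get an accuracy of 0 instead of a division by zero.
    acc = np.divide(conf, total, out=np.zeros_like(conf), where=total > 0)
    return conf, acc


answers = {0: {0: 1, 1: 0, 2: 0}, 1: {1: 2}}
weights = np.array([3.0, 1.0, 1.0])
yhat = np.array([1, 2])
conf, acc = conf_acc(answers, yhat, weights)
print(conf, acc)
```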

get_valid_tasks(acc, conf)

Compute mask for valid observations (\(s_i=1\)):

\[s_i=1 \text{ if } \mathrm{conf}_i > \theta_{\text{conf}} \text{ and } \mathrm{acc}_i > \theta_{\text{acc}}\]
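
The rule above translates directly into a boolean mask. In this sketch the threshold values and the score arrays are made up for illustration.

```python
import numpy as np

# Hypothetical thresholds and scores, chosen only to illustrate the rule.
theta_conf, theta_acc = 2.0, 0.7
conf = np.array([3.0, 1.0, 2.5])
acc = np.array([0.6, 1.0, 0.9])

# A task is valid only when both scores strictly exceed their thresholds.
valid = (conf > theta_conf) & (acc > theta_acc)
print(valid)
```

Only the third task passes: the first fails the accuracy threshold and the second fails the confidence threshold.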

get_weights()

Compute weight transformation

Returns:

Weight of each worker:

\[w_j = \alpha^{n_j} - \beta^{n_j} + \log(2.1)\]

Return type:

np.ndarray of size n_workers
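
The sketch below transcribes the weight formula exactly as written above, vectorized over workers. The function name `weight_transform` is hypothetical, and the input `n` stands for the per-worker counts \(n_j\).

```python
import numpy as np


def weight_transform(n, alpha=1.0, beta=1.0):
    """Evaluate the documented formula alpha**n - beta**n + log(2.1) per worker."""
    n = np.asarray(n, dtype=float)
    return alpha**n - beta**n + np.log(2.1)


w = weight_transform([0, 3, 10])
print(w)
```

With the default `alpha=1` and `beta=1`, the first two terms cancel, so every worker receives the same constant weight \(\log(2.1)\), about 0.74, whatever the value of \(n_j\).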

get_n(valid, yhat)

Compute the number of identified classes

Parameters:
  • valid (np.ndarray of size n_task) – Indicator of valid tasks

  • yhat (np.ndarray of size n_task) – Estimated labels
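
The counting rule is not spelled out in this section. The sketch below assumes that a worker has identified a class when, on at least one valid task, their label matches the estimated label; \(n_j\) is then the number of distinct classes identified by worker \(j\). The helper `count_identified_classes` and its `n_workers` argument are hypothetical.

```python
import numpy as np


def count_identified_classes(answers, valid, yhat, n_workers):
    """Count, per worker, the distinct classes matched on valid tasks (assumed rule)."""
    tasks = sorted(answers)
    seen = [set() for _ in range(n_workers)]
    for row, task in enumerate(tasks):
        if not valid[row]:
            continue  # invalid tasks do not contribute to any worker
        for worker, label in answers[task].items():
            if label == yhat[row]:
                seen[worker].add(label)
    return np.array([len(classes) for classes in seen])


answers = {0: {0: 1, 1: 1}, 1: {0: 2, 1: 0}, 2: {0: 1}}
valid = np.array([True, True, False])
yhat = np.array([1, 2, 1])
n = count_identified_classes(answers, valid, yhat, n_workers=2)
print(n)
```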

run(maxiter=100, epsilon=1e-05)

Run the PlantNet aggregation algorithm

Parameters:
  • maxiter (int, optional) – Maximum number of iterations in the EM, defaults to 100 (at least 5)

  • epsilon (float, optional) – Stopping criterion if weights are not updated anymore, defaults to 1e-5
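
The roles of `maxiter` and `epsilon` can be illustrated with a generic fixed-point loop. This is a sketch under assumptions: the change in weights is measured here as the sum of absolute differences, which the text above does not specify, and `iterate_until_stable` and the toy `update` function are hypothetical. It also assumes `maxiter >= 1`.

```python
import numpy as np


def iterate_until_stable(update, w0, maxiter=100, epsilon=1e-5):
    """Apply `update` until the weights change by less than epsilon or maxiter is hit."""
    weights = np.asarray(w0, dtype=float)
    for n_iter in range(1, maxiter + 1):
        new_weights = np.asarray(update(weights), dtype=float)
        change = np.abs(new_weights - weights).sum()  # assumed stopping measure
        weights = new_weights
        if change < epsilon:
            break
    return weights, n_iter


# Toy update whose fixed point is 1.0 for every worker.
weights, n_iter = iterate_until_stable(lambda w: 0.5 * (w + 1.0), np.zeros(3))
print(weights, n_iter)
```

In an aggregation run, `update` would chain the steps documented above: vote, score, validate, count, and recompute the weights.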

get_answers()

Returns:

Hard labels and None when no consensus is reached

Return type:

numpy.ndarray

get_probas()

Not available for this strategy; defaults to get_answers()