PlantNet

class PlantNet(answers, n_classes, AI='ignored', parrots='ignored', alpha=1, beta=1, AIweight=1, authors=None, scores=None, threshold_scores=None, **kwargs)

PlantNet aggregation strategy

Weighted majority vote based on the number of identified classes (species) per worker. Each task is either valid (\(s_i=1\)) or not: a task is valid if the confidence and accuracy of its estimated label are both above the set thresholds.

__init__(answers, n_classes, AI='ignored', parrots='ignored', alpha=1, beta=1, AIweight=1, authors=None, scores=None, threshold_scores=None, **kwargs)

Compute a weighted majority vote based on the number of identified classes (species) per worker

Parameters:

answers (dict) –

Dictionary of workers' answers, in the format:

{
    task0: {worker0: label, worker1: label},
    task1: {worker1: label}
}

n_classes (int) – Number of possible classes (should be high)
AI (str, optional) –
How to treat entries with worker=AI in the dictionary of answers, defaults to “ignored”. Several options are available:
- ignored: ignore the AI labels
- worker: consider the AI as a worker
- fixed: consider the AI as a worker with a fixed weight=`AIweight`
- invalidating: consider the AI as a worker with a weight=`AIweight` that can only invalidate the tasks
- confident: consider the AI as a worker with a weight=`AIweight` if the predicted score is above the threshold `threshold_scores`
parrots (str, optional) – How to deal with parrot answers, defaults to “ignored” (not implemented yet)
alpha (float, optional) – Value of \(\alpha\) parameter in weight function, defaults to 1
beta (float, optional) – Value of \(\beta\) parameter in weight function, defaults to 1
AIweight (float, optional) – Weight of the AI if not ignored, defaults to 1
authors (str, optional) – Path to txt file containing authors id for each task
scores (str, optional) – Path to json file containing AI prediction scores for each task
threshold_scores (float between 0 and 1, optional) – Threshold for AI prediction scores, used when the AI strategy is set to “confident”

get_wmv(weights)

Compute weighted majority vote

Parameters: weights (np.ndarray of size n_workers) – Weights of each worker
Returns: Most weighted labels
Return type: np.ndarray of size n_task
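To make the vote concrete, here is a minimal standalone sketch of a weighted majority vote over the `answers` dictionary format described above. It is not the library's implementation: the helper name `weighted_majority_vote` is hypothetical, workers and labels are assumed to be integer indices, and ties are broken by `np.argmax` (lowest label wins).

```python
import numpy as np

def weighted_majority_vote(answers, weights, n_classes):
    """Return, for each task, the label with the largest total worker weight.

    Hypothetical helper: assumes integer worker ids and integer labels.
    """
    tasks = sorted(answers)
    yhat = np.empty(len(tasks), dtype=int)
    for i, task in enumerate(tasks):
        totals = np.zeros(n_classes)
        for worker, label in answers[task].items():
            totals[label] += weights[worker]  # each vote counts for the worker's weight
        yhat[i] = int(np.argmax(totals))      # ties go to the lowest label index
    return yhat

# Toy data: 2 tasks, 3 workers, 3 classes
answers = {
    0: {0: 2, 1: 2, 2: 0},
    1: {1: 1, 2: 0},
}
weights = np.array([1.0, 1.5, 2.0])
yhat = weighted_majority_vote(answers, weights, n_classes=3)
print(yhat)
```

On task 0, workers 0 and 1 together outweigh worker 2 (2.5 against 2.0), so label 2 wins; on task 1, worker 2 alone outweighs worker 1, so label 0 wins.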

get_conf_acc(yhat, weights)

Compute confidence and accuracy scores for each task

Parameters:

yhat (np.ndarray of size n_task) – Estimated labels
weights (np.ndarray of size n_workers) – Weights of each worker
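The page does not define the two scores. The sketch below assumes one plausible reading: the confidence of a task is the total weight of the workers who voted for the estimated label, and the accuracy is that confidence divided by the total weight of all workers who answered the task. Both definitions and the helper name `confidence_and_accuracy` are assumptions for illustration, not the library's documented behavior.

```python
import numpy as np

def confidence_and_accuracy(answers, yhat, weights):
    """Assumed definitions: conf = weight agreeing with yhat, acc = conf / total weight."""
    tasks = sorted(answers)
    conf = np.zeros(len(tasks))
    total = np.zeros(len(tasks))
    for i, task in enumerate(tasks):
        for worker, label in answers[task].items():
            total[i] += weights[worker]
            if label == yhat[i]:
                conf[i] += weights[worker]
    acc = conf / total  # assumes every task has at least one answer
    return conf, acc

answers = {
    0: {0: 2, 1: 2, 2: 0},
    1: {1: 1, 2: 0},
}
weights = np.array([1.0, 1.5, 2.0])
yhat = np.array([2, 0])
conf, acc = confidence_and_accuracy(answers, yhat, weights)
print(conf, acc)
```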

get_valid_tasks(acc, conf)

Compute mask for valid observations (\(s_i=1\)):

\[s_i=1 \text{ if } \mathrm{conf}_i > \theta_{\text{conf}} \text{ and } \mathrm{acc}_i > \theta_{\text{acc}}\]
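The rule above translates directly into a boolean mask. In this minimal sketch, the function name `valid_mask` and the threshold values are illustrative assumptions; the page does not state which values of \(\theta_{\text{conf}}\) and \(\theta_{\text{acc}}\) the class uses.

```python
import numpy as np

def valid_mask(acc, conf, theta_acc, theta_conf):
    """s_i = 1 only if both scores are strictly above their thresholds."""
    acc = np.asarray(acc, dtype=float)
    conf = np.asarray(conf, dtype=float)
    return (conf > theta_conf) & (acc > theta_acc)

conf = np.array([2.5, 0.8, 3.0])
acc = np.array([0.9, 0.95, 0.5])
# Threshold values below are placeholders chosen for the example
mask = valid_mask(acc, conf, theta_acc=0.7, theta_conf=2.0)
print(mask)
```

Only the first task passes: the second fails the confidence test and the third fails the accuracy test.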

get_weights()

Compute weight transformation

Returns:

Weight of each worker:

\[w_j = n_j^{\alpha} - n_j^{\beta} + \log(2.1)\]

where \(n_j\) is the number of classes identified by worker \(j\).

Return type:

np.ndarray of size n_workers
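As a quick numerical check, the sketch below evaluates the weight transformation assuming the power form \(w_j = n_j^{\alpha} - n_j^{\beta} + \log(2.1)\), with \(n_j\) as the base. The function name `worker_weights` is hypothetical, and the non-default values of `alpha` and `beta` are chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def worker_weights(n, alpha=1.0, beta=1.0):
    """Assumed form: w_j = n_j**alpha - n_j**beta + log(2.1)."""
    n = np.asarray(n, dtype=float)
    return n**alpha - n**beta + np.log(2.1)

# With the defaults alpha = beta = 1 the two powers cancel,
# so every worker gets the same weight log(2.1)
w_default = worker_weights([0, 3, 10])

# With alpha > beta the weight grows with the number of identified classes
w = worker_weights([0, 1, 16], alpha=0.5, beta=0.25)
print(w_default, w)
```

Under this form, a worker with 16 identified classes gets \(4 - 2 + \log(2.1)\), while workers with 0 or 1 identified classes keep the baseline \(\log(2.1)\).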

get_n(valid, yhat)

Compute the number of identified classes

Parameters:

valid (np.ndarray of size n_task) – Indicator of valid tasks
yhat (np.ndarray of size n_task) – Estimated labels
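One way to read "number of identified classes" is sketched below: for each worker, count the distinct labels on which the worker agreed with the estimated label of a valid task. This interpretation is an assumption, as is the helper name `count_identified_classes`; the library may treat some cases differently (for example task authors, given the `authors` parameter).

```python
import numpy as np

def count_identified_classes(answers, valid, yhat, n_workers):
    """Count, per worker, the distinct classes matched on valid tasks (assumed rule)."""
    tasks = sorted(answers)
    identified = [set() for _ in range(n_workers)]
    for i, task in enumerate(tasks):
        if not valid[i]:
            continue  # invalid tasks do not contribute
        for worker, label in answers[task].items():
            if label == yhat[i]:
                identified[worker].add(int(label))
    return np.array([len(s) for s in identified])

answers = {
    0: {0: 2, 1: 2, 2: 0},
    1: {0: 1, 1: 1},
    2: {0: 2, 2: 2},
}
valid = np.array([True, True, False])
yhat = np.array([2, 1, 2])
n = count_identified_classes(answers, valid, yhat, n_workers=3)
print(n)
```

Workers 0 and 1 each match classes 2 and 1 on the two valid tasks; worker 2 disagrees on task 0 and only agrees on the invalid task 2, so its count stays at zero.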

run(maxiter=100, epsilon=1e-05)

Run the PlantNet aggregation algorithm

Parameters:

maxiter (int, optional) – Maximum number of iterations in the EM, defaults to 100 (at least 5)
epsilon (float, optional) – Stopping criterion if weights are not updated anymore, defaults to 1e-5
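The interplay of `maxiter` and `epsilon` can be illustrated with a generic fixed-point loop. This is only a sketch of the stopping logic: the helper `iterate_until_stable`, the use of the summed absolute change as the convergence measure, and the toy `update` function are all assumptions, not the library's code.

```python
import numpy as np

def iterate_until_stable(update, weights, maxiter=100, epsilon=1e-5):
    """Apply `update` until the weights change by less than epsilon or maxiter is hit."""
    for it in range(1, maxiter + 1):
        new_weights = update(weights)
        if np.abs(new_weights - weights).sum() < epsilon:
            return new_weights, it  # weights are no longer updated: stop early
        weights = new_weights
    return weights, maxiter

# Toy update with fixed point 2.0, standing in for one pass of
# vote -> validity -> counts -> weights
toy_update = lambda w: 0.5 * w + 1.0
final_weights, n_iter = iterate_until_stable(toy_update, np.array([0.0, 4.0]))
print(final_weights, n_iter)
```

In the aggregation itself, `update` would recompute the weighted vote, the valid-task mask, the per-worker class counts, and then the weights.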

get_answers()

Returns: Hard labels, with None for tasks where no consensus is reached
Return type: numpy.ndarray

get_probas()

Not available for this strategy; defaults to get_answers().