Data generation

First, we generate a dataset with the peerannot simulate command. The dataset has 30 workers and 200 tasks over 5 classes, and each task receives 10 votes.

[1]:
from pathlib import Path
path = (Path() / ".." / "_build" / "notebooks")
path.mkdir(exist_ok=True, parents=True)

! peerannot simulate --n-worker=30 --n-task=200  --n-classes=5 \
                     --strategy independent-confusion \
                     --feedback=10 --seed 0 \
                     --folder ../_build/notebooks/

    Saved answers at ../_build/notebooks/answers.json

    Saved ground truth at ../_build/notebooks/ground_truth.npy

We can visualize the generated votes and the true labels of the tasks. For example, let us consider task 5:

[2]:
import json
import numpy as np
import matplotlib.pyplot as plt

with open(path / "answers.json") as f:
    answers = json.load(f)
gt = np.load(path / "ground_truth.npy")

print("Task 5:", answers["5"])
print("Number of votes:", len(answers["5"]))
print("Ground truth:", gt[5])
fig, ax = plt.subplots()

counts = np.bincount(list(answers["5"].values()), minlength=5)
classes = [f"class {str(i)}" for i in [0, 1, 2, 3, 4]]

ax.bar(classes, counts)
plt.yticks(range(0, max(counts)+1))
ax.set_ylabel("Number of votes")
ax.set_title("Number of votes for each class for task 5")
plt.tight_layout()
plt.show()
Task 5: {'2': 1, '3': 0, '4': 0, '5': 3, '9': 0, '11': 1, '15': 2, '17': 4, '21': 0, '26': 2}
Number of votes: 10
Ground truth: 0
[Figure: bar chart of the number of votes for each class for task 5]
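
Before running any aggregation strategy, we can already compute the raw vote counts for every task directly from the answers dictionary. The following sketch is illustrative helper code (not part of peerannot): it builds a task-by-class count matrix and reports how often the most-voted class matches the ground truth.

n_tasks, n_classes = 200, 5
vote_counts = np.zeros((n_tasks, n_classes), dtype=int)  # votes per (task, class)
for task, votes in answers.items():
    for label in votes.values():
        vote_counts[int(task), label] += 1

# np.argmax keeps the first maximum, so ties are broken deterministically here
plurality = vote_counts.argmax(axis=1)
print("Plurality-vote accuracy:", (plurality == gt).mean())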

Command Line Aggregation

Let us use the command line interface to run some aggregation methods on the dataset we just generated.

[3]:
for strat in ["MV", "NaiveSoft", "DS", "GLAD", "DSWC[L=5]", "Wawa"]:
    ! peerannot aggregate ../_build/notebooks/ -s {strat}
Running aggregation mv with options {}
Aggregated labels stored at /home/circleci/project/doc/_build/notebooks/labels/labels_independent-confusion_mv.npy with shape (200,)
Running aggregation naivesoft with options {}
Aggregated labels stored at /home/circleci/project/doc/_build/notebooks/labels/labels_independent-confusion_naivesoft.npy with shape (200, 5)
Running aggregation ds with options {}
Finished:  40%|████████████▊                   | 20/50 [00:00<00:00, 122.65it/s]
Aggregated labels stored at /home/circleci/project/doc/_build/notebooks/labels/labels_independent-confusion_ds.npy with shape (200, 5)
Running aggregation glad with options {}
- Running EM
Finished:  74%|████████████████████████▍        | 37/50 [00:32<00:11,  1.15it/s]
Task difficulty coefficients saved at /home/circleci/project/doc/_build/notebooks/identification/glad/difficulties.npy
Worker ability coefficients saved at /home/circleci/project/doc/_build/notebooks/identification/glad/abilities.npy
Aggregated labels stored at /home/circleci/project/doc/_build/notebooks/labels/labels_independent-confusion_glad.npy with shape (200, 5)
Running aggregation dswc with options {'L': 5}
Aggregated labels stored at /home/circleci/project/doc/_build/notebooks/labels/labels_independent-confusion_dswc[l=5].npy with shape (200, 5)
Running aggregation wawa with options {}
/home/circleci/project/peerannot/models/aggregation/Wawa.py:59: UserWarning:
            Wawa aggregation only returns hard labels.
            Defaulting to ``get_answers()``.

  warnings.warn(
Aggregated labels stored at /home/circleci/project/doc/_build/notebooks/labels/labels_independent-confusion_wawa.npy with shape (200,)
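
Each strategy stores its aggregated labels as a .npy file in the labels subfolder of the dataset directory. A quick way to check what was produced (a minimal sketch):

for file in sorted((path / "labels").glob("*.npy")):
    print(file.name)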

Now, as we know the ground truth, we can evaluate the performance of the aggregation methods. In this example we consider the accuracy; other metrics such as F1-score, precision, and recall can also be used (see the sketch after the results table below).

[4]:
import pandas as pd

def accuracy(labels, gt):
    # Hard labels are compared directly; soft labels are reduced with argmax first.
    return np.mean(labels == gt) if labels.ndim == 1 else np.mean(np.argmax(labels, axis=1) == gt)

results = {  # initialize results dictionary
    "mv": [],
    "naivesoft": [],
    "glad": [],
    "ds": [],
    "wawa": [],
    "dswc[l=5]": [],
}
for strategy in results.keys():
    path_labels = path / "labels" / f"labels_independent-confusion_{strategy}.npy"
    labels = np.load(path_labels)  # load aggregated labels
    results[strategy].append(accuracy(labels, gt))  # compute accuracy
results["NS"] = results["naivesoft"]  # rename naivesoft to NS
results.pop("naivesoft")

# Styling the results
results = pd.DataFrame(results, index=["AccTrain"])
results.columns = map(str.upper, results.columns)
results = results.style.set_table_styles(
    [dict(selector="th", props=[("text-align", "center")])]
)
results.set_properties(**{"text-align": "center"})
results = results.format(precision=3)
results
[4]:
          MV     GLAD   DS     WAWA   DSWC[L=5]  NS
AccTrain  0.750  0.780  0.890  0.790  0.775      0.760
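
As noted above, accuracy is not the only option. The sketch below, assuming scikit-learn is available in the environment, computes macro-averaged precision, recall, and F1-score for the majority-vote labels; it is illustrative only and not part of peerannot.

from sklearn.metrics import precision_recall_fscore_support

mv_labels = np.load(path / "labels" / "labels_independent-confusion_mv.npy")
precision, recall, f1, _ = precision_recall_fscore_support(
    gt, mv_labels, average="macro", zero_division=0
)
print(f"Macro precision: {precision:.3f}, recall: {recall:.3f}, F1: {f1:.3f}")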

API Aggregation

We showed how to use the command line interface, but what about the API? It’s just as simple!

[5]:
from peerannot.models import agg_strategies

strategies = ["MV", "GLAD", "DS", "NaiveSoft", "DSWC", "Wawa"]
yhats = []
for strat in strategies:
    agg = agg_strategies[strat]
    if strat != "DSWC":
        agg = agg(answers, n_classes=5, n_workers=30, n_tasks=200, dataset=path)
    else:
        agg = agg(answers, L=5, n_classes=5, n_workers=30, n_tasks=200)
    if hasattr(agg, "run"):
        agg.run(maxiter=20)
    yhats.append(agg.get_answers())
- Running EM
Finished: : 21it [00:20,  1.05it/s]
Task difficulty coefficients saved at /home/circleci/project/doc/_build/notebooks/identification/glad/difficulties.npy
Worker ability coefficients saved at /home/circleci/project/doc/_build/notebooks/identification/glad/abilities.npy
Finished: 100% 20/20 [00:00<00:00, 124.01it/s]
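
The strategies return either hard labels (a vector of class indices) or soft labels (a task-by-class matrix), so it can be convenient to convert everything to hard labels before comparing them. The following sketch, reusing the strategies and yhats lists above, reports how often each strategy agrees with majority voting (illustrative only):

def to_hard(labels):
    # Soft labels (n_tasks, n_classes) are reduced with argmax; hard labels pass through.
    return labels if labels.ndim == 1 else labels.argmax(axis=1)

hard_preds = {strat: to_hard(y) for strat, y in zip(strategies, yhats)}
for strat, pred in hard_preds.items():
    print(f"{strat}: agreement with MV = {(pred == hard_preds['MV']).mean():.3f}")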
[6]:
results = {  # initialize results dictionary
    "mv": [],
    "glad": [],
    "ds": [],
    "naivesoft": [],
    "dswc[l=5]": [],
    "wawa": [],
}
for i, strategy in enumerate(results.keys()):
    labels = yhats[i]  # aggregated labels computed above via the API
    results[strategy].append(accuracy(labels, gt))  # compute accuracy
results["NS"] = results["naivesoft"]  # rename naivesoft to NS
results.pop("naivesoft")

# Styling the results
results = pd.DataFrame(results, index=["AccTrain"])
results.columns = map(str.upper, results.columns)
results = results.style.set_table_styles(
    [dict(selector="th", props=[("text-align", "center")])]
)
results.set_properties(**{"text-align": "center"})
results = results.format(precision=3)
results
[6]:
          MV     GLAD   DS     DSWC[L=5]  WAWA   NS
AccTrain  0.775  0.775  0.890  0.775      0.790  0.760

The differences in performance between the command line and API runs result from the random tie-breaks in the aggregation.
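
To locate these tie-breaks, one can compare the labels produced by the command line and by the API for the same strategy, for example majority voting (a minimal sketch):

mv_cli = np.load(path / "labels" / "labels_independent-confusion_mv.npy")
mv_api = yhats[0]  # MV was the first strategy in the API loop
differing_tasks = np.flatnonzero(mv_cli != mv_api)
print("Tasks where the two MV runs disagree:", differing_tasks)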