Glossary
| Name | Definition | Mathematical Definition |
|---|---|---|
| \(n_{task}\) | The total number of tasks in a dataset | |
| \(n_{worker}\) | The total number of workers in a dataset | |
| \([K]\) | The set of labels a task can take | \([K] = \{1,\dots,K\}\) |
| \(\Delta_K\) | The simplex of dimension \(K-1\), used to represent soft labels (i.e., labels given as a probability vector over \([K]\)) | \(\Delta_K = \{ p \in \mathbb{R}^K : \sum_{k=1}^K p_k=1,\ p_k \geq 0 \}\) |
| \(\mathcal{A}(x_i)\) | The set of workers that answered the task \(i\) | \(\{j\in[n_{worker}] : w_j \text{ answered } x_i\}\) |
| \(\mathcal{T}(w_j)\) | The set of tasks answered by the worker \(j\) | \(\{i\in[n_{task}] : w_j \text{ answered } x_i\}\) |
| \(\mathcal{Lab}(x_i)\) | The vector of labels given to the task \(i\) by the workers that answered it | \((y_i^{(j)})_{j\in\mathcal{A}(x_i)}\) |
| \(y_i^*\) | The true label of the task \(i\) | \(y_i^* \in [K]\) |
| \(\hat{y}_i^{agg}\) | The label of the task \(i\) computed by the aggregation method \(agg\) | \(\hat{y}_i^{agg} \in \begin{cases} [K] & \text{if a hard label} \\ \Delta_K & \text{if a soft label} \end{cases}\) |
| \(y^{(j)}_i\) | The (hard) label that the worker \(j\) assigned to the task \(i\) | |
| \(\pi^{(j)}\) | The confusion matrix of the worker \(j\) | \(\pi^{(j)}_{k,\ell}=\mathbb{P}(y_i^{(j)}=\ell \mid y_i^*=k), \ \forall (k,\ell)\in [K]^2\) |
| \(AccTrain(\mathcal{D})\) | A metric that measures the accuracy of an aggregation strategy against the true labels | \(AccTrain(\mathcal{D}) = \frac{1}{\lvert\mathcal{D}\rvert} \sum_{i=1}^{\lvert\mathcal{D}\rvert} \mathbf{1}_{\left\{y_i^* = \operatorname*{argmax}\limits_{k\in [K]}(\hat{y}_i^{agg})_k\right\}}\) |
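To make the notation concrete, here is a minimal pure-Python sketch on a toy dataset. Everything in it is an assumption made for illustration: the `answers` layout, the helper names (`workers_of`, `tasks_of`, `soft_label`, `confusion_matrix`, `acc_train`), and the use of normalized vote counts as the soft label. Labels are 0-indexed in the code, whereas the glossary writes \([K] = \{1,\dots,K\}\). The confusion matrix here is only an empirical estimate computed from known true labels, not the probabilistic quantity itself.

```python
from collections import Counter

K = 3  # number of classes; labels are 0..K-1 in this sketch

# Hypothetical toy data: {task i: {worker j: label y_i^(j)}}
answers = {
    0: {0: 1, 1: 1, 2: 2},
    1: {0: 0, 2: 0},
    2: {1: 2, 2: 2, 3: 1},
}
# True labels y_i^*, assumed known for this toy example
true_labels = {0: 1, 1: 0, 2: 1}


def workers_of(i):
    """A(x_i): the set of workers that answered task i."""
    return set(answers[i])


def tasks_of(j):
    """T(w_j): the set of tasks answered by worker j."""
    return {i for i, votes in answers.items() if j in votes}


def soft_label(i):
    """A point of Delta_K: normalized vote counts for task i."""
    counts = Counter(answers[i].values())
    total = len(answers[i])
    return [counts[k] / total for k in range(K)]


def confusion_matrix(j):
    """Empirical estimate of pi^(j): row k is the distribution of
    worker j's answers over the tasks whose true label is k."""
    counts = [[0] * K for _ in range(K)]
    for i in tasks_of(j):
        counts[true_labels[i]][answers[i][j]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observed task are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def acc_train(soft_labels, truth):
    """Fraction of tasks whose argmax soft label equals the true label."""
    hits = 0
    for i, p in soft_labels.items():
        # argmax over the classes; ties go to the smallest index
        y_hat = max(range(K), key=lambda k: p[k])
        hits += int(y_hat == truth[i])
    return hits / len(soft_labels)


soft = {i: soft_label(i) for i in answers}
print(workers_of(0))
print(tasks_of(3))
print(confusion_matrix(2))
print(acc_train(soft, true_labels))
```

On this toy data, tasks 0 and 1 are recovered correctly while task 2 is not (two of its three workers voted for class 2, but its true label is 1), so the accuracy is 2/3.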