Glossary

Name	Definition	Mathematical Definition
\(n_{task}\)	The total number of tasks in a dataset
\(n_{worker}\)	The total number of workers in a dataset
\([K]\)	The set of labels a task can take	\([K] = \{1,...,K\}\)
\(\Delta_K\)	The simplex of dimension \(K-1\), used to represent soft labels (i.e. labels given as a probability vector over \([K]\))	\(\Delta_K = \{ p \in \mathbb{R}^K : \sum_{k=1}^K p_k=1, p_k \geq 0 \}\)
\(\mathcal{A(x_i)}\)	The set of workers that answered the task \(i\)	\(\{j\in[n_{worker}] : w_j \text{ answered } x_i\}\)
\(\mathcal{T(w_j)}\)	The set of tasks answered by the worker \(j\)	\(\{i\in[n_{task}] : w_j \text{ answered } x_i\}\)
\(\mathcal{Lab(x_i)}\)	The vector of answered labels of the task \(i\)	\((y_i^{(j)})_{j\in\mathcal{A(x_i)}}\)
\(y_i^*\)	The true label of the task \(i\)	\(y_i^* \in [K]\)
\(\hat{y}_i^{agg}\)	The label computed for the task \(i\) by the aggregation method \(agg\)	\(\begin{cases}\hat{y}_i^{agg} \in [K] & \text{if a hard label} \\ \hat{y}_i^{agg} \in \Delta_K & \text{if a soft label} \end{cases}\)
\(y^{(j)}_i\)	The hard label that the worker \(j\) assigned to the task \(i\)	\(y_i^{(j)} \in [K]\)
\(\pi^{(j)}\)	The confusion matrix of the worker \(j\)	\(\pi^{(j)}_{k,\ell}=\mathbb{P}(y_i^{(j)}=\ell \mid y_i^*=k), \, \forall (k,\ell)\in [K]^2\)
\(AccTrain(\mathcal{D})\)	A metric that measures the accuracy of an aggregation strategy: the fraction of tasks in \(\mathcal{D}\) whose aggregated label matches the true label	\(AccTrain(\mathcal{D}) = \frac{1}{|\mathcal{D}|} \sum_{i=1}^{|\mathcal{D}|} \mathbf{1}_{\Big\{y_i^* = \operatorname*{argmax}\limits_{k\in [K]}(\hat{y}_i^{agg})_k\Big\}}\)
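
To make the notation above concrete, here is a minimal Python sketch on a made-up toy dataset. Everything in it is an assumption for illustration: the `answers` dictionary layout, the helper names (`workers_of_task`, `tasks_of_worker`, `soft_labels`, `acc_train`), and the use of normalized vote frequencies as the aggregation method. Labels are 0-indexed here, so \([K]\) corresponds to `range(K)`. It is not the library's implementation.

```python
from collections import Counter

# Hypothetical toy data: answers[i][j] is the hard label y_i^{(j)}
# that worker j assigned to task i (labels are 0-indexed here).
answers = {
    0: {0: 1, 1: 1, 2: 0},
    1: {0: 2, 2: 2},
    2: {1: 0, 2: 1, 3: 0},
}
true_labels = {0: 1, 1: 2, 2: 1}  # y_i^*, assumed known for this toy example
K = 3


def workers_of_task(answers, i):
    """A(x_i): the set of workers that answered task i."""
    return set(answers[i])


def tasks_of_worker(answers, j):
    """T(w_j): the set of tasks answered by worker j."""
    return {i for i, votes in answers.items() if j in votes}


def soft_labels(answers, K):
    """Normalized vote frequencies: one point of the simplex Delta_K per task."""
    soft = {}
    for i, votes in answers.items():
        counts = Counter(votes.values())
        soft[i] = [counts[k] / len(votes) for k in range(K)]
    return soft


def acc_train(true_labels, soft):
    """Fraction of tasks whose argmax soft label equals the true label.

    Ties in the argmax are broken in favor of the smallest label index.
    """
    hits = sum(
        1
        for i, p in soft.items()
        if max(range(len(p)), key=p.__getitem__) == true_labels[i]
    )
    return hits / len(soft)


soft = soft_labels(answers, K)
accuracy = acc_train(true_labels, soft)
```

In this toy example the aggregated label matches the true label for tasks 0 and 1 but not for task 2, so `accuracy` is 2/3. Each entry of `soft` is non-negative and sums to 1, which is exactly the membership condition for \(\Delta_K\).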