GLAD¶
- class GLAD(answers, n_classes, **kwargs)¶
GLAD (Whitehill et al., 2009)¶
Each worker's ability is modeled by a single scalar, and each task's difficulty by a positive scalar. Given these coefficients, the probability of answering correctly is a sigmoid of their product.
Assumption: - Errors are uniform over the incorrect classes
Using: - One scalar per task and worker (task difficulty and worker ability)
- __init__(answers, n_classes, **kwargs)¶
The probability that worker \(j\) gives the right answer on task \(i\) is a sigmoid (denoted \(\mathrm{sig}\)) of the product of the worker ability \(\alpha_j\) and the task difficulty \(\beta_i\). Given a label \(k\in [K]\),
\[\mathbb{P}(y_i^{(j)}=k \mid y_i^\star=k, \alpha_j,\beta_i) = \mathrm{sig}(\alpha_j\beta_i) = \frac{1}{1+e^{-\alpha_j\beta_i}} \enspace.\]The following marginal likelihood is then maximized over the \(\alpha_j\) and \(\beta_i\):
\[\prod_{i\in[n_\text{task}]} \left[\sum_{k\in[K]}\mathbb{P}(y_i^\star=k)\prod_{j\in [n_\text{worker}]} \left(\frac{1}{K-1}\left(1-\mathrm{sig}(\alpha_j\beta_i)\right)\right)^{1-\mathbf{1}_{\{y_i^{(j)}=k\}}}\mathrm{sig}(\alpha_j\beta_i)^{\mathbf{1}_{\{y_i^{(j)}=k\}}}\right] \enspace.\]
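For illustration, a minimal sketch of this per-label probability (the helper name and arguments below are hypothetical, not part of the class API):

```python
import numpy as np

def label_probability(worker_label, true_label, alpha_j, beta_i, n_classes):
    """P(y_i^(j) = worker_label | y_i^* = true_label, alpha_j, beta_i) -- sketch only."""
    p_correct = 1.0 / (1.0 + np.exp(-alpha_j * beta_i))   # sig(alpha_j * beta_i)
    if worker_label == true_label:
        return p_correct
    # errors are spread uniformly over the K - 1 remaining classes
    return (1.0 - p_correct) / (n_classes - 1)
```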
- EM(epsilon, maxiter)¶
Infer the true labels, task difficulties and worker abilities
- calcLogProbL(item, *args)¶
Compute the log probability of a label given the task and worker parameters
- EStep()¶
Evaluate the posterior probability of true labels given observed labels and parameters
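Concretely, the posterior for a single task follows Bayes' rule; a minimal sketch (assuming `labels_i` maps worker indices to their labels for task `i`; none of these names belong to the class):

```python
import numpy as np

def posterior_one_task(labels_i, alpha, beta_i, prior, n_classes):
    """Posterior P(y_i^* = k | observed labels) for one task -- sketch only."""
    log_post = np.log(np.asarray(prior, dtype=float))
    for j, y_ij in labels_i.items():
        p_correct = 1.0 / (1.0 + np.exp(-alpha[j] * beta_i))
        for k in range(n_classes):
            if y_ij == k:
                log_post[k] += np.log(p_correct)
            else:
                log_post[k] += np.log((1.0 - p_correct) / (n_classes - 1))
    log_post -= log_post.max()          # numerical stability before exponentiating
    post = np.exp(log_post)
    return post / post.sum()            # normalize to a probability distribution
```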
- packX()¶
- unpackX(x)¶
- getBoundsX(alpha=(-100, 100), beta=(-100, 100))¶
- f(x)¶
Return the value of the objective function
- df(x)¶
Return the gradient vector
- MStep()¶
Maximization step: update the worker abilities and task difficulties given the current posteriors
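A minimal sketch of this step, assuming a SciPy box-constrained optimizer and hypothetical callables `neg_Q` and `neg_grad_Q` that take the packed parameter vector and return \(-Q\) and its gradient (playing the roles of `f` and `df` above):

```python
import numpy as np
from scipy.optimize import minimize

def m_step(alpha, beta, neg_Q, neg_grad_Q,
           alpha_bounds=(-100, 100), beta_bounds=(-100, 100)):
    """Update (alpha, beta) by minimizing -Q under box constraints -- sketch only."""
    x0 = np.concatenate([alpha, beta])                    # pack parameters (cf. packX)
    bounds = [alpha_bounds] * len(alpha) + [beta_bounds] * len(beta)
    res = minimize(neg_Q, x0, jac=neg_grad_Q, bounds=bounds, method="L-BFGS-B")
    return res.x[:len(alpha)], res.x[len(alpha):]         # unpack (cf. unpackX)
```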
- computeQ()¶
Calculate the expectation of the joint log-likelihood (the EM objective \(Q\))
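Under the model above, this quantity takes the usual EM form (sketched here, with \(T_{ik}\) denoting the posterior of \(\{y_i^\star=k\}\) computed in the E-step, and the inner sum restricted to the workers who actually answered task \(i\)):
\[Q(\alpha,\beta) = \sum_{i\in[n_\text{task}]}\sum_{k\in[K]} T_{ik}\left(\log\mathbb{P}(y_i^\star=k) + \sum_{j\in[n_\text{worker}]}\log\mathbb{P}(y_i^{(j)}\mid y_i^\star=k,\alpha_j,\beta_i)\right) \enspace.\]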
- dAlpha(item, *args)¶
Compute the derivative of the objective function with respect to the worker ability
- dBeta(item, *args)¶
Compute the derivative of the objective function with respect to the task difficulty
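As a sketch of these derivatives, differentiating \(Q\) with the posteriors \(T_{ik}\) held fixed gives
\[\frac{\partial Q}{\partial \alpha_j} = \sum_{i} \beta_i\left(T_{i,y_i^{(j)}} - \mathrm{sig}(\alpha_j\beta_i)\right), \qquad \frac{\partial Q}{\partial \beta_i} = \sum_{j} \alpha_j\left(T_{i,y_i^{(j)}} - \mathrm{sig}(\alpha_j\beta_i)\right) \enspace,\]
where \(T_{i,y_i^{(j)}}\) is the posterior probability that the true label of task \(i\) equals worker \(j\)'s answer, and the sums run over the observed (task, worker) pairs.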
- gradientQ()¶
- run(epsilon=1e-05, maxiter=50)¶
Run the label aggregation via the EM algorithm
- get_probas()¶
Get the soft label distribution for each task
- Returns:
Soft labels
- Return type:
numpy.ndarray(n_task, n_classes)
- get_answers()¶
Argmax of soft labels.
- Returns:
Hard labels
- Return type:
numpy.ndarray(n_task)
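Putting it together, a hedged usage sketch: the import path and the answers format (a dictionary mapping each task to its workers' labels) are assumptions to adapt to your installation; the constructor and method signatures are the ones documented above.

```python
# Assumed import path; replace with the actual module in your installation.
from peerannot.models import GLAD

# Toy crowdsourced answers: task index -> {worker index: label} (assumed format)
answers = {
    0: {0: 1, 1: 1, 2: 0},
    1: {0: 0, 1: 0, 2: 0},
    2: {1: 1, 2: 1},
}

glad = GLAD(answers, n_classes=2)
glad.run(epsilon=1e-5, maxiter=50)   # EM until convergence or maxiter
hard_labels = glad.get_answers()     # argmax of the soft labels, one per task
soft_labels = glad.get_probas()      # shape (n_task, n_classes)
```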