Dawid_Skene_clust

class Dawid_Skene_clust(answers, n_classes, L=2, **kwargs)
__init__(answers, n_classes, L=2, **kwargs)

Dawid and Skene model with clustered confusion matrices, fitted using variational inference.

Parameters:
  • answers (dict) –

    Dictionary of workers' answers, in the format

    {
        task0: {worker0: label, worker1: label},
        task1: {worker1: label}
    }
    

  • n_classes (int) – Number of possible classes

  • L (int, optional) – Number of clusters of workers, defaults to 2
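
As an illustration of the answers format described above, the following minimal sketch groups a flat list of votes into the nested dictionary. The helper name build_answers and the (task, worker, label) triples are hypothetical and are not part of the library.

```python
from collections import defaultdict


def build_answers(votes):
    """Group (task, worker, label) triples into {task: {worker: label}}."""
    answers = defaultdict(dict)
    for task, worker, label in votes:
        answers[task][worker] = label
    return dict(answers)


# Hypothetical votes: two workers label task 0, one worker labels task 1.
votes = [(0, 0, 1), (0, 1, 0), (1, 1, 1)]
answers = build_answers(votes)
```

A task only lists the workers who actually labelled it, so the inner dictionaries can have different sizes.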

get_crowd_matrix()

Compute a matrix of shape (n_task, n_workers, n_classes) that stores the votes proposed by each worker for each task
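
One plausible way to build such a matrix is sketched below. It assumes a one-hot encoding of the votes and that tasks, workers and labels are integer indices starting at 0; the function name crowd_matrix is hypothetical and the library's actual implementation may differ.

```python
import numpy as np


def crowd_matrix(answers, n_workers, n_classes):
    """One-hot encode votes into an array of shape (n_task, n_workers, n_classes)."""
    matrix = np.zeros((len(answers), n_workers, n_classes))
    for task, votes in answers.items():
        for worker, label in votes.items():
            # Assumption: ids are (convertible to) 0-based integer indices.
            matrix[int(task), int(worker), int(label)] = 1
    return matrix


answers = {0: {0: 1, 1: 0}, 1: {1: 1}}
x = crowd_matrix(answers, n_workers=2, n_classes=2)
```

Under this encoding, a worker who did not answer a task contributes an all-zero row, so missing votes need no special marker.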

initialize_parameter(x, K, L, random=True, delta=1e-10)
variational_update(x, theta, phi, rho, tau, lambda_, delta=1e-10)
hyper_parameter_update(x, theta, phi)
elbo(x, theta, phi, rho, tau, lambda_, delta=1e-10)
convergence_condition(elbo_new, elbo_old, epsilon)
one_iteration(x, K, L, epsilon=0.0001, random=False)
run(epsilon=0.0001, maxiter=100)

Run variational inference for the Dawid and Skene model with clustered workers

Parameters:
  • epsilon (float, optional) – Convergence tolerance on the difference between two successive ELBO values, defaults to 1e-4

  • maxiter (int, optional) – Maximum number of iterations, defaults to 100

Returns:

hard labels, (confusion matrices, prevalence), number of iterations

Return type:

tuple(numpy.ndarray(n_task, n_classes), tuple(numpy.ndarray(n_worker, n_classes, n_classes), numpy.ndarray(n_classes)), int)
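
The roles of epsilon and maxiter can be sketched with a generic stopping loop. This is only an illustration under assumptions: it supposes the stopping rule is an absolute difference between successive ELBO values, and run_until_converged, make_toy_step and the toy ELBO sequence are hypothetical names, not the library's code.

```python
def run_until_converged(step, epsilon=1e-4, maxiter=100):
    """Call step() until two successive ELBO values differ by less than epsilon."""
    elbo_old = float("-inf")
    for n_iter in range(1, maxiter + 1):
        elbo_new = step()
        # Assumed criterion: absolute change in the ELBO below the tolerance.
        if abs(elbo_new - elbo_old) < epsilon:
            return elbo_new, n_iter
        elbo_old = elbo_new
    # Tolerance never reached: stop at the iteration budget.
    return elbo_old, maxiter


def make_toy_step():
    """Return a step function whose ELBO is -1/2**k at the k-th call."""
    state = {"k": 0}

    def step():
        state["k"] += 1
        return -1.0 / 2 ** state["k"]

    return step


elbo, n_iter = run_until_converged(make_toy_step(), epsilon=1e-4, maxiter=100)
```

A smaller epsilon demands a flatter ELBO before stopping, while maxiter bounds the cost when the tolerance is never reached.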

get_probas()

Get soft labels distribution for each task

Returns:

Estimated soft labels for each task

Return type:

numpy.ndarray(n_task, n_classes)

get_answers()

Argmax of soft labels

Returns:

Hard labels

Return type:

numpy.ndarray(n_task)
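
The relation between soft and hard labels can be illustrated with NumPy alone. The probas array below is hypothetical example data standing in for the output of get_probas(), not a result produced by the model.

```python
import numpy as np

# Hypothetical soft labels for 3 tasks and 2 classes; each row sums to 1.
probas = np.array([[0.9, 0.1], [0.3, 0.7], [0.5, 0.5]])

# Hard label of each task: index of its most probable class.
hard_labels = np.argmax(probas, axis=1)
```

Note that np.argmax returns the first maximal index, so a tie such as the last row resolves to the lowest class index.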