Already labeled data is labeled again in every iteration in "Growing Datasets" #9

@Alx-Wo

Hi, I'm just doing this course out of personal interest. In

def passive_selection(x, labeled, label_func, n):
    candidates = set(range(0, len(x))) - set(labeled)
    labeled = np.concatenate([labeled, random.sample(list(candidates), n)])
    labels = [label_func(example) for example in x[labeled]]
    return labeled, labels

and

def active_selection(x, labeled, label_func, n):
    labels = [label_func(example) for example in x[labeled]]
    candidates = set(np.arange(len(x))) - set(labeled)
    # YOUR CODE HERE
    pass

Both functions apply label_func to every sample in labeled, so already labeled data is re-labeled in every iteration. Is there a reason for re-labeling already labeled examples each time? It does not really matter here since label_func is O(1), but I assume this would be very bad in practice, where labeling is expensive?
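To illustrate what I mean, here is a rough sketch of how the passive variant could label only the newly selected examples. This is not the course's code: the name passive_selection_incremental and the extra labels argument (the labels already obtained for labeled) are my own assumptions, and the call counter is only there to show how often label_func runs.

```python
import random
import numpy as np

def passive_selection_incremental(x, labeled, labels, label_func, n):
    """Pick n new random examples and label only those.

    `labels` holds the labels already obtained for `labeled`; it is a
    hypothetical extra argument, not part of the original signature.
    """
    labeled = np.asarray(labeled, dtype=int)
    # Indices that have not been labeled yet.
    candidates = sorted(set(range(len(x))) - set(labeled.tolist()))
    new = np.array(random.sample(candidates, n), dtype=int)
    # Call label_func on the new examples only, then extend the cache.
    new_labels = [label_func(example) for example in x[new]]
    return np.concatenate([labeled, new]), list(labels) + new_labels

# Toy data and a label function that records every call.
x = np.arange(20)
calls = []

def label_func(example):
    calls.append(int(example))
    return int(example) % 2

labeled = np.array([], dtype=int)
labels = []
for _ in range(3):
    labeled, labels = passive_selection_incremental(x, labeled, labels, label_func, 4)

print(len(calls))  # one call per selected example
```

With three rounds of four examples, label_func is called once per selected example, whereas the version above would call it again on the whole labeled set in every round.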
