Naming bug for `head_arc` and `child_arc` weights/representations #27

This is for future reference, for anyone working off of this model's edge score outputs. 

There is a semantic naming bug in the dependency decoder: `head_arc_representation` and `child_arc_representation`, along with the weights that compute them, are swapped.

The code that computes the arc logits in `dependency_decoder.py` is:
```python
head_arc_representation = self._dropout(self.head_arc_feedforward(encoded_text))
child_arc_representation = self._dropout(self.child_arc_feedforward(encoded_text))
attended_arcs = self.arc_attention(head_arc_representation, child_arc_representation)
```

This _looks_ as though `attended_arcs` (shape `(batch_size, sent_len, sent_len)`) is a tensor of scores in which `attended_arcs[b, i, j]` is the score for an arc _from i to j_ (`head->child`). The rest of the code, however, uses this tensor as if it were `child->head`, so the scores are in effect transposed relative to what the names suggest. This also implies that the weights `head_arc_feedforward` and `child_arc_feedforward` are named backwards.
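To make the index convention concrete, here is a minimal pure-Python sketch. It assumes the attention is a plain bilinear form, `scores[i][j] = first[i]^T W second[j]`; the issue does not say which attention module is actually used, and `bilinear_scores` and the toy values are hypothetical.

```python
def bilinear_scores(first, second, weight):
    """Return scores[i][j] = first[i]^T @ weight @ second[j] for one sentence."""
    scores = []
    for x in first:
        # x^T W: one entry per column of the weight matrix
        xw = [
            sum(x[k] * weight[k][m] for k in range(len(x)))
            for m in range(len(weight[0]))
        ]
        # Dot the projected row vector with every vector of the second argument
        scores.append([sum(xw[m] * y[m] for m in range(len(y))) for y in second])
    return scores


# Toy inputs: two tokens, two hidden dimensions, identity weight (all hypothetical)
first = [[1.0, 0.0], [0.0, 1.0]]
second = [[1.0, 2.0], [3.0, 4.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]

scores = bilinear_scores(first, second, weight)
print(scores)  # [[1.0, 3.0], [2.0, 4.0]]
```

Under this assumption the row index `i` follows the first argument and the column index `j` follows the second. So if the downstream code reads rows as children and columns as candidate heads, the tensor passed first (here named `head_arc_representation`) is in effect playing the child role.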

This does not affect udify's own performance, since the code uses the tensor consistently as `child->head`. But if you use the model's arc scores for something else, you need to transpose them to get actual `head->child` scores.
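The transposition can be sketched as follows, using nested lists as a stand-in for the `(batch_size, sent_len, sent_len)` tensor. The helper names `to_head_child` and `greedy_heads` and the toy scores are hypothetical, and the greedy argmax is only an illustration of reading rows as children, not necessarily how udify decodes.

```python
from typing import List

Scores = List[List[List[float]]]  # indexed as [batch][row][col]


def to_head_child(child_head_scores: Scores) -> Scores:
    """Swap the last two axes: result[b][h][c] == child_head_scores[b][c][h]."""
    return [[list(col) for col in zip(*sentence)] for sentence in child_head_scores]


def greedy_heads(child_head_scores: Scores) -> List[List[int]]:
    """For each child (row), pick the highest-scoring head (column)."""
    return [
        [max(range(len(row)), key=row.__getitem__) for row in sentence]
        for sentence in child_head_scores
    ]


# One 3-token sentence; child_head[b][i][j] is read as "token j is the head of token i"
child_head = [[
    [0.0, 0.9, 0.1],
    [0.2, 0.0, 0.8],
    [0.7, 0.3, 0.0],
]]

head_child = to_head_child(child_head)
print(head_child[0][1][0])       # 0.9: arc from head 1 to child 0
print(greedy_heads(child_head))  # [[1, 2, 0]]
```

For an actual PyTorch tensor, the equivalent operation would be `attended_arcs.transpose(1, 2)`.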
