`string_match_all` only evaluates recall but not precision

I may be missing something, but `string_match_all` appears to evaluate only *recall*, not *precision*.
In the multi-key setting, a model can therefore achieve a perfect score by outputting every candidate value on every query.

For example, the following returns a score of 100%:
```python
def string_match_all(preds, refs):
    score = sum([sum([1.0 if r.lower() in pred.lower() else 0.0 for r in ref]) / len(ref) for pred, ref in zip(preds, refs)]) / len(preds) * 100
    return round(score, 2)

# Every prediction lists the whole alphabet, regardless of what was asked.
preds = [
    "a b c d e f g h i j k l m n o p q r s t u v w x y z",
    "a b c d e f g h i j k l m n o p q r s t u v w x y z",
    "a b c d e f g h i j k l m n o p q r s t u v w x y z",
]
# Each query expects only three values.
refs = [
    ["a", "b", "c"],
    ["x", "y", "z"],
    ["m", "n", "o"],
]

print(string_match_all(preds, refs))  # 100.0
```

Shouldn't this metric balance recall and precision? Reporting either F1 score or exact match seems more appropriate, but perhaps I'm missing something.
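To make the suggestion concrete, here is a rough sketch of what an F1-style variant could look like. The name `string_match_f1` is hypothetical and not part of the repository. It also assumes that predictions can be split on whitespace and that each reference value is a single token, which differs from the substring matching used by `string_match_all` and would need adjusting for multi-word values.

```python
def string_match_f1(preds, refs):
    """Hypothetical F1 variant: compares whitespace tokens, not substrings."""
    scores = []
    for pred, ref in zip(preds, refs):
        pred_tokens = set(pred.lower().split())
        ref_tokens = {r.lower() for r in ref}
        hits = len(pred_tokens & ref_tokens)
        if hits == 0:
            scores.append(0.0)
            continue
        precision = hits / len(pred_tokens)  # share of output that was asked for
        recall = hits / len(ref_tokens)      # share of references that were found
        scores.append(2 * precision * recall / (precision + recall))
    return round(sum(scores) / len(scores) * 100, 2)


# Same "output everything" strategy as in the example above.
preds = ["a b c d e f g h i j k l m n o p q r s t u v w x y z"] * 3
refs = [["a", "b", "c"], ["x", "y", "z"], ["m", "n", "o"]]

print(string_match_f1(preds, refs))                   # 20.69
print(string_match_f1(["a b c"], [["a", "b", "c"]]))  # 100.0
```

Under these assumptions, the "output everything" strategy no longer gets a perfect score, because each prediction has perfect recall but a precision of only 3/26.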

Thanks!

`string_match_all` only evaluates recall but not precision #95
