Continuing the investigations from #198 and #222, we should find a better way to identify false-positive boreholes (document pages that contain similar material descriptions but are not themselves boreholes). For example: 35613_22.pdf
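In this issue we should look into whether the absence of verbs in the text can serve as an indication that the page in question is a borehole. Borehole profiles tend to consist of short material descriptions without verbs, whereas running prose (reports, explanatory text) normally contains them.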
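To make the idea concrete, here is a rough sketch of what a verb-based check could look like. It is not taken from the existing codebase: the function names, the tiny English verb list, and the `max_verb_ratio` threshold are all placeholders invented for illustration. A real implementation would more likely use a part-of-speech tagger for the languages of the documents, and the threshold would need tuning against the known false positives.

```python
import re

# Hypothetical, deliberately tiny verb lexicon used only for illustration.
# A real implementation would more likely rely on a part-of-speech tagger
# for the actual document languages.
EXAMPLE_VERBS = frozenset(
    {"is", "are", "was", "were", "has", "have", "shows", "contains", "indicates"}
)


def verb_ratio(text, verbs=EXAMPLE_VERBS):
    """Return the fraction of word tokens in `text` that appear in `verbs`."""
    # Keep runs of letters only (Unicode-aware); digits and punctuation are dropped.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        # No words at all: report a ratio of 0.0 rather than dividing by zero.
        return 0.0
    return sum(token in verbs for token in tokens) / len(tokens)


def looks_like_borehole(text, max_verb_ratio=0.05):
    """Heuristic: treat text with (almost) no verbs as a borehole candidate.

    `max_verb_ratio` is an assumed placeholder value, not a tuned threshold.
    """
    return verb_ratio(text) <= max_verb_ratio
```

With this sketch, a page made up of entries like "Sand, fine, grey, silty" would pass the check, while a paragraph of report-style prose would be rejected. Note that a page with no text at all also passes, so the check would only make sense as an additional filter on pages where data was already extracted.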
As an example, we should aim to eliminate false positives similar to the file 45004_8.pdf.
Pages from which data is extracted but which aren't actual borehole profiles are available on the S3 bucket under false_positive_boreprofiles.