FREDA

Fast Relation Extraction Data Annotation

See our paper on arXiv (accepted for the Knowledge and Natural Language Processing track at ACM SAC 2023): https://arxiv.org/abs/2204.07150

FREDA can be used to manually annotate sentences quickly and accurately. It also provides a simple procedure for acquiring sentences from a partially annotated Wikipedia-based corpus, so that datasets for new relations can be created.

FREDA can also be used to annotate datasets for Named Entity Recognition, Co-reference Resolution and Entity Linking.

The current database (database/main.tar.xz) contains at least 500 annotated sentences for 19 relations. Four more relations have been added but have no annotations so far.

Android app

The app is available in the Play Store: https://play.google.com/store/apps/details?id=ca.freda.relation_annotator

Email mstrobl@ualberta.ca for an account.

Acknowledgements

We would like to thank all the data annotators for their hard work towards creating these datasets.

Evaluation

Please see Michael Strobl's PhD thesis (link will be added once it's published) for an evaluation against the open-source system BRAT (https://brat.nlplab.org/). Here are links to the videos the evaluation is based on:

BRAT RE: https://drive.google.com/file/d/1q5MKxxk5kSgVGn_VDt6Fif6HWHoMvkFL/view?usp=share_link

BRAT CR: https://drive.google.com/file/d/16Vi2m-Nhz-2MZXhZYFb9Xv2ppfb453Eu/view?usp=share_link

BRAT NER: https://drive.google.com/file/d/1h9Y2R2F05mF6ZitQRVHX3eDw3uiBZf4d/view?usp=share_link

FREDA RE: https://drive.google.com/file/d/1vs6VIssuYI98NeT3k25dgxvyGY50NtHS/view?usp=share_link

FREDA CR: https://drive.google.com/file/d/1vzVaXbluN_ixU5ELa7h_SDdqjsRmneJy/view?usp=share_link

FREDA NER: https://drive.google.com/file/d/1w0_wLld92Hw82tdF90VIUaq0xOuerKSq/view?usp=share_link

A stopwatch was added to each video as a sanity check. For the annotations conducted on the Android device, the clock can be seen at the top of the screen.

Get Started

(In offline mode, i.e. demo mode, the remaining steps can be skipped, apart from installing the Android app on your device or an emulator. Select the task to annotate; examples are provided. This mode is for demonstration purposes only, and annotations are not stored.)

Decompress the database archive in database/:

tar xf database/main.tar.xz

Print the relations and the number of sentences with yes/no responses in the current database:

python server/database_statistics.py

Configuration

Server config

In config/config.json:

If the database path has changed, update the database value accordingly.
Replace port and ip with your desired port and the IP address of your machine.
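
Before starting the server, the edited config can be sanity-checked. The following is a minimal sketch, assuming config/config.json is a flat JSON object containing the database, port and ip values mentioned above. The function name load_config and the validation rules are illustrative and not part of FREDA.

```python
import ipaddress
import json
from pathlib import Path


def load_config(path="config/config.json"):
    """Load the server config and check the three values described above.

    Assumes a flat JSON object with the keys "database", "port" and "ip";
    any further keys in the real file are passed through unchanged.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))

    missing = [key for key in ("database", "port", "ip") if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")

    port = int(config["port"])
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")

    # Raises ValueError if the value is not a valid IPv4/IPv6 address.
    ipaddress.ip_address(config["ip"])

    return config
```

Calling load_config() from the repository root either returns the parsed config or raises an error that names the offending value.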

Android application config

Replace SERVERPORT and SERVERIP in application/app/src/main/java/ca/freda/relation_annotator/handler/ClientHandler.java with your values.

Start Annotations for Relation Extraction

The server must be running while the app is in use.

Server

Start server: python server_v1/main.py
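
If the app cannot connect, it can help to check from another machine on the same network whether the server port is reachable at all. This is a minimal sketch, assuming the server accepts plain TCP connections on the configured ip and port; the helper name server_is_reachable is hypothetical and does not speak FREDA's actual protocol.

```python
import socket


def server_is_reachable(ip, port, timeout=2.0):
    """Return True if a TCP connection to (ip, port) can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout or an unreachable host.
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, server_is_reachable(ip, port) with the values from config/config.json should return True once the server has started. A False result usually points to a wrong address, a closed port or a firewall.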

Android Application

Download "Android Studio" from https://developer.android.com/studio.
Open project (application/ subdirectory) in Android Studio.
Create Emulator in AVD Manager (e.g. Samsung A10).
Start app.

Create Data for New Relations

These modules create new relations and extract sentences from WEXEA based on keywords and on distant supervision using SPARQL queries and DBpedia. If a relation does not exist in DBpedia, a keyword-only approach can be used.

CoreNLP Server

The CoreNLP server is used to find dates in text.

Please download the CoreNLP tool from: https://stanfordnlp.github.io/CoreNLP/download.html

Follow the instructions to start the server: https://stanfordnlp.github.io/CoreNLP/corenlp-server.html
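
To illustrate how dates can be obtained from a running CoreNLP server, here is a minimal sketch. It assumes the server listens on its default address http://localhost:9000 and returns JSON with a per-token ner field. The function names annotate and extract_dates are illustrative, and this is not necessarily how FREDA's own scripts call the server.

```python
import json
import urllib.parse
import urllib.request


def annotate(text, server_url="http://localhost:9000"):
    """Send text to a running CoreNLP server and return its JSON response."""
    properties = {"annotators": "tokenize,ssplit,pos,lemma,ner", "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    request = urllib.request.Request(
        f"{server_url}/?{query}", data=text.encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_dates(annotation):
    """Join consecutive tokens tagged DATE into space-separated date strings."""
    dates = []
    for sentence in annotation.get("sentences", []):
        current = []
        for token in sentence.get("tokens", []):
            if token.get("ner") == "DATE":
                current.append(token["word"])
            elif current:
                # A non-DATE token ends the current date span.
                dates.append(" ".join(current))
                current = []
        if current:
            dates.append(" ".join(current))
    return dates
```

With the server running, extract_dates(annotate("She arrived in May 2019.")) returns the date spans found in the sentence.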

Keywords

Modify database/keywords.json and add a new tuple with the relation name, info (presented in the app to make sure subject and object are annotated accordingly), the direction of the relation and a list of keywords.
Decompress database/wexea.tar.xz (contains 10,000 articles; it should be replaced with the original dataset for a production system).
Run python database/sentence_extractor_keywords.py (will take around 12h or longer, depending on how many keywords are used)
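
The idea behind keyword-based extraction can be sketched in a few lines. This is a simplified illustration, not the logic of sentence_extractor_keywords.py: the function names and the example keywords are hypothetical, and whole-word, case-insensitive matching is an assumption.

```python
import re


def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern matching any keyword as a whole word or phrase."""
    # Longer keywords first, so multi-word phrases win over their prefixes.
    alternatives = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def select_sentences(sentences, keywords):
    """Return the sentences that contain at least one keyword."""
    if not keywords:
        return []
    pattern = build_keyword_pattern(keywords)
    return [s for s in sentences if pattern.search(s)]
```

For a hypothetical spouse relation, select_sentences(sentences, ["married to", "husband"]) keeps only sentences that mention one of the two keywords. These are the candidates that annotators would then confirm or reject in the app.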

Distant Supervision

Create DBpedia RDF store

Download and decompress DBpedia infobox properties: https://downloads.dbpedia.org/repo/dbpedia/generic/infobox-properties/2020.12.01/infobox-properties_lang=en.ttl.bz2
Download Apache Jena (we used version 3.17.0, the latest at the time): https://jena.apache.org/download/index.cgi
Set Jena home: export JENA_HOME=<PATH TO apache-jena-3.17.0/>
Create RDF store: tdbloader2 --loc <LOCATION OF OUTPUT RDF STORE> infobox-properties_lang\=en.ttl
Download Jena JDBC driver: https://search.maven.org/artifact/org.apache.jena/jena-jdbc-driver-bundle/3.17.0/jar
Add jena-jdbc-driver-bundle-3.17.0.jar to Java CLASSPATH.

Extract sentences

Modify database/queries.json and add a list of SPARQL queries for each relation, where applicable. Make sure that each relation added here already exists in the database (created by running the keyword extractor described above). Note that this step deletes all previous annotations for these relations, so it makes sense to create relations with keywords first and run this step right afterwards.
Run python database/sentence_extractor_distant_supervision.py

It is possible that some SPARQL queries are either too generic or too specific. They can be tested beforehand, e.g. on https://yasgui.triply.cc/.
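
The distant supervision step can be pictured as follows: entity pairs known to stand in a relation (for example, rows returned by a SPARQL query) are matched against sentences that mention both entities. This is a minimal sketch of that idea under assumed data structures; the function name and the sentence format are hypothetical and do not reflect the internals of sentence_extractor_distant_supervision.py.

```python
def distant_supervision_candidates(sentences, known_pairs):
    """Return (sentence text, subject, object) for each known pair mentioned together.

    sentences: iterable of dicts with "text" and "entities" (names of the
        entities linked in that sentence); this format is an assumption.
    known_pairs: iterable of (subject, object) tuples, e.g. SPARQL result rows.
    """
    pairs = sorted(set(known_pairs))
    candidates = []
    for sentence in sentences:
        entities = set(sentence["entities"])
        for subject, obj in pairs:
            # Keep the sentence only if both arguments of the relation occur in it.
            if subject in entities and obj in entities:
                candidates.append((sentence["text"], subject, obj))
    return candidates
```

Sentences selected this way are only candidates: mentioning both entities does not guarantee that the sentence expresses the relation, which is why manual annotation follows.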
