Commit 8b39406

Update README.md
1 parent be4411b commit 8b39406

1 file changed: +66 -22 lines changed

README.md

Lines changed: 66 additions & 22 deletions
@@ -3,6 +3,8 @@
33
Converts RDF knowledge graphs to a [Gephi](https://gephi.org/) GEXF file that can be opened in Gephi. GEXF stands for [Graph Exchange XML Format](https://gexf.net/).
44
Supports single RDF file, multiple files in a folder, or remote SPARQL endpoint URL. Can work either in a _"direct and simple conversion"_ mode, turning triples into edges, or using a set of SPARQL queries to define exactly the scope and structure of the nodes and edges that should appear in the Gexf file.
55

6+
Supports attributes on nodes and edges, and supports the generation of dynamic graphs, with a start and end date on each node and edge.
7+
68
## How to run
79

810
1. Make sure you have Java installed
@@ -39,17 +41,19 @@ The full options of the command are:
3941
Usage: direct [options]
4042
Options:
4143
-e, --endDateProperty
42-
URI of the property in the knowledge grapg holding the end date of
44+
URI of the property in the knowledge graph holding the end date of
4345
entities
4446
* -i, --input
4547
Path to RDF input file, or directory containing RDF files, or URL
4648
of a SPARQL endpoint.
4749
* -o, --output
4850
Path to GEXF output file
4951
-s, --startDateProperty
50-
URI of the property holding the start date of entities
52+
URI of the property in the knowledge graph holding the start date
53+
of entities
5154
-w, --weight
5255
Path to a properties file associating properties to weights
56+
5357
```
5458

5559
### SPARQL-based conversion (preferred)
@@ -75,31 +79,37 @@ The full options of the command are:
7579
Path to the file containing the SPARQL query to retrieve
7680
attributes, e.g. 'sparql/attribute.rq'. The query MUST return 3
7781
columns: the first one is the subject, the second one is the
78-
attribute URI, the third one is the attribute value.
82+
attribute URI, the third one is the attribute value (a literal or
83+
a URI).
7984
-d, --dates
8085
Path to the file containing the SPARQL query to retrieve date
8186
ranges, e.g. 'sparql/dates.rq'
82-
* -e, --edges
87+
-e, --edges
8388
Path to the file containing the SPARQL query to retrieve edges,
8489
e.g. 'sparql/edges.rq'. The query MUST return the following
8590
variables: ?subject, ?edge, ?object
8691
* -i, --input
87-
Path to RDF input file, or directory containing RDF files, or URL
88-
of a SPARQL endpoint.
92+
Path to RDF input file(s), or directory containing RDF files, or
93+
URL of a SPARQL endpoint.
8994
-l, --labels
9095
Path to the file containing the SPARQL query to retrieve labels,
9196
e.g. 'sparql/labels.rq'. The query MUST return the following
9297
variables: ?subject, ?label
9398
* -o, --output
9499
Path to GEXF output file
100+
-p, --parents
101+
Deprecated. Path to the file containing the SPARQL query to
102+
retrieve parents relationship, e.g. 'sparql/parents.rq'
95103
```
96104

97-
*/!\ Attention :* the provided queries MUST follow the rules below:
105+
All queries are optional, and default queries are used if not provided. See below.
98106

99-
#### edges query
107+
*/!\ Attention :* the provided queries MUST follow the rules below:
108+
109+
#### --edges / -e query
100110

101111
This query defines the graph structure.
102-
The edges query MUST return the 3 variables: `?subject`, `?edge`, `?object`.
112+
The edges query MUST return the 3 variables: `?subject`, `?edge`, `?object`. _Optionally_, the query CAN also return the variables `?start` and `?end`, which will be interpreted as the dates of the edge in the GEXF graph.
103113

104114
An example of such query is:
105115

@@ -115,9 +125,18 @@ WHERE {
115125
}
116126
```
117127

118-
This query is mandatory.
128+
If not provided, the following query is used:
119129

120-
#### labels query
130+
```sparql
131+
# Default edges query
132+
# Selects all the triples in the graph
133+
SELECT ?subject ?edge ?object
134+
WHERE {
135+
?subject ?edge ?object .
136+
}
137+
```
138+
139+
#### --labels / -l query
121140

122141
This query returns the labels of each node in the graph. Typically from an `rdfs:label`, `skos:prefLabel`, or anything.
123142
The labels query MUST use the `?subject` variable to hold the node in the graph, and MUST return the 2 variables `?subject` and `?label`.
@@ -153,7 +172,7 @@ WHERE {
153172
}
154173
```
155174

156-
#### attributes query
175+
#### --attributes / -a query
157176

158177
This query returns the attributes of each node in the graph. Typically the value of `rdf:type`, and other attributes.
159178
The attributes query MUST use the `?subject` variable to hold the node in the graph, and MUST return 3 variables : `?subject`, `?attribute` as the attribute type, and `?value` as the attribute value (a URI or a literal).
@@ -176,33 +195,57 @@ This query is optional. If not provided, the following query is used:
176195
# Default attributes query
177196
# Selects the rdf:type value and any other property pointing to a skos:Concept
178197
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
179-
PREFIX org: <http://www.w3.org/ns/org#>
180-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
181-
PREFIX epvoc: <https://data.europarl.europa.eu/def/epvoc#>
182198
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
183199
SELECT ?subject ?attribute ?value
184200
WHERE {
201+
# The rdf:type is always an attribute
185202
{
186203
?subject a ?value .
187204
BIND(rdf:type AS ?attribute)
188205
}
206+
# Everything that is a skos:Concept is an attribute by default
189207
UNION
190208
{
191-
?subject ?attribute ?value .
192-
?value a skos:Concept .
209+
?subject ?attribute ?concept .
210+
?concept a skos:Concept .
193211
}
212+
}
213+
```
214+
215+
216+
#### --dates / -d query
217+
218+
This query returns the start and end dates that will be associated with each node in the graph. For edges, the start and end dates can be provided in the `--edges` query.
219+
The dates query MUST use the `?subject` variable to hold the node in the graph, and MUST return the `?start` and/or `?end` variables. If only one of them is returned, the corresponding node will have no date for the other.
220+
221+
An example of such query is:
222+
223+
```sparql
224+
PREFIX rico: <https://www.ica.org/standards/RiC/ontology#>
225+
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
226+
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
227+
SELECT ?subject ?start ?end
228+
WHERE {
229+
?subject a ?type .
230+
OPTIONAL { ?subject rico:beginningDate ?start . }
231+
OPTIONAL { ?subject rico:endDate ?end . }
194232
}
195233
```
196234

235+
This query is optional. If not provided, no start or end date will be associated with the nodes.
236+
237+
#### --parents / -p query
197238

198-
#### dates query
239+
**This is discouraged** since Gephi no longer supports hierarchical graphs. It could still be useful for other tools, or with older versions of Gephi.
240+
The parents query MUST use the `?subject` variable to hold the node in the graph, and MUST return a `?parent` variable that will hold the parent of that node in the graph.
241+
This query is optional. If not provided, no default query is used and nodes will not have a parent in the graph.
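As an illustration only, a parents query could look like the sketch below. It assumes the parent relationship in your data is expressed with `skos:broader`; this is a hypothetical choice, and any property linking a node to its parent could be used instead.

```sparql
# Hypothetical parents query (not a default shipped with the tool)
# Assumes skos:broader links each node to its parent
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?subject ?parent
WHERE {
  ?subject skos:broader ?parent .
}
```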
199242

200-
TODO
201243

202244
## Support for dynamic graphs
203245

204-
rdf2gephi supports the creation of dynamic graphs where we can see the evolution of the graph over time.
205-
TODO
246+
rdf2gephi supports the creation of dynamic graphs where we can see the evolution of the graph over time. For this:
247+
1. To associate dates with edges : in the `--edges` query, return the `?start` and `?end` variables
248+
2. To associate dates with nodes : provide a `--dates` query
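
As a sketch of the first point, an `--edges` query that also returns dates might look like the following. The `ex:` namespace and the properties `ex:member`, `ex:organization`, `ex:memberOf`, `ex:startDate` and `ex:endDate` are hypothetical and stand in for whatever your knowledge graph uses; the dates are made optional here so that edges without dates are still returned.

```sparql
# Hypothetical dated edges query: each membership resource becomes one edge
# All ex: terms are assumptions, to be replaced with your own vocabulary
PREFIX ex: <http://example.org/ns#>
SELECT ?subject ?edge ?object ?start ?end
WHERE {
  ?membership ex:member ?subject ;
              ex:organization ?object .
  # Use a fixed property as the edge type
  BIND(ex:memberOf AS ?edge)
  # Dates are read from the intermediate membership resource
  OPTIONAL { ?membership ex:startDate ?start . }
  OPTIONAL { ?membership ex:endDate ?end . }
}
```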
206249

207250
## Typical actions in Gephi to view your RDF graph
208251

@@ -211,7 +254,8 @@ TODO
211254
3. Size the nodes based on (incoming or outgoing) degree : Appearance > Size icon > Ranking > Degree
212255
4. Print labels only of biggest nodes : Filter > Topology > Degree Range > drag and drop to Queries below > set the parameters. Then click on filter. Then click on icon above "hide node/edges labels if not in filtered graph"
213256
5. Click on "Show node labels" button
214-
6. Go in "Preview" tab, regenerate the preview, export as SVG/PNG/PDF
257+
6. You could also apply a clustering algorithm : go to Statistics > Community detection > Modularity. Then apply node color following the modularity attribute.
258+
7. Go in "Preview" tab, regenerate the preview, export as SVG/PNG/PDF
215259

216260
This is illustrated in the screencast below:
217261
