README.md
+66 -22 (66 additions & 22 deletions)
Original file line number
Diff line number
Diff line change
@@ -3,6 +3,8 @@
3
3
Converts RDF knowledge graphs to a [Gephi](https://gephi.org/) GEXF file that can be opened in Gephi. GEXF stands for [Graph Exchange XML Format](https://gexf.net/).
4
4
Supports a single RDF file, multiple files in a folder, or a remote SPARQL endpoint URL. Can work either in a _"direct and simple conversion"_ mode, turning triples into edges, or using a set of SPARQL queries to define exactly the scope and structure of the nodes and edges that should appear in the GEXF file.
5
5
6
+
Supports attributes on nodes and edges, and supports the generation of dynamic graphs, with a start and end date on each node and edge.
7
+
6
8
## How to run
7
9
8
10
1. Make sure you have Java installed
@@ -39,17 +41,19 @@ The full options of the command are:
39
41
Usage: direct [options]
40
42
Options:
41
43
-e, --endDateProperty
42
-
URI of the property in the knowledge grapg holding the end date of
44
+
URI of the property in the knowledge graph holding the end date of
43
45
entities
44
46
* -i, --input
45
47
Path to RDF input file, or directory containing RDF files, or URL
46
48
of a SPARQL endpoint.
47
49
* -o, --output
48
50
Path to GEXF output file
49
51
-s, --startDateProperty
50
-
URI of the property holding the start date of entities
52
+
URI of the property in the knowledge graph holding the start date
53
+
of entities
51
54
-w, --weight
52
55
Path to a properties file associating properties to weights
56
+
53
57
```
54
58
55
59
### SPARQL-based conversion (preferred)
@@ -75,31 +79,37 @@ The full options of the command are:
75
79
Path to the file containing the SPARQL query to retrieve
76
80
attributes, e.g. 'sparql/attribute.rq'. The query MUST return 3
77
81
columns: the first one is the subject, the second one is the
78
-
attribute URI, the third one is the attribute value.
82
+
attribute URI, the third one is the attribute value (a literal or
83
+
a URI).
79
84
-d, --dates
80
85
Path to the file containing the SPARQL query to retrieve date
81
86
ranges, e.g. 'sparql/dates.rq'
82
-
* -e, --edges
87
+
-e, --edges
83
88
Path to the file containing the SPARQL query to retrieve edges,
84
89
e.g. 'sparql/edges.rq'. The query MUST return the following
85
90
variables: ?subject, ?edge, ?object
86
91
* -i, --input
87
-
Path to RDF input file, or directory containing RDF files, or URL
88
-
of a SPARQL endpoint.
92
+
Path to RDF input file(s), or directory containing RDF files, or
93
+
URL of a SPARQL endpoint.
89
94
-l, --labels
90
95
Path to the file containing the SPARQL query to retrieve labels,
91
96
e.g. 'sparql/labels.rq'. The query MUST return the following
92
97
variables: ?subject, ?label
93
98
* -o, --output
94
99
Path to GEXF output file
100
+
-p, --parents
101
+
Deprecated. Path to the file containing the SPARQL query to
102
+
retrieve parents relationship, e.g. 'sparql/parents.rq'
95
103
```
96
104
97
-
*/!\ Attention :* the provided queries MUST follow the rules below:
105
+
All queries are optional, and default queries are used if not provided. See below.
98
106
99
-
#### edges query
107
+
*/!\ Attention :* the provided queries MUST follow the rules below:
108
+
109
+
#### --edges / -e query
100
110
101
111
This query defines the graph structure.
102
-
The edges query MUST return the 3 variables: `?subject`, `?edge`, `?object`.
112
+
The edges query MUST return the 3 variables: `?subject`, `?edge`, `?object`. _Optionally_, the query CAN also return the variables `?start` and `?end`, which will be interpreted as the dates of the edge in the GEXF graph.
103
113
104
114
An example of such query is:
105
115
@@ -115,9 +125,18 @@ WHERE {
115
125
}
116
126
```
117
127
118
-
This query is mandatory.
128
+
If not provided, the following query is used:
119
129
120
-
#### labels query
130
+
```sparql
131
+
# Default edges query
132
+
# Selects all the triples in the graph
133
+
SELECT ?subject ?edge ?object
134
+
WHERE {
135
+
?subject ?edge ?object .
136
+
}
137
+
```
138
+
139
+
#### --labels / -l query
121
140
122
141
This query returns the labels of each node in the graph. Labels typically come from `rdfs:label`, `skos:prefLabel`, or any other property.
123
142
The labels query MUST use the `?subject` variable to hold the node in the graph, and MUST return the 2 variables `?subject` and `?label`.
@@ -153,7 +172,7 @@ WHERE {
153
172
}
154
173
```
155
174
156
-
#### attributes query
175
+
#### --attributes / -a query
157
176
158
177
This query returns the attributes of each node in the graph. Typically the value of `rdf:type`, and other attributes.
159
178
The attributes query MUST use the `?subject` variable to hold the node in the graph, and MUST return 3 variables : `?subject`, `?attribute` as the attribute type, and `?value` as the attribute value (a URI or a literal).
@@ -176,33 +195,57 @@ This query is optional. If not provided, the following query is used:
176
195
# Default attributes query
177
196
# Selects the rdf:type value and any other property pointing to a skos:Concept
# Everything that is a skos:Concept is an attribute by default
189
207
UNION
190
208
{
191
-
?subject ?attribute ?value .
192
-
?value a skos:Concept .
209
+
?subject ?attribute ?concept .
210
+
?concept a skos:Concept .
193
211
}
212
+
}
213
+
```
214
+
215
+
216
+
#### --dates / -d query
217
+
218
+
This query returns the start and end dates that will be associated with each node in the graph. For edges, the start and end dates can be provided in the `--edges` query.
219
+
The dates query MUST use the `?subject` variable to hold the node in the graph, and MUST return the `?start` and/or `?end` variables. If only one of them is returned, the corresponding node will have no date associated for the other one.
This query is optional. If not provided, no start and end date will be associated to the nodes.
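
As a minimal sketch of what a dates query could look like, assuming (hypothetically) that the knowledge graph stores dates with `schema:startDate` and `schema:endDate`; replace these with the properties actually used in your data:

```sparql
# Illustrative dates query (not the tool's default: there is none)
# Assumption: node dates are held by schema:startDate / schema:endDate
PREFIX schema: <http://schema.org/>
SELECT ?subject ?start ?end
WHERE {
  # Every node that has a start date...
  ?subject schema:startDate ?start .
  # ...and its end date, when one exists
  OPTIONAL { ?subject schema:endDate ?end . }
}
```

With this shape, nodes without an end date still get a `?start` value and `?end` stays unbound.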
236
+
237
+
#### --parents / -p query
197
238
198
-
#### dates query
239
+
**This is discouraged** since Gephi does not support hierarchical graphs anymore, but it could still be useful for other tools, or with older versions of Gephi.
240
+
The parents query MUST use the `?subject` variable to hold the node in the graph, and MUST return a `?parent` variable that will hold the parent of that node in the graph.
241
+
This query is optional. If not provided, no default query is used and nodes will not have a parent in the graph.
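
For illustration only, a parents query might look like the sketch below. It assumes the hierarchy is expressed with `skos:broader`, which is an assumption about the data and not something the tool requires:

```sparql
# Illustrative parents query
# Assumption: the parent of a node is its skos:broader concept
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?subject ?parent
WHERE {
  ?subject skos:broader ?parent .
}
```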
199
242
200
-
TODO
201
243
202
244
## Support for dynamic graphs
203
245
204
-
rdf2gephi supports the creation of dynamic graphs where we can see the evolution of the graph over time.
205
-
TODO
246
+
rdf2gephi supports the creation of dynamic graphs where we can see the evolution of the graph over time. For this:
247
+
1. To associate dates with edges : in the `--edges` query, return the `?start` and `?end` variables
248
+
2. To associate dates with nodes : provide a `--dates` query
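
The first option can be sketched as an edges query that also returns `?start` and `?end`. The `ex:` vocabulary below (`ex:member`, `ex:organization`, `ex:memberOf`, `ex:startDate`, `ex:endDate`) is entirely hypothetical and stands for a model where a dated relationship is reified as an intermediate resource:

```sparql
# Illustrative edges query with dates (hypothetical ex: vocabulary)
PREFIX ex: <http://example.org/>
SELECT ?subject ?edge ?object ?start ?end
WHERE {
  # A membership resource links a person to an organization
  ?membership ex:member ?subject ;
              ex:organization ?object .
  # Expose the reified relationship as a single edge type
  BIND(ex:memberOf AS ?edge)
  # Dates are optional: an edge without dates is still returned
  OPTIONAL { ?membership ex:startDate ?start . }
  OPTIONAL { ?membership ex:endDate ?end . }
}
```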
206
249
207
250
## Typical actions in Gephi to view your RDF graph
208
251
@@ -211,7 +254,8 @@ TODO
211
254
3. Size the nodes based on (incoming or outgoing) degree : Appearance > Size icon > Ranking > Degree
212
255
4. Print labels only of biggest nodes : Filter > Topology > Degree Range > drag and drop to Queries below > set the parameters. Then click on filter. Then click on icon above "hide node/edges labels if not in filtered graph"
213
256
5. Click on "Show node labels" button
214
-
6. Go in "Preview" tab, regenerate the preview, export as SVG/PNG/PDF
257
+
6. You can also apply a clustering algorithm : go to Statistics > Community detection > Modularity, then color the nodes by the modularity attribute.
258
+
7. Go in "Preview" tab, regenerate the preview, export as SVG/PNG/PDF