karussell edited this page Oct 9, 2012 · 6 revisions

Spatial Index

All spatial indices use spatial keys to make them more memory efficient.

  • In-memory, simple quadtree - read this blog post to see how it can be made even faster!
  • A very lightweight spatial index. It can only be used in combination with a Graph! For example, an index for tens of millions of nodes in an already existing Graph implementation can take under 30MB.
  • A Map<Coordinate,Long> backed by an on-disk or in-memory SpatialHashtable. You can read this blog post to get more information about it.
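
As background, a spatial key interleaves the bits of latitude and longitude into a single long (a Z-order curve), so nearby coordinates get numerically similar keys and one key fits compactly into node storage. A minimal sketch of the idea, with illustrative names (not GraphHopper's actual SpatialKeyAlgo API):

```java
public class SpatialKeySketch {
    // Encode (lat, lon) into a 2*bitsPerCoord-bit key by interleaving bits.
    public static long encode(double lat, double lon, int bitsPerCoord) {
        long latBits = scale(lat, -90, 90, bitsPerCoord);
        long lonBits = scale(lon, -180, 180, bitsPerCoord);
        long key = 0;
        for (int i = bitsPerCoord - 1; i >= 0; i--) {
            key = (key << 1) | ((latBits >>> i) & 1);  // one lat bit ...
            key = (key << 1) | ((lonBits >>> i) & 1);  // ... then one lon bit
        }
        return key;
    }

    // Map a coordinate in [min, max] to an integer in [0, 2^bits - 1].
    private static long scale(double v, double min, double max, int bits) {
        double normalized = (v - min) / (max - min);
        long maxVal = (1L << bits) - 1;
        return (long) (normalized * maxVal);
    }
}
```

With, say, 16 bits per coordinate the whole key fits in 32 bits, which is why dozens of millions of nodes stay cheap to index.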

Shortest Path

There are several shortest path implementations available.
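
All of them build on the same basic pattern; as a reference point, here is a minimal, self-contained priority-queue Dijkstra over a plain adjacency structure (illustrative only, not one of GraphHopper's implementations):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

public class DijkstraSketch {
    // edges[node] holds {neighbor, weight} pairs; returns shortest
    // distances from source to every node (Integer.MAX_VALUE = unreachable).
    public static int[] shortestPaths(int[][][] edges, int source) {
        int n = edges.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        // queue entries are {distance, node}, ordered by distance
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
        pq.add(new int[]{0, source});
        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            if (top[0] > dist[top[1]]) continue;   // stale queue entry, skip
            for (int[] e : edges[top[1]]) {
                int newDist = top[0] + e[1];
                if (newDist < dist[e[0]]) {
                    dist[e[0]] = newDist;
                    pq.add(new int[]{newDist, e[0]});
                }
            }
        }
        return dist;
    }
}
```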

Integration

You can also use GraphHopper algorithms or data structures on top of other storage systems. As an example you can take a look at the neo4j integration.

To integrate your storage system you need to implement the Storage and the Graph interface. For the Graph it is important that looking up the edges of a node does not require a 'search' but uses direct access to the data. To import data with an external storage you can use this class.
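
The "direct access, no search" requirement can be sketched like this; the interface below is a hypothetical minimal contract for illustration, not GraphHopper's actual Storage or Graph interface:

```java
// Hypothetical minimal graph contract: the edges of a node must be
// reachable directly from the node id (e.g. an array slot), not via a search.
interface MiniGraph {
    int nodeCount();
    int[] neighbors(int nodeId);   // should be O(degree), never O(log n)
}

// Array-backed implementation: neighbors(v) is a direct index, no lookup.
class ArrayGraph implements MiniGraph {
    private final int[][] adjacency;

    ArrayGraph(int[][] adjacency) {
        this.adjacency = adjacency;
    }

    public int nodeCount() {
        return adjacency.length;
    }

    public int[] neighbors(int nodeId) {
        return adjacency[nodeId];   // plain array access
    }
}
```

A backend that had to run a query or tree search per neighbor access would dominate the runtime of any routing algorithm, which is why this property matters more than the storage medium itself.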

Self-made

There are several implementations of the Graph interface used in the core:

Update: just use GraphStorage and one of the DataAccess implementations (MMapDataAccess, RAMDataAccess)!

  1. the ByteBuffer variant (memory mapped or direct memory). Main disadvantage: it is hard to work with and at the moment not even thread safe for multiple reading threads!
  2. an in-memory graph representation using objects for edges - it is fast!
  3. a newer in-memory graph using primitive types, which is more memory efficient. It is easier to work with than the ByteBuffer implementation and just as memory efficient, nearly as fast as the object-based in-memory representation, and thread safe for multiple reading threads. It can even easily be converted into a graph that is thread safe for writing threads as well - see the readLock and writeLock comments.
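
To illustrate why primitive types save memory over one object per edge, here is a sketch of a flat-array (compressed sparse row) edge layout; this is illustrative only, not GraphHopper's actual storage format:

```java
// Edges stored in two flat int arrays instead of edge objects:
// no per-edge object header, better cache locality.
public class PrimitiveGraph {
    private final int[] edgeStart;   // edgeStart[v]..edgeStart[v+1] slices edgeTarget
    private final int[] edgeTarget;  // concatenated neighbor lists

    public PrimitiveGraph(int nodes, int[][] adjacency) {
        edgeStart = new int[nodes + 1];
        int total = 0;
        for (int v = 0; v < nodes; v++) {
            edgeStart[v] = total;
            total += adjacency[v].length;
        }
        edgeStart[nodes] = total;
        edgeTarget = new int[total];
        for (int v = 0; v < nodes; v++)
            System.arraycopy(adjacency[v], 0, edgeTarget, edgeStart[v], adjacency[v].length);
    }

    public int degree(int v) {
        return edgeStart[v + 1] - edgeStart[v];
    }

    public int neighbor(int v, int i) {
        return edgeTarget[edgeStart[v] + i];
    }
}
```

With tens of millions of edges, avoiding an object header (typically 12-16 bytes each) per edge adds up quickly.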

To select one or the other storage type, take a look at OSMReader.

To perform some real work (a bidirectional Dijkstra) you can use the run.sh script with DIJKSTRA=true.

Neo4j

When implementing this for neo4j you should use the internal IDs rather than a Lucene index where you look up nodes by an external ID. This will give you dramatic speed-ups. In fact, that is exactly the purpose of a graph storage (all built-in implementations work this way): to look up related items in O(1) instead of O(log n) or similar. Even a hashmap lookup, although O(1), should be avoided!
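
One common way to get there is to resolve external IDs to dense internal IDs once at import time, then use only the internal IDs (plain array indices) during traversal. A sketch, with hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Resolve external (e.g. OSM) ids to dense internal ids ONCE during import.
// After import, all graph work uses the internal ids as direct array indices,
// so no hashing or searching happens on the hot path.
public class IdMapping {
    private final Map<Long, Integer> externalToInternal = new HashMap<>();
    private int nextId = 0;

    // Used only during import: returns the existing internal id or assigns
    // the next free one.
    public int internalId(long externalId) {
        return externalToInternal.computeIfAbsent(externalId, k -> nextId++);
    }

    public int size() {
        return nextId;
    }
}
```

The map is only touched while reading the input data; once the graph is built it can even be thrown away if external IDs are no longer needed.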

Lucene

Although I have so far failed to implement a fast storage via Lucene, it should be possible, as documents can be accessed directly by their document ID. The problem with this is that you have to avoid updates: for every update Lucene inserts a NEW document with a new document ID and deletes the old document. But GraphHopper's OSM-import mechanism uses updates to add edges etc., and so

  1. there will be big gaps between the doc IDs, which would make Dijkstra etc. a bit inefficient. We rely on bitsets nearly everywhere to make even big graph traversals memory efficient.
  2. but more importantly: other referring nodes need an update too (which results in even more updates etc. :( ).
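
The first point can be seen directly with java.util.BitSet: a visited bitset must cover the whole ID range up to the largest ID, so gapped IDs inflate it. A small illustrative sketch:

```java
import java.util.BitSet;

public class BitSetGaps {
    // With dense ids 0..n-1 a visited bitset needs roughly n bits; with
    // gapped ids it must cover the full range up to the largest id.
    public static int bitsAllocated(int[] ids) {
        int max = 0;
        for (int id : ids) max = Math.max(max, id);
        BitSet visited = new BitSet(max + 1);
        for (int id : ids) visited.set(id);
        return visited.size();   // capacity in bits actually allocated
    }
}
```

Four dense IDs fit in one 64-bit word, while the same four IDs spread over a gapped range of a million force the bitset to allocate the whole range.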

The good news is that even external IDs can be made very fast. Have a look at my lumeo project and at this issue.
