So, what if you have to migrate from Neo4j to OrientDB?
According to OrientDB’s website, migrating your data from Neo4j is pretty straightforward and involves only three simple steps:
- Installing Neo4j shell tools
- Exporting your data from Neo4j in GraphML format:
export-graphml -t -o /tmp/out.graphml
- Importing your data in OrientDB from GraphML format:
create database plocal:/tmp/db/test
and then
import database /tmp/out.graphml
In practice there are several limitations that may hinder this supposedly simple process. My recent attempt to transfer some of my data from Neo4j to OrientDB demonstrated at least four:
- OrientDB won’t import nodes that have more than one label associated with them in Neo4j. The solution is to transform all your nodes into single-label nodes before exporting from Neo4j.
- OrientDB won’t import nodes with properties called label. The solution is to rename such properties, if any.
- OrientDB’s console will load the whole GraphML file into Java heap memory before importing it in the database. It will need a maximum heap size at least as large as your GraphML file. The solution is to set the maximum heap size for the console in $orientdb/bin/console.sh. In my case this meant adding JAVA_OPTS="-Xmx8192m" to line 43 of the script.
- OrientDB’s console is not good with parallel processing. Although database operations are mostly IO-bound, this turns out to be a limiting factor when importing graph data. The solution is to connect to OrientDB remotely rather than natively when using the console. To put it in concrete terms, instead of the suggested create database plocal:/tmp/db/test you may want to run the following command:
create database remote:localhost/test USERNAME PASSWORD plocal
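The second and third limitations above lend themselves to a small pre-flight script run on the exported file. What follows is only a sketch under assumptions: it assumes the export declares node properties as GraphML key elements and stores values in data elements inside each node, the replacement name label_prop and the headroom factor are arbitrary choices of mine, and the function names are hypothetical helpers rather than anything shipped with Neo4j or OrientDB.

```python
import math
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

# Tiny stand-in for an exported file; a real export will look richer.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="label" for="node" attr.name="label" attr.type="string"/>
  <key id="name" for="node" attr.name="name" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="label">draft</data><data key="name">a</data></node>
    <node id="n1"><data key="name">b</data></node>
    <edge id="e0" source="n0" target="n1"><data key="label">KNOWS</data></edge>
  </graph>
</graphml>"""


def local_name(tag):
    """Strip an XML namespace such as '{ns}key' down to 'key'."""
    return tag.rsplit("}", 1)[-1]


def rename_label_property(root, old="label", new="label_prop"):
    """Rename the node property `old` to `new` in place.

    Only node-level key declarations and <data> children of <node>
    elements are touched; edge data is left alone.
    Returns the number of <data> elements rewritten.
    """
    renamed = 0
    for el in root.iter():
        name = local_name(el.tag)
        if name == "key" and el.get("for") == "node" and el.get("id") == old:
            el.set("id", new)
            if el.get("attr.name") == old:
                el.set("attr.name", new)
        elif name == "node":
            for data in el:
                if local_name(data.tag) == "data" and data.get("key") == old:
                    data.set("key", new)
                    renamed += 1
    return renamed


def suggested_xmx_mb(file_size_bytes, headroom=1.5):
    """Heap size in MB: the file size times an assumed headroom factor."""
    return math.ceil(file_size_bytes * headroom / 2**20)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print("renamed properties:", rename_label_property(root))
    # Keep the default GraphML namespace unprefixed when serialising.
    ET.register_namespace("", GRAPHML_NS)
    print(ET.tostring(root, encoding="unicode"))
    print("JAVA_OPTS=-Xmx%dm" % suggested_xmx_mb(len(SAMPLE.encode("utf-8"))))
```

For a real export you would parse /tmp/out.graphml with ET.parse, write the tree back out, and pass os.path.getsize of the file to suggested_xmx_mb. Note that ElementTree also reads the whole document into memory, so a multi-gigabyte export would call for a streaming approach instead.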
Once you are done, you can sit back and wait for hours while OrientDB imports the data. The average speed seems to be around 1000 records per second for the edges (on an 8-core machine with an SSD). The vertices are imported at a much higher rate.
Posted on 11 Apr 2015 by Mahmood S. Zargar