Unlike an RDBMS (Relational Database Management System), where data objects are the main part, a graph database puts the relationships between those objects in the main role and stores them as dedicated objects. This gives better performance, especially when you have a lot of small pieces of data tied to each other.
Neo4j, one of the first graph database systems, is the one examined in this post.
For queries, Neo4j uses the Cypher query language with the cypher-shell tool, and it has a built-in web UI for accessing a database from a regular browser. Neo4j also supports a REST API.
Neo4j is a paid product, but it has a free Community Edition with some limitations (no clustering, no online backups, only one user database, no scaling, etc.), plus the Aura SaaS offering. See their comparison here>>>.
So, in this post, we will spin up a Neo4j Community Edition instance with Docker, take a brief look at its query language, and see how a backup and restore can be performed.
Running Neo4j with Docker
Let’s run a container with Docker on a laptop to see how it works. See the documentation here>>>.
[simterm]
$ docker run --rm --name neo4j -p 7474:7474 -p 7687:7687 neo4j:latest
...
Directories in use:
  home:         /var/lib/neo4j
  config:       /var/lib/neo4j/conf
  logs:         /logs
  plugins:      /var/lib/neo4j/plugins
  import:       /var/lib/neo4j/import
  data:         /var/lib/neo4j/data
  certificates: /var/lib/neo4j/certificates
  run:          /var/lib/neo4j/run
Starting Neo4j.
...
2020-07-27 10:11:30.394+0000 INFO Bolt enabled on 0.0.0.0:7687.
2020-07-27 10:11:31.640+0000 INFO Remote interface available at http://localhost:7474/
2020-07-27 10:11:31.640+0000 INFO Started
[/simterm]
Check it: open a browser, navigate to http://localhost:7474, and log in with the default login and password neo4j:neo4j.
Admin password
To set a new password, use --env NEO4J_AUTH:
[simterm]
$ docker run --rm --name neo4j --env NEO4J_AUTH=neo4j/pass -p 7474:7474 -p 7687:7687 neo4j:latest
Changed password for user 'neo4j'.
...
[/simterm]
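Typing the password directly into the command leaves it in the shell history. Below is a minimal sketch of building the NEO4J_AUTH value from a variable instead. The helper name neo4j_auth and the NEO4J_PASSWORD variable are made up for this example, and the check for the default password is based on my understanding that the Docker image rejects neo4j as a new password.

```shell
# Build the value for NEO4J_AUTH from a password kept outside the command line.
# neo4j_auth and NEO4J_PASSWORD are illustrative names, not part of Neo4j.
neo4j_auth() {
    if [ -z "$1" ]; then
        echo "password is empty" >&2
        return 1
    fi
    # Assumption: the image refuses 'neo4j' here, as it is the default password.
    if [ "$1" = "neo4j" ]; then
        echo "password must differ from the default 'neo4j'" >&2
        return 1
    fi
    printf 'neo4j/%s\n' "$1"
}

# Usage, with NEO4J_PASSWORD exported beforehand:
#   docker run --rm --name neo4j --env NEO4J_AUTH="$(neo4j_auth "$NEO4J_PASSWORD")" \
#       -p 7474:7474 -p 7687:7687 neo4j:latest
```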
cypher-shell
To work with the databases you can use the REST API or a local tool, cypher-shell.
Connect to the container and run the shell:
[simterm]
$ docker exec -ti neo4j cypher-shell -u neo4j -p pass
Connected to Neo4j 4.1.0 at neo4j://localhost:7687 as user neo4j.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
neo4j@neo4j>
[/simterm]
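For scripts, an interactive session is not always convenient. cypher-shell also reads statements from standard input, so a query can be piped into it. Here is a small sketch of that. The run_cypher helper and the DOCKER variable are assumptions of this example, while the container name and credentials are the ones used above.

```shell
# Run one Cypher statement in the container without an interactive session.
# run_cypher and DOCKER are illustrative names; DOCKER can point to another
# compatible CLI such as podman.
DOCKER="${DOCKER:-docker}"

run_cypher() {
    # -i keeps stdin open so the statement can be piped in; -t is dropped
    # because there is no terminal on the other end of the pipe.
    # --format plain prints results without the table borders.
    printf '%s\n' "$1" | $DOCKER exec -i neo4j cypher-shell -u neo4j -p pass --format plain
}

# Usage:
#   run_cypher 'MATCH (n) RETURN count(n);'
```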
Neo4j configuration file
In the container, the main configuration file is located at $NEO4J_HOME/conf/neo4j.conf, e.g. /var/lib/neo4j/conf/neo4j.conf:
[simterm]
root@65d8061ac13e:/var/lib/neo4j# head /var/lib/neo4j/conf/neo4j.conf
#*****************************************************************
# Neo4j configuration
#
# For more details and a complete list of settings, please see
# https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
#*****************************************************************
# The name of the default database
#dbms.default_database=neo4j
[/simterm]
To redefine any setting, mount a new config file to the /conf directory of the container.
All settings for neo4j.conf can be found here>>>.
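As a rough illustration of mounting a config, the sketch below writes a small neo4j.conf into a host directory. The host path /tmp/neo4/conf and the CONF_DIR variable are assumptions of this example. The two settings are the ones shown by dbms.listConfig() in this post, and anything not listed should fall back to Neo4j defaults.

```shell
# Write a minimal neo4j.conf into a host directory that can be mounted at /conf.
# CONF_DIR and the /tmp/neo4/conf path are arbitrary choices for this sketch.
conf_dir="${CONF_DIR:-/tmp/neo4/conf}"
mkdir -p "$conf_dir"

cat > "$conf_dir/neo4j.conf" <<'EOF'
# The name of the default database
dbms.default_database=neo4j
# Listen on all interfaces inside the container
dbms.default_listen_address=0.0.0.0
EOF

# Usage: add the volume to the usual command:
#   docker run --rm --name neo4j --env NEO4J_AUTH=neo4j/pass \
#       -p 7474:7474 -p 7687:7687 -v /tmp/neo4/conf:/conf neo4j:latest
```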
To get the current config from the shell, use the dbms.listConfig() call:
[simterm]
neo4j@neo4j> CALL dbms.listConfig() YIELD name, value WHERE name STARTS WITH 'dbms.default' RETURN name, value ORDER BY name LIMIT 3;
+-------------------------------------------------+
| name                              | value       |
+-------------------------------------------------+
| "dbms.default_advertised_address" | "localhost" |
| "dbms.default_database"           | "neo4j"     |
| "dbms.default_listen_address"     | "0.0.0.0"   |
+-------------------------------------------------+
3 rows available after 216 ms, consumed after another 13 ms
[/simterm]
cypher-shell && CQL
CREATE
Let’s play with data.
There is a great tutorial on the data types at Tutorialspoint here>>>.
Create a new node:
[simterm]
neo4j@neo4j> create (test);
0 rows available after 56 ms, consumed after another 0 ms
Added 1 nodes
[/simterm]
DELETE
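Delete it. Note that test is only a variable name inside the query, not a name stored with the node, so MATCH (test) matches every node in the database. With a single node present the result is the same: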
[simterm]
neo4j@neo4j> MATCH (test) DETACH DELETE test;
0 rows available after 32 ms, consumed after another 0 ms
Deleted 1 nodes
[/simterm]
To delete all records from a database, match every node with (n). The variable name itself does not matter:
[simterm]
neo4j@neo4j> MATCH (n) detach delete n;
[/simterm]
Labels
Create a node with the label1 label and with properties holding two keys, key1 and key2:
[simterm]
neo4j@neo4j> create (node1:label1 {key1: "value1", key2: "value2"} );
0 rows available after 47 ms, consumed after another 0 ms
Added 1 nodes, Set 2 properties, Added 1 labels
[/simterm]
Check it:
[simterm]
neo4j@neo4j> MATCH (node1) RETURN node1;
+--------------------------------------------+
| node1                                      |
+--------------------------------------------+
| (:label1 {key1: "value1", key2: "value2"}) |
+--------------------------------------------+
[/simterm]
Or use RETURN to get the node right after creation, in the same query:
[simterm]
neo4j@neo4j> CREATE (node2:label2 {key1: "value1", key2: "value2"} ) RETURN node2;
+--------------------------------------------+
| node2                                      |
+--------------------------------------------+
| (:label2 {key1: "value1", key2: "value2"}) |
+--------------------------------------------+
[/simterm]
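A pattern like MATCH (node1) matches any node, whatever the variable is called, so it is the label or a property that actually narrows the result. The sketch below saves such a filtered query to a file that can be piped into cypher-shell. The file location is an arbitrary choice for this example.

```shell
# Save a query that filters by label and property rather than by variable name.
# The file path is an arbitrary choice for this sketch.
query_file="${TMPDIR:-/tmp}/match-label1.cypher"

cat > "$query_file" <<'EOF'
MATCH (n:label1 {key1: "value1"})
RETURN n;
EOF

# Usage, with the container from this post running:
#   docker exec -i neo4j cypher-shell -u neo4j -p pass < "$query_file"
```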
Check from the browser using match(n) return n to display all the records:
Relations
A new relationship can be created between new nodes or between already existing ones.
To create a relationship between new nodes, add -[r:RelationName]-> between them:
[simterm]
neo4j@neo4j> create (node3:label3 {key1: "value1", key2: "value2"}) -[r:RelationName]-> (node4:label4{key1: "value1", key2: "value2"}) RETURN node3, node4;
+-----------------------------------------------------------------------------------------+
| node3                                      | node4                                      |
+-----------------------------------------------------------------------------------------+
| (:label3 {key1: "value1", key2: "value2"}) | (:label4 {key1: "value1", key2: "value2"}) |
+-----------------------------------------------------------------------------------------+
1 row available after 88 ms, consumed after another 8 ms
Added 2 nodes, Created 1 relationships, Set 4 properties, Added 2 labels
[/simterm]
Check it:
To create a relationship between already existing nodes, use MATCH to select those nodes:
[simterm]
neo4j@neo4j> MATCH (node3:label3), (node4:label4) CREATE (node3) -[r:RelationName2]-> (node4) RETURN node3, node4;
+-----------------------------------------------------------------------------------------+
| node3                                      | node4                                      |
+-----------------------------------------------------------------------------------------+
| (:label3 {key1: "value1", key2: "value2"}) | (:label4 {key1: "value1", key2: "value2"}) |
+-----------------------------------------------------------------------------------------+
1 row available after 124 ms, consumed after another 9 ms
Created 1 relationships
[/simterm]
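The RETURN in these queries shows only the nodes, not the relationships themselves. To see which relationships exist, they can be matched directly. This is a small sketch that saves such a query to a file. The file path and the column aliases are chosen for this example only.

```shell
# Save a query that lists every relationship with the labels on both ends.
# The file path and the aliases are arbitrary choices for this sketch.
query_file="${TMPDIR:-/tmp}/list-relations.cypher"

cat > "$query_file" <<'EOF'
MATCH (a)-[r]->(b)
RETURN labels(a) AS source, type(r) AS relation, labels(b) AS target;
EOF

# Usage, with the container from this post running:
#   docker exec -i neo4j cypher-shell -u neo4j -p pass < "$query_file"
```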
Backup && Restore
Data is stored in $NEO4J_HOME/data, which is actually a symlink to /data, see here>>>.
Check directories:
[simterm]
root@65d8061ac13e:/var/lib/neo4j# ls -l /var/lib/neo4j/data
lrwxrwxrwx 1 root root 5 Jul 23 09:01 /var/lib/neo4j/data -> /data
root@65d8061ac13e:/var/lib/neo4j# ls -l /data/
total 12
drwxrwxrwx 4 neo4j neo4j 4096 Jul 27 11:19 databases
drwxr-xr-x 2 neo4j neo4j 4096 Jul 27 11:19 dbms
drwxrwxrwx 4 neo4j neo4j 4096 Jul 27 11:19 transactions
[/simterm]
Database files are stored in the databases directory, where you can find two default databases, system and neo4j. They are also listed by show databases:
[simterm]
neo4j@neo4j> show databases;
+------------------------------------------------------------------------------------------------+
| name     | address          | role         | requestedStatus | currentStatus | error | default |
+------------------------------------------------------------------------------------------------+
| "neo4j"  | "localhost:7687" | "standalone" | "online"        | "online"      | ""    | TRUE    |
| "system" | "localhost:7687" | "standalone" | "online"        | "online"      | ""    | FALSE   |
+------------------------------------------------------------------------------------------------+
[/simterm]
The system database is used for the… well, for the system itself, while neo4j is the default user database.
Neo4j dump
Create new directories which will hold our data:
[simterm]
$ mkdir -p /tmp/neo4/{data,logs}
[/simterm]
Restart the Neo4j container with those directories mounted into it:
[simterm]
$ docker run --rm --name neo4j --env NEO4J_AUTH=neo4j/pass -p 7474:7474 -p 7687:7687 -v /tmp/neo4/data/:/data -v /tmp/neo4/logs/:/logs neo4j:latest
Changed password for user 'neo4j'.
Directories in use:
  home:         /var/lib/neo4j
  config:       /var/lib/neo4j/conf
  logs:         /logs
  plugins:      /var/lib/neo4j/plugins
  import:       /var/lib/neo4j/import
  data:         /var/lib/neo4j/data
  certificates: /var/lib/neo4j/certificates
  run:          /var/lib/neo4j/run
...
[/simterm]
Check the data on the host:
[simterm]
$ ll /tmp/neo4/data/databases/
total 0
drwxr-xr-x 2 7474 7474 720 Jul 27 16:07 neo4j
-rw-r--r-- 1 7474 7474   0 Jul 27 16:07 store_lock
drwxr-xr-x 3 7474 7474 740 Jul 27 16:07 system
[/simterm]
Connect and create a new record:
[simterm]
$ docker exec -ti neo4j cypher-shell -u neo4j -p pass
neo4j@neo4j> create (test:tobackup);
0 rows available after 131 ms, consumed after another 0 ms
Added 1 nodes
[/simterm]
To create a database dump you first need to stop the instance, as the Community Edition cannot do online backups. Otherwise the dump fails:
[simterm]
root@771f04312148:/var/lib/neo4j# neo4j-admin dump --database=neo4j --to=/data/backups/
The database is in use. Stop database 'neo4j' and try again.
[/simterm]
So, exit from the container and stop it:
[simterm]
$ docker stop neo4j
neo4j
[/simterm]
Start it again, but this time add the bash command to prevent the Neo4j service from starting:
[simterm]
$ docker run -ti --rm --name neo4j --env NEO4J_AUTH=neo4j/pass -p 7474:7474 -p 7687:7687 -v /tmp/neo4/data/:/data -v /tmp/neo4/logs/:/logs neo4j:latest bash
neo4j@6d4e9854bc1d:~$
[/simterm]
Create a dump:
[simterm]
neo4j@015ba14bdba2:~$ mkdir /data/backup
neo4j@015ba14bdba2:~$ neo4j-admin dump --database=neo4j --to=/data/backup/
Done: 34 files, 250.8MiB processed.
[/simterm]
Check it:
[simterm]
neo4j@015ba14bdba2:~$ ls -l /data/backup/
total 12
-rw-r--r-- 1 neo4j neo4j 9971 Jul 27 13:46 neo4j.dump
[/simterm]
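The manual steps above (stop, start with bash, dump) could be folded into one helper. This is only a sketch: the dump_neo4j function and the DOCKER and DATA_DIR variables are names made up for this example, and it relies on the image running whatever command follows the image name, the same way it runs bash above.

```shell
# Offline dump of the default "neo4j" database in a single step.
# dump_neo4j, DOCKER and DATA_DIR are illustrative names for this sketch;
# the container name, image and paths are the ones used in this post.
DOCKER="${DOCKER:-docker}"
DATA_DIR="${DATA_DIR:-/tmp/neo4/data}"

dump_neo4j() {
    # The Community Edition cannot dump a running database, so stop the
    # service first; a failure here just means it was not running.
    $DOCKER stop neo4j >/dev/null 2>&1 || true

    # The backup directory is created inside the container, as in the
    # manual steps, so that it belongs to the neo4j user.
    $DOCKER run --rm -v "$DATA_DIR":/data neo4j:latest \
        bash -c 'mkdir -p /data/backup && neo4j-admin dump --database=neo4j --to=/data/backup/'
}

# Usage:
#   dump_neo4j && ls -l /tmp/neo4/data/backup/
```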
Restore
On the host, create a new set of directories for the second Neo4j instance:
[simterm]
$ mkdir -p /tmp/neo4-2/{data,logs}
[/simterm]
Copy the backup directory from the first one:
[simterm]
$ sudo cp -r /tmp/neo4/data/backup/ /tmp/neo4-2/data/
[/simterm]
Run the service as usual, but mount /tmp/neo4-2 and change the ports and the container name:
[simterm]
$ docker run --rm --name neo4j-2 --env NEO4J_AUTH=neo4j/pass -p 7475:7474 -p 7688:7687 -v /tmp/neo4-2/data/:/data -v /tmp/neo4-2/logs/:/logs neo4j:latest
[/simterm]
Connect and check the data:
[simterm]
$ docker exec -ti neo4j-2 cypher-shell -u neo4j -p pass
Connected to Neo4j 4.1.0 at neo4j://localhost:7687 as user neo4j.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
neo4j@neo4j> match (n) return n;
+---+
| n |
+---+
+---+
[/simterm]
Okay, nothing is found here, as this is a brand-new database.
Exit from the container, stop it, and run it again with bash:
[simterm]
$ docker run -ti --rm --name neo4j-2 --env NEO4J_AUTH=neo4j/pass -p 7475:7474 -p 7688:7687 -v /tmp/neo4-2/data/:/data -v /tmp/neo4-2/logs/:/logs neo4j:latest bash
neo4j@b0f324cb7c9b:~$
[/simterm]
Load the dump into the database with the --force option, as the default neo4j database is already present:
[simterm]
neo4j@7bca892e9538:~$ neo4j-admin load --from=/data/backup/neo4j.dump --database=neo4j --force
Done: 34 files, 250.8MiB processed.
[/simterm]
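The restore can be wrapped the same way as a one-off container. Again, this is a sketch under assumptions: load_neo4j, DOCKER and DATA_DIR are names invented for this example, and the helper expects the dump to be already copied to the second instance's data directory. Keep in mind that --force replaces whatever is in the target database.

```shell
# Load a dump into the second instance without opening a shell in it.
# load_neo4j, DOCKER and DATA_DIR are illustrative names for this sketch;
# the container name, image and paths are the ones used in this post.
DOCKER="${DOCKER:-docker}"
DATA_DIR="${DATA_DIR:-/tmp/neo4-2/data}"

load_neo4j() {
    dump_file="$DATA_DIR/backup/neo4j.dump"
    # Fail early, before touching the running instance, if the dump is missing.
    if [ ! -f "$dump_file" ]; then
        echo "dump not found: $dump_file" >&2
        return 1
    fi

    # The target database must not be running during the load.
    $DOCKER stop neo4j-2 >/dev/null 2>&1 || true

    # --force overwrites the existing default database.
    $DOCKER run --rm -v "$DATA_DIR":/data neo4j:latest \
        neo4j-admin load --from=/data/backup/neo4j.dump --database=neo4j --force
}

# Usage:
#   load_neo4j
```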
Exit and restart the container in the normal way to start the Neo4j process:
[simterm]
$ docker run -ti --rm --name neo4j-2 --env NEO4J_AUTH=neo4j/pass -p 7475:7474 -p 7688:7687 -v /tmp/neo4-2/data/:/data -v /tmp/neo4-2/logs/:/logs neo4j:latest
[/simterm]
Connect and check:
[simterm]
neo4j@neo4j> match (n) return n;
+-------------+
| n           |
+-------------+
| (:tobackup) |
+-------------+
[/simterm]
Our record is in its place, all done.