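neo4j is a high-performance graph database, reportedly the "most popular graph database" in the Java world, although there are not many graph databases to choose from in the first place.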

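Setting up neo4j high availability (HA) is straightforward. This post uses neo4j-2.0.2. The manual on the official site has a very detailed example; see chapter 23, http://docs.neo4j.org/chunked/2.0.2/ha-setup-tutorial.html. It covers two scenarios: a real production deployment spread across several machines, and a development setup that starts several instances on one machine to simulate production HA. This post follows the second scenario, and the experiment is again run on 64-bit Ubuntu.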

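First download the neo4j Enterprise edition from the official site: http://dist.neo4j.org/neo4j-enterprise-2.0.2-unix.tar.gz. neo4j is written in Java, so a JRE is required. I skip the Java installation here; I am using Java 8, the latest release at the time of writing.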

XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$ java -version
java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0-b132)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b70, mixed mode)

After downloading, extract the archive. Installing the neo4j service requires root privileges:

XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$ sudo ./bin/neo4j-installer install
WARNING: this installer is deprecated and may not be the optimal way to install Neo4j on your system.
  Please see the Neo4j Manual for up to date information on installing Neo4j.
Press any key to continue
Graph-like power should be handled carefully. What user should run Neo4j? [neo4j] XXXXX
 Adding system startup for /etc/init.d/neo4j-service ...
   /etc/rc0.d/K20neo4j-service -> ../init.d/neo4j-service
   /etc/rc1.d/K20neo4j-service -> ../init.d/neo4j-service
   /etc/rc6.d/K20neo4j-service -> ../init.d/neo4j-service
   /etc/rc2.d/S20neo4j-service -> ../init.d/neo4j-service
   /etc/rc3.d/S20neo4j-service -> ../init.d/neo4j-service
   /etc/rc4.d/S20neo4j-service -> ../init.d/neo4j-service
   /etc/rc5.d/S20neo4j-service -> ../init.d/neo4j-service

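One prompt during installation is very important: it asks which user should run neo4j. The default is neo4j, but most systems have no such user, so be sure to change it to the account you normally use. After that, neo4j can be started: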

XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$ service neo4j-service status
Neo4j Server is not running
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$ service neo4j-service start
WARNING: Max 1024 open files allowed, minimum of 40 000 recommended. See the Neo4j manual.
Using additional JVM arguments:  -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=conf/neo4j-server.properties -Djava.util.logging.config.file=conf/logging.properties -Dlog4j.configuration=file:conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
Starting Neo4j Server...WARNING: not changing user
process [14512]... waiting for server to be ready................ OK.
http://localhost:7474/ is ready.
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$ service neo4j-service status
Neo4j Server is running at pid 14512
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$

Startup prints a short log, including the process ID.
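The startup log above warns that only 1024 open files are allowed while at least 40 000 are recommended. As a minimal sketch (not part of the original setup), the following Python snippet reads the open-file limit of the current process, which a neo4j instance started from the same shell would normally inherit. The function name `open_file_limit_ok` is hypothetical, and the threshold is simply the figure quoted in the warning.

```python
import resource

# Threshold taken from the startup warning ("minimum of 40 000 recommended").
RECOMMENDED_OPEN_FILES = 40000


def open_file_limit_ok(minimum=RECOMMENDED_OPEN_FILES):
    """Return (ok, soft_limit) for this process's open-file limit.

    Hypothetical helper for illustration; it only inspects the limit,
    it does not change it.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means the limit is effectively unbounded.
    unlimited = soft == resource.RLIM_INFINITY
    return (unlimited or soft >= minimum), soft


ok, soft = open_file_limit_ok()
print("soft open-file limit:", soft, "(ok)" if ok else "(below the recommended minimum)")
```

If the check reports a value below the recommendation, the Neo4j manual describes how to raise the limit for the user that runs the server.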

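HA needs several neo4j databases. Copy the extracted directory twice so that there are three identical directories to configure (the copies are named neo4j-enterprise-2.0.2-node2 and neo4j-enterprise-2.0.2-node3 in the transcripts that follow). HA mainly requires changes to two files, conf/neo4j.properties and conf/neo4j-server.properties. Node 1 needs the following settings: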

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 1

# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6363
online_backup_server = 127.0.0.1:6366

# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5001

# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003



conf/neo4j-server.properties:

# database location
org.neo4j.server.database.location=data/graph.db

# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474

# webserver IP bind
org.neo4j.server.webserver.address=0.0.0.0

# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7484

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA

The value 0.0.0.0 for org.neo4j.server.webserver.address tells neo4j to listen on every IP address of the machine. This is the usual setting for making neo4j's web administration interface reachable.

The configuration for nodes 2 and 3 is similar, so only node 2 is listed:

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 2

# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6364
online_backup_server = 127.0.0.1:6367

# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5002

# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003



conf/neo4j-server.properties:

# database location
org.neo4j.server.database.location=data/graph.db

# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7475

# webserver IP bind
org.neo4j.server.webserver.address=0.0.0.0

# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7485

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA
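The two listings differ only in the server id and in each port being one higher on node 2. The sketch below writes that pattern down in Python. It reproduces the values shown for nodes 1 and 2; the values it yields for node 3 are an extrapolation of the same pattern and an assumption, since the original does not list them. The function name `node_settings` is hypothetical.

```python
HOST = "127.0.0.1"
CLUSTER_SIZE = 3  # three instances on one machine, as in this post

# Node 1 ports taken from the listings above; later nodes add (server_id - 1).
BASE_PORTS = {
    "ha.server": 6363,
    "online_backup_server": 6366,
    "ha.cluster_server": 5001,
    "org.neo4j.server.webserver.port": 7474,
    "org.neo4j.server.webserver.https.port": 7484,
}


def node_settings(server_id):
    """Return (neo4j_properties, server_properties) dicts for one node.

    Hypothetical helper: the +1-per-node port pattern is inferred from the
    node 1 and node 2 listings. With more than three nodes the ha.server and
    online_backup_server ranges would collide, so the size is fixed at 3.
    """
    if not 1 <= server_id <= CLUSTER_SIZE:
        raise ValueError("server_id must be between 1 and %d" % CLUSTER_SIZE)
    offset = server_id - 1
    initial_hosts = ",".join(
        "%s:%d" % (HOST, BASE_PORTS["ha.cluster_server"] + i)
        for i in range(CLUSTER_SIZE)
    )
    neo4j_properties = {
        "ha.server_id": str(server_id),
        "ha.server": "%s:%d" % (HOST, BASE_PORTS["ha.server"] + offset),
        "online_backup_server": "%s:%d" % (HOST, BASE_PORTS["online_backup_server"] + offset),
        "ha.cluster_server": "%s:%d" % (HOST, BASE_PORTS["ha.cluster_server"] + offset),
        "ha.initial_hosts": initial_hosts,
    }
    server_properties = {
        "org.neo4j.server.webserver.port": str(BASE_PORTS["org.neo4j.server.webserver.port"] + offset),
        "org.neo4j.server.webserver.https.port": str(BASE_PORTS["org.neo4j.server.webserver.https.port"] + offset),
        "org.neo4j.server.database.mode": "HA",
    }
    return neo4j_properties, server_properties


# Print the settings for every node in properties-file syntax.
for node in range(1, CLUSTER_SIZE + 1):
    core, server = node_settings(node)
    print("# node %d" % node)
    for key, value in list(core.items()) + list(server.items()):
        print("%s=%s" % (key, value))
```

The printed lines still have to be merged by hand into each copy's conf/neo4j.properties and conf/neo4j-server.properties; the sketch only makes it easier to see that no two instances share a port.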

Once the configuration is done, the HA service with its three neo4j nodes can be started. The steps are:

XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$ ./bin/neo4j start
WARNING: Max 1024 open files allowed, minimum of 40 000 recommended. See the Neo4j manual.
Using additional JVM arguments:  -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=conf/neo4j-server.properties -Djava.util.logging.config.file=conf/logging.properties -Dlog4j.configuration=file:conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
Starting Neo4j Server...WARNING: not changing user
HA instance started in process [20002]. Will be operational once connected to peers. See /home/XXXXX/neo4j-enterprise-2.0.2/data/log/console.log for current status.
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2$ cd ../neo4j-enterprise-2.0.2-node2/
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2-node2$ ./bin/neo4j start
WARNING: Max 1024 open files allowed, minimum of 40 000 recommended. See the Neo4j manual.
Using additional JVM arguments:  -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=conf/neo4j-server.properties -Djava.util.logging.config.file=conf/logging.properties -Dlog4j.configuration=file:conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
Starting Neo4j Server...WARNING: not changing user
HA instance started in process [22238]. Will be operational once connected to peers. See /home/XXXXX/neo4j-enterprise-2.0.2-node2/data/log/console.log for current status.
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2-node2$ cd ../neo4j-enterprise-2.0.2-node3/
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2-node3$ ./bin/neo4j start
WARNING: Max 1024 open files allowed, minimum of 40 000 recommended. See the Neo4j manual.
Using additional JVM arguments:  -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=conf/neo4j-server.properties -Djava.util.logging.config.file=conf/logging.properties -Dlog4j.configuration=file:conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
Starting Neo4j Server...WARNING: not changing user
HA instance started in process [22605]. Will be operational once connected to peers. See /home/XXXXX/neo4j-enterprise-2.0.2-node3/data/log/console.log for current status.
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2-node3$ jps
20002 Bootstrapper
22238 Bootstrapper
22605 Bootstrapper
23805 Jps
XXXXX@XXXXX-asus:~/neo4j-enterprise-2.0.2-node3$

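Finally, jps shows three Bootstrapper processes; these are the neo4j node processes. Once they are up, the server can be managed through the web interface at http://localhost:7474/webadmin/. Opening http://localhost:7474/ directly redirects to http://localhost:7474/browser/, a page where you can do little beyond running simple queries.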

The webadmin page has five tabs:

[Figure: the webadmin page and its five tabs]

The last tab, "Server Info", has a "High Availability" entry that shows the HA runtime information. For node 1 it looks like this:

[Figure: High Availability information for node 1]
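The HA role of each instance can also be checked over HTTP instead of through the web page. The sketch below assumes the HA status endpoints /db/manage/server/ha/master and /db/manage/server/ha/slave described in the Neo4j manual are available in this version, and that each answers with status 200 when the instance holds that role; verify this against the 2.0.2 manual before relying on it. The helper names `http_status` and `ha_role` are hypothetical.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers are raised as HTTPError; the code is what we want.
        return err.code


def ha_role(base_url, fetch=http_status):
    """Classify an instance as 'master', 'slave' or 'unknown'.

    Assumption: the endpoint answers 200 only when the instance has the
    role named in the path. fetch is injectable so the logic can be
    exercised without a running server.
    """
    if fetch(base_url + "/db/manage/server/ha/master") == 200:
        return "master"
    if fetch(base_url + "/db/manage/server/ha/slave") == 200:
        return "slave"
    return "unknown"
```

With the cluster from this post running, calling `ha_role("http://localhost:7474")` and `ha_role("http://localhost:7475")` would be expected to report one master and the rest slaves. If an instance is not reachable at all, `urllib` raises `URLError`, which this sketch deliberately leaves unhandled.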

The third tab, "Console", accepts Cypher (neo4j's query language) and can be used for administration.

We can insert data on node 1 with Cypher:

neo4j-sh (?)$ CREATE (ee { name: "Emil", from: "Sweden" }) RETURN ee.name;
==> +---------+
==> | ee.name |
==> +---------+
==> | "Emil"  |
==> +---------+
==> 1 row
==> Nodes created: 1
==> Properties set: 2
==> 847 ms
neo4j-sh (?)$ START ee=node(*) WHERE ee.name! = "Emil" RETURN ee;
==> SyntaxException: This syntax is no longer supported (missing properties are now returned as null). (line 1, column 27)
==> "START ee=node(*) WHERE ee.name! = "Emil" RETURN ee"
==>                            ^
neo4j-sh (?)$ START a = node(0) RETURN a;
==> +------------------------------------+
==> | a                                  |
==> +------------------------------------+
==> | Node[0]{name:"Emil",from:"Sweden"} |
==> +------------------------------------+
==> 1 row
==> 38 ms

Then run the query on node 2 or node 3:

neo4j-sh (?)$ START a = node(0) RETURN a;
==> +------------------------------------+
==> | a                                  |
==> +------------------------------------+
==> | Node[0]{name:"Emil",from:"Sweden"} |
==> +------------------------------------+
==> 1 row
==> 339 ms
neo4j-sh (?)$

The data written on node 1 has been replicated to the node that was queried (node 3 here), which shows that HA is working correctly.
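The same write-on-one-node, read-on-another check can be scripted. The sketch below is one possible approach, not what the original used: it posts Cypher to the transactional HTTP endpoint /db/data/transaction/commit that neo4j 2.0 provides. The lookup uses `MATCH ... WHERE n.name = {name}`, the 2.0 form of the `ee.name! = "Emil"` query that the console rejected above. The helper names `cypher_request` and `run_cypher` are hypothetical.

```python
import json
import urllib.request

CREATE_EMIL = 'CREATE (ee { name: "Emil", from: "Sweden" }) RETURN ee.name'
FIND_BY_NAME = "MATCH (n) WHERE n.name = {name} RETURN n"


def cypher_request(base_url, statement, parameters=None):
    """Build a POST request that runs one Cypher statement and commits."""
    body = {"statements": [{"statement": statement, "parameters": parameters or {}}]}
    return urllib.request.Request(
        base_url + "/db/data/transaction/commit",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run_cypher(base_url, statement, parameters=None, timeout=10):
    """Send the statement to a running instance and return the decoded JSON."""
    request = cypher_request(base_url, statement, parameters)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the cluster running, `run_cypher("http://localhost:7474", CREATE_EMIL)` would write through node 1, and `run_cypher("http://localhost:7475", FIND_BY_NAME, {"name": "Emil"})` would read from node 2. The JSON response carries a `results` list and an `errors` list; an empty `errors` list together with a returned row on node 2 corresponds to the manual check shown above.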