
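Python 3 in Practice: Installing the Non-Relational Graph Database Neo4j and Connecting to It from Python 3

Introduction to the non-relational graph database Neo4j

Neo4j is one of the most popular graph databases today. It was released in 2010 and has kept up good momentum since.

As a graph database, Neo4j's defining feature is that it stores relationship data.

Besides storing data row by row like an ordinary database, a graph database also makes the relationships between the stored data easy to see.

It is well suited to graph data that is modified rarely, queried often, and free of super nodes (nodes with a very large number of edges).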

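Use cases for the graph database Neo4j

Social networks

Recommend new friends to a user based on that user's relationships with other users. For example, QQ suggests friends of your friends.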
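As a rough illustration of the friend-of-a-friend idea, the sketch below works on a plain in-memory dictionary rather than on Neo4j. The names `friends` and `recommend_friends` and the sample data are made up for this example; a graph database would answer the same question with a graph traversal.

```python
from collections import Counter


def recommend_friends(friends, user):
    """Rank people who are friends of `user`'s friends but not yet friends of `user`.

    `friends` maps each person to the set of people they are connected to.
    Candidates are scored by how many mutual friends they share with `user`.
    """
    direct = friends.get(user, set())
    scores = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            if candidate != user and candidate not in direct:
                scores[candidate] += 1
    # Most mutual friends first; ties are broken by name for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical sample data: an undirected friendship graph.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
    "erin": {"bob"},
}

print(recommend_friends(friends, "alice"))  # ['dave', 'erin']
```

Here dave ranks first because he shares two mutual friends with alice (bob and carol), while erin shares only one.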

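Smart recommendation engines

Infer a user's preferences, and recommend products accordingly, by analyzing relationship data such as who the user's friends are, which products those friends like, and what the user has browsed.

Knowledge graphs

Build a graph from the relationships between pieces of knowledge to help users find related knowledge. For example, searching Baidu for Neo4j also brings up similar items such as MySQL.

Malware detection

Record relationship data about a program's behavior, such as which IP addresses and which system resources it accessed, and analyze that data to decide whether the behavior is malicious.

Network and data center management

Networks and data centers are themselves webs of complex relationships. Neo4j makes it easy to model the relationships between devices, which simplifies managing the whole system.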

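Advantages of Neo4j

  • Inserting and querying data is intuitive; you no longer have to think about the relationships between tables.
  • The built-in graph search and graph traversal methods are convenient and fairly fast.

Disadvantages of Neo4j

  • The hardest thing to tolerate is the extremely slow insert speed, possibly because creating nodes and edges also stores extra information to serve queries. It may have been a problem with my code, but inserting 10,000 nodes and 10,000 edges took nearly 10 minutes…
  • Super nodes. When one node has a very large number of edges (common for celebrity accounts with huge followings), operations involving that node slow down sharply. This problem has existed for a long time and the vendor has said it would be addressed, but the result is still unsatisfactory.
  • The usual way to speed up a database is to give it more memory. From the official operations manual, however, there seems to be no way to set the database's total memory use directly; instead you have to calculate how much memory to "reserve" for it…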
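The slow inserts described above may partly come from sending one request per node or edge. A common mitigation, which the measurement above did not test, is to send many rows per statement with Cypher's UNWIND. The sketch below is only an assumption-laden outline: the helper names, the `Person` label, the `name` property and the batch size of 1000 are all placeholders, and `run` stands for any callable with the shape of py2neo's `Graph.run`.

```python
def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# One parameterised statement creates a whole batch of nodes.
# The label `Person` and the property `name` are placeholders.
CREATE_NODES = "UNWIND $rows AS row CREATE (n:Person {name: row.name})"


def insert_in_batches(run, rows, size=1000):
    """Send `rows` to the database in batches and return the number of batches.

    `run` is any callable taking (cypher, rows=...), for example the `run`
    method of a py2neo Graph object.
    """
    batches = 0
    for batch in chunked(rows, size):
        run(CREATE_NODES, rows=batch)
        batches += 1
    return batches


if __name__ == "__main__":
    # Dry run with a stand-in for graph.run, so no server is needed.
    sample = [{"name": "user-%d" % i} for i in range(10000)]
    sent = []
    print(insert_in_batches(lambda cypher, rows: sent.append(len(rows)), sample))
```

With a real connection, `insert_in_batches(graph.run, sample)` would issue ten statements instead of ten thousand; whether that removes the bottleneck on a given machine has to be measured.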

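Installing and starting Neo4j on CentOS

Download Neo4j

Download page: https://neo4j.com/download-center/#community

Package URL: https://neo4j.com/artifact.php?name=neo4j-community-3.5.6-unix.tar.gz

Download version 3.5.6. Quote the URL and name the output file with -o; with -O alone, curl derives the file name from the URL, so it would not match the archive name used in the next step. The -L flag follows redirects.

curl -L -o neo4j-community-3.5.6-unix.tar.gz 'https://neo4j.com/artifact.php?name=neo4j-community-3.5.6-unix.tar.gz'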

Install Neo4j

tar -zxvf neo4j-community-3.5.6-unix.tar.gz

Move the folder

mv neo4j-community-3.5.6/ /usr/local/neo4j

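Edit the Neo4j configuration file

Configuration file path: conf/neo4j.conf under the installation directory (here, /usr/local/neo4j/conf/neo4j.conf)

1. Line 22, the LOAD CSV import path: comment it out by putting # in front so that files can be read from any path. The configuration file's own comments warn that this introduces possible security problems.

#dbms.directories.import=import

2. Lines 35 and 36: uncomment them to set the JVM initial heap size and maximum heap size.

(In principle a larger maximum heap is better, but it must stay below the machine's physical memory, with room left for the page cache and the operating system.)

dbms.memory.heap.initial_size=512m

dbms.memory.heap.max_size=1g

If you do not know how much memory is free, check with the Linux command free -m.

3. Line 46: this is effectively a cache (the page cache). On a well-provisioned machine, the larger the better.

dbms.memory.pagecache.size=5g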
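The heap and page cache settings from steps 2 and 3 have to fit into physical memory together with the operating system. The sketch below turns the advice from the neo4j.conf comments (leave a few gigabytes for the operating system, size the heap, give the rest to the page cache) into arithmetic. The function name, the 2 GB default reserve and the 8 GB machine are assumptions for illustration, not values from the Neo4j documentation.

```python
def page_cache_gb(total_ram_gb, heap_max_gb, os_reserve_gb=2.0):
    """Memory left for the page cache after the OS reserve and the JVM heap.

    Follows the rule of thumb in the neo4j.conf comments: reserve memory for
    the operating system, size the heap, and give the rest to the page cache.
    """
    remaining = total_ram_gb - os_reserve_gb - heap_max_gb
    if remaining <= 0:
        raise ValueError("heap plus OS reserve already use all physical memory")
    return remaining


# Hypothetical 8 GB machine with the 1g maximum heap used in this article.
print(page_cache_gb(8, 1))  # 5.0
```

Under these assumptions the result matches the 5g page cache configured above; on a machine with less memory the same settings would overcommit it.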

4. Line 54: remove the # from this line so that the Neo4j database can be reached remotely by IP address.

dbms.connectors.default_listen_address=0.0.0.0

5. The default Bolt port is 7687, the HTTP port is 7474, and the HTTPS port is 7473. The three settings below can be left unchanged.

dbms.connector.bolt.listen_address=:7687

dbms.connector.http.listen_address=:7474

dbms.connector.https.listen_address=:7473

Remove the leading # from these lines.

6. Line 245: remove the # to let LOAD CSV read file URLs, that is, files on the server's file system.

dbms.security.allow_csv_import_from_file_urls=true

7. Line 265: uncomment it to make Neo4j both readable and writable.

dbms.read_only=false
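Most of the steps above amount to uncommenting a key or changing its value. If you prefer to script that instead of editing by line number (line numbers differ between versions), a minimal helper could look like the sketch below. `set_conf_value` is a hypothetical name, and the function only handles keys that appear once in the file, so it is not suitable for repeated keys such as dbms.jvm.additional.

```python
import re


def set_conf_value(conf_text, key, value):
    """Return `conf_text` with `key` set to `value`.

    If a line for `key` exists (active or commented out with #), the first
    such line is replaced; otherwise a new line is appended at the end.
    """
    pattern = re.compile(r"^\s*#?\s*" + re.escape(key) + r"\s*=.*$")
    new_line = "%s=%s" % (key, value)
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        if pattern.match(line):
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Small made-up excerpt standing in for conf/neo4j.conf.
sample = (
    "#dbms.connectors.default_listen_address=0.0.0.0\n"
    "dbms.memory.heap.max_size=512m\n"
)
updated = set_conf_value(sample, "dbms.connectors.default_listen_address", "0.0.0.0")
updated = set_conf_value(updated, "dbms.memory.heap.max_size", "1g")
print(updated)
```

To apply it to the real file you would read conf/neo4j.conf, pass its text through the function, and write the result back, ideally after making a backup copy.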

8. The complete configuration file for version 3.5.6 (note: the configuration file differs from version to version)

#*****************************************************************
# Neo4j configuration
#
# For more details and a complete list of settings, please see
# https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
#*****************************************************************

# The name of the database to mount
#dbms.active_database=graph.db

# Paths of directories in the installation.
#dbms.directories.data=data
#dbms.directories.plugins=plugins
#dbms.directories.certificates=certificates
#dbms.directories.logs=logs
#dbms.directories.lib=lib
#dbms.directories.run=run

# This setting constrains all `LOAD CSV` import files to be under the `import` directory. Remove or comment it out to
# allow files to be loaded from anywhere in the filesystem; this introduces possible security problems. See the
# `LOAD CSV` section of the manual for details.
# dbms.directories.import=import

# Whether requests to Neo4j are authenticated.
# To disable authentication, uncomment this line
#dbms.security.auth_enabled=false

# Enable this to be able to upgrade a store from an older version.
#dbms.allow_upgrade=true

# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size.
dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=1g

# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 50% of RAM minus the max Java heap size.
dbms.memory.pagecache.size=5g

#*****************************************************************
# Network connector configuration
#*****************************************************************

# With default configuration Neo4j only accepts local connections.
# To accept non-local connections, uncomment this line:
dbms.connectors.default_listen_address=0.0.0.0

# You can also choose a specific network interface, and configure a non-default
# port for each connector, by setting their individual listen_address.

# The address at which this server can be reached by its clients. This may be the server's IP address or DNS name, or
# it may be the address of a reverse proxy which sits in front of the server. This setting may be overridden for
# individual connectors below.
#dbms.connectors.default_advertised_address=localhost

# You can also choose a specific advertised hostname or IP address, and
# configure an advertised port for each connector, by setting their
# individual advertised_address.

# Bolt connector
dbms.connector.bolt.enabled=true
#dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=:7687

# HTTP Connector. There can be zero or one HTTP connectors.
dbms.connector.http.enabled=true
dbms.connector.http.listen_address=:7474

# HTTPS Connector. There can be zero or one HTTPS connectors.
dbms.connector.https.enabled=true
dbms.connector.https.listen_address=:7473

# Number of Neo4j worker threads.
#dbms.threads.worker_count=

#*****************************************************************
# SSL system configuration
#*****************************************************************

# Names of the SSL policies to be used for the respective components.

# The legacy policy is a special policy which is not defined in
# the policy configuration section, but rather derives from
# dbms.directories.certificates and associated files
# (by default: neo4j.key and neo4j.cert). Its use will be deprecated.

# The policies to be used for connectors.
#
# N.B: Note that a connector must be configured to support/require
#   SSL/TLS for the policy to actually be utilized.
#
# see: dbms.connector.*.tls_level

#bolt.ssl_policy=legacy
#https.ssl_policy=legacy

#*****************************************************************
# SSL policy configuration
#*****************************************************************

# Each policy is configured under a separate namespace, e.g.
#  dbms.ssl.policy.<policyname>.*
#
# The example settings below are for a new policy named 'default'.

# The base directory for cryptographic objects. Each policy will by
# default look for its associated objects (keys, certificates, ...)
# under the base directory.
#
# Every such setting can be overridden using a full path to
# the respective object, but every policy will by default look
# for cryptographic objects in its base location.
#
# Mandatory setting

#dbms.ssl.policy.default.base_directory=certificates/default

# Allows the generation of a fresh private key and a self-signed
# certificate if none are found in the expected locations. It is
# recommended to turn this off again after keys have been generated.
#
# Keys should in general be generated and distributed offline
# by a trusted certificate authority (CA) and not by utilizing
# this mode.

#dbms.ssl.policy.default.allow_key_generation=false

# Enabling this makes it so that this policy ignores the contents
# of the trusted_dir and simply resorts to trusting everything.
#
# Use of this mode is discouraged. It would offer encryption but no security.

#dbms.ssl.policy.default.trust_all=false

# The private key for the default SSL policy. By default a file
# named private.key is expected under the base directory of the policy.
# It is mandatory that a key can be found or generated.

#dbms.ssl.policy.default.private_key=

# The private key for the default SSL policy. By default a file
# named public.crt is expected under the base directory of the policy.
# It is mandatory that a certificate can be found or generated.

#dbms.ssl.policy.default.public_certificate=

# The certificates of trusted parties. By default a directory named
# 'trusted' is expected under the base directory of the policy. It is
# mandatory to create the directory so that it exists, because it cannot
# be auto-created (for security purposes).
#
# To enforce client authentication client_auth must be set to 'require'!

#dbms.ssl.policy.default.trusted_dir=

# Client authentication setting. Values: none, optional, require
# The default is to require client authentication.
#
# Servers are always authenticated unless explicitly overridden
# using the trust_all setting. In a mutual authentication setup this
# should be kept at the default of require and trusted certificates
# must be installed in the trusted_dir.

#dbms.ssl.policy.default.client_auth=require

# It is possible to verify the hostname that the client uses
# to connect to the remote server. In order for this to work, the server public
# certificate must have a valid CN and/or matching Subject Alternative Names.

# Note that this is irrelevant on host side connections (sockets receiving
# connections).

# To enable hostname verification client side on nodes, set this to true.

#dbms.ssl.policy.default.verify_hostname=false

# A comma-separated list of allowed TLS versions.
# By default only TLSv1.2 is allowed.

#dbms.ssl.policy.default.tls_versions=

# A comma-separated list of allowed ciphers.
# The default ciphers are the defaults of the JVM platform.

#dbms.ssl.policy.default.ciphers=

#*****************************************************************
# Logging configuration
#*****************************************************************

# To enable HTTP logging, uncomment this line
#dbms.logs.http.enabled=true

# Number of HTTP logs to keep.
#dbms.logs.http.rotation.keep_number=5

# Size of each HTTP log that is kept.
#dbms.logs.http.rotation.size=20m

# To enable GC Logging, uncomment this line
#dbms.logs.gc.enabled=true

# GC Logging Options
# see http://docs.oracle.com/cd/E19957-01/819-0084-10/pt_tuningjava.html#wp57013 for more information.
#dbms.logs.gc.options=-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution

# For Java 9 and newer GC Logging Options
# see https://docs.oracle.com/javase/10/tools/java.htm#JSWOR-GUID-BE93ABDC-999C-4CB5-A88B-1994AAAC74D5
#dbms.logs.gc.options=-Xlog:gc*,safepoint,age*=trace

# Number of GC logs to keep.
#dbms.logs.gc.rotation.keep_number=5

# Size of each GC log that is kept.
#dbms.logs.gc.rotation.size=20m

# Log level for the debug log. One of DEBUG, INFO, WARN and ERROR. Be aware that logging at DEBUG level can be very verbose.
#dbms.logs.debug.level=INFO

# Size threshold for rotation of the debug log. If set to zero then no rotation will occur. Accepts a binary suffix "k",
# "m" or "g".
#dbms.logs.debug.rotation.size=20m

# Maximum number of history files for the internal log.
#dbms.logs.debug.rotation.keep_number=7

#*****************************************************************
# Miscellaneous configuration
#*****************************************************************

# Enable this to specify a parser other than the default one.
#cypher.default_language_version=3.0

# Determines if Cypher will allow using file URLs when loading data using
# `LOAD CSV`. Setting this value to `false` will cause Neo4j to fail `LOAD CSV`
# clauses that load data from the file system.
dbms.security.allow_csv_import_from_file_urls=true


# Value of the Access-Control-Allow-Origin header sent over any HTTP or HTTPS
# connector. This defaults to '*', which allows broadest compatibility. Note
# that any URI provided here limits HTTP/HTTPS access to that URI only.
#dbms.security.http_access_control_allow_origin=*

# Value of the HTTP Strict-Transport-Security (HSTS) response header. This header
# tells browsers that a webpage should only be accessed using HTTPS instead of HTTP.
# It is attached to every HTTPS response. Setting is not set by default so
# 'Strict-Transport-Security' header is not sent. Value is expected to contain
# directives like 'max-age', 'includeSubDomains' and 'preload'.
#dbms.security.http_strict_transport_security=

# Retention policy for transaction logs needed to perform recovery and backups.
dbms.tx_log.rotation.retention_policy=1 days

# Only allow read operations from this Neo4j instance. This mode still requires
# write access to the directory for lock purposes.
dbms.read_only=false

# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
#dbms.unmanaged_extension_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged

# A comma separated list of procedures and user defined functions that are allowed
# full access to the database through unsupported/insecure internal APIs.
#dbms.security.procedures.unrestricted=my.extensions.example,my.procedures.*

# A comma separated list of procedures to be loaded by default.
# Leaving this unconfigured will load all procedures found.
#dbms.security.procedures.whitelist=apoc.coll.*,apoc.load.*

#********************************************************************
# JVM Parameters
#********************************************************************

# G1GC generally strikes a good balance between throughput and tail
# latency, without too much tuning.
dbms.jvm.additional=-XX:+UseG1GC

# Have common exceptions keep producing stack traces, so they can be
# debugged regardless of how often logs are rotated.
dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow

# Make sure that `initmemory` is not only allocated, but committed to
# the process, before starting the database. This reduces memory
# fragmentation, increasing the effectiveness of transparent huge
# pages. It also reduces the possibility of seeing performance drop
# due to heap-growing GC events, where a decrease in available page
# cache leads to an increase in mean IO response time.
# Try reducing the heap memory,if this flag degrades performance.
dbms.jvm.additional=-XX:+AlwaysPreTouch

# Trust that non-static final fields are really final.
# This allows more optimizations and improves overall performance.
# NOTE: Disable this if you use embedded mode, or have extensions or dependencies that may use reflection or
# serialization to change the value of final fields!
dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
dbms.jvm.additional=-XX:+TrustFinalNonStaticFields

# Disable explicit garbage collection, which is occasionally invoked by the JDK itself.
dbms.jvm.additional=-XX:+DisableExplicitGC

# Remote JMX monitoring, uncomment and adjust the following lines as needed. Absolute paths to jmx.access and
# jmx.password files are required.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/8/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
#   http://docs.oracle.com/javase/8/docs/technotes/guides/management/security-windows.html
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.authenticate=true
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.ssl=false
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.password.file=/absolute/path/to/conf/jmx.password
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.access.file=/absolute/path/to/conf/jmx.access

# Some systems cannot discover host name automatically, and need this line configured:
#dbms.jvm.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME

# Expand Diffie Hellman (DH) key size from default 1024 to 2048 for DH-RSA cipher suites used in server TLS handshakes.
# This is to protect the server from any potential passive eavesdropping.
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048

# This mitigates a DDoS vector.
dbms.jvm.additional=-Djdk.tls.rejectClientInitiatedRenegotiation=true

#********************************************************************
# Wrapper Windows NT/2000/XP Service Properties
#********************************************************************
# WARNING - Do not modify any of these properties when an application
# using this configuration file has been installed as a service.
# Please uninstall the service before modifying this section. The
# service can then be reinstalled.

# Name of the service
dbms.windows_service_name=neo4j

#********************************************************************
# Other Neo4j system properties
#********************************************************************
dbms.jvm.additional=-Dunsupported.dbms.udc.source=tarball

Check whether Neo4j is running

Start: go to the bin directory and run ./neo4j start

Stop: go to the bin directory and run ./neo4j stop

Check status: go to the bin directory and run ./neo4j status

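Accessing Neo4j from the web

http://<server-ip>:7474/browser/

In a browser, open port 7474 on the machine that hosts the graph database. On first access the username is neo4j and the password is neo4j, and you will be prompted to change the initial password.

After setting the password, click the database icon in the upper-left corner to see what is in the graph database.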
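If the browser page does not load, it can help to check whether the Neo4j ports are reachable from the client machine at all, since a firewall or a listen address left at its default would block them. The sketch below uses only the Python standard library; `port_open` is a hypothetical helper name, and 127.0.0.1 is a placeholder for your server's address.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Default Neo4j ports mentioned in this article; replace the host with your server.
    for name, port in (("bolt", 7687), ("http", 7474), ("https", 7473)):
        print(name, port, port_open("127.0.0.1", port))
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the listener is Neo4j or that the login will succeed.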

Working with Neo4j from Python 3

Install the py2neo module

pip install py2neo

If that fails, install from GitHub instead:

pip install git+https://github.com/nigelsmall/py2neo.git

Official site: https://py2neo.org/v3/index.html

For more, see the commands documented on the official site.

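Result

[Figure: the graph produced by this example]

Brief explanation

The figure shows the result of this example.

It contains 5 nodes, 3 relationship types (7 relationships in total), and 3 kinds of properties.

Properties are set on nodes here. For example, I gave my own node a "blog address" property (部落格地址) with the value "https://www.jb51.net/".

Relationships can carry properties in the same way; the source code sets a count property on two of them.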

Complete source code

from py2neo import Graph, Node, Relationship

# Connect to the server. Replace SERVER_IP and the password with your own values.
# These keyword arguments follow the py2neo v3 documentation linked above;
# other py2neo versions may expect different arguments.
graph = Graph(host='SERVER_IP', http_port=7474, user='neo4j', password='123456')

# Empty the database: this deletes every node and relationship.
graph.delete_all()

# Create the nodes, all labelled '西遊記' (Journey to the West)
test_node_0 = Node('西遊記', name='唐僧')    # Tang Seng, the master
test_node_1 = Node('西遊記', name='孫悟空')  # Sun Wukong
test_node_2 = Node('西遊記', name='豬八戒')  # Zhu Bajie
test_node_3 = Node('西遊記', name='沙師弟')  # Sha Wujing, the junior disciple
test_node_4 = Node('西遊記', name='白龍馬')  # the White Dragon Horse

# Add a "blog address" property to one node
test_node_3.setdefault("部落格地址", 'https://shazhenyu.blog.csdn.net/')

graph.create(test_node_0)
graph.create(test_node_1)
graph.create(test_node_2)
graph.create(test_node_3)
graph.create(test_node_4)

# Create the relationships.
# 唐僧 is '師傅' (master) of each of the three disciples, each disciple is '徒弟' (disciple) of 唐僧,
# and 白龍馬 is '坐騎' (mount) of 唐僧. Two of the relationships also get a count property of 1.
node_0_node_1 = Relationship(test_node_0, '師傅', test_node_1)
node_0_node_2 = Relationship(test_node_0, '師傅', test_node_2)
node_0_node_3 = Relationship(test_node_0, '師傅', test_node_3)
node_1_node_0 = Relationship(test_node_1, '徒弟', test_node_0)
node_2_node_0 = Relationship(test_node_2, '徒弟', test_node_0)
node_3_node_0 = Relationship(test_node_3, '徒弟', test_node_0)
node_4_node_0 = Relationship(test_node_4, '坐騎', test_node_0)
node_0_node_1['count'] = 1
node_4_node_0['count'] = 1

graph.create(node_0_node_1)
graph.create(node_0_node_2)
graph.create(node_0_node_3)
graph.create(node_1_node_0)
graph.create(node_2_node_0)
graph.create(node_3_node_0)
graph.create(node_4_node_0)

# Print the graph object, the nodes, and the relationships
print(graph)
print(test_node_0)
print(test_node_1)
print(test_node_2)
print(test_node_3)
print(test_node_4)
print(node_0_node_1)
print(node_0_node_2)
print(node_0_node_3)
print(node_1_node_0)
print(node_2_node_0)
print(node_3_node_0)
print(node_4_node_0)
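To confirm what the script actually stored, one option is to ask the database how many relationships of each type exist. The sketch below is an assumption-based outline, not part of the original example: `relationship_counts` and `FakeCursor` are hypothetical names, and `run` is expected to behave like py2neo's `Graph.run`, whose cursor offers a `data()` method returning a list of dictionaries.

```python
COUNT_BY_TYPE = "MATCH ()-[r]->() RETURN type(r) AS rel_type, count(*) AS total"


def relationship_counts(run):
    """Return {relationship type: number of relationships} for the whole graph.

    `run` is any callable that takes a Cypher string and returns an object
    whose data() method yields a list of dicts.
    """
    return {row["rel_type"]: row["total"] for row in run(COUNT_BY_TYPE).data()}


class FakeCursor:
    """Stand-in for a query result, used here so the sketch runs without a server."""

    def __init__(self, rows):
        self._rows = rows

    def data(self):
        return self._rows


# Dry run with made-up rows; with a live connection, pass graph.run instead.
fake_rows = [{"rel_type": "KNOWS", "total": 2}, {"rel_type": "LIKES", "total": 1}]
print(relationship_counts(lambda cypher: FakeCursor(fake_rows)))
```

Calling `relationship_counts(graph.run)` after running the script should list each relationship type it created together with its count.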

This article has walked through installing the non-relational graph database Neo4j and connecting to and operating it from Python 3.