Neo4j from Java - Neo4j Essentials (2015)

Chapter 5. Neo4j from Java

One of the most important aspects of any software, application, or database is its ability to integrate with existing applications, together with the deployment models and options that it offers to its users.

Organizations today already have some software deployed and used by their internal or external users for managing day-to-day operations and storing data. Planning to deploy a new application therefore raises the question of how flexible it is and how well it can be integrated with the existing applications.

Here are a few such questions that organizations want to answer for any new software that is proposed to be deployed:

· Will I be able to integrate the solution with our home-grown applications? Does it support any out-of-the-box integrations?

· Do we want to make a smaller investment up-front or are we looking for an incremental return over the long term? Do we need costly, dedicated, and high-end infrastructure now or in the near future?

· What about my organization's security concerns?

· Will my deployment choice support a growing business over time?

Neo4j supports different ways to build and deploy a graph database, and the fact that it is written in Java and Scala opens up various options for testing, deployment, and configuration.

In this chapter, you will learn how to integrate Neo4j with existing Java applications. We will cover the following topics:

· Embedded versus REST

· Unit testing in Neo4j

· Java APIs

· Graph traversals

Embedded versus REST

Neo4j provides more than one deployment and configuration model for integrating with existing applications. It exposes all the required Java APIs for deploying the server as an embedded application or as a standalone, REST-based server.

Although both deployment models have their own pros and cons, the decision for production deployment model would depend upon the vision and the usage of the software.

Let's first understand each of the deployment models and then we will discuss the advantages/disadvantages and applicability of each of the models.

Embedding Neo4j in Java applications

Embedding Neo4j in a Java application should not be confused with running an in-memory database: an embedded database still stores its files on the local filesystem. Neo4j exposes all the required APIs to deploy the server as an embedded application or to configure it as an in-memory database.

Execute the following steps to configure and run Neo4j as an embedded server within your existing applications:

1. Choose the appropriate Neo4j Edition—Community or Enterprise (refer to the Licensing options section in Chapter 1, Installation and the First Query).

2. You can either manually configure your project or use Maven to configure it. Here we will define the steps to configure Maven-based Java projects; later in this section, we will also talk about the process to set up Java projects manually using an Integrated Development Environment (IDE) such as Eclipse (https://eclipse.org/), IntelliJ IDEA (https://www.jetbrains.com/idea/), NetBeans (https://netbeans.org/), and so on. Perform the following steps to create a Maven project:

1. Download Maven 3.2.3 from http://maven.apache.org/download.cgi.

2. Once the archive is downloaded, browse the directory on your filesystem and extract it.

3. Define the system environment variable M2_HOME=<location of extracted Archive file >.

4. Add $M2_HOME/bin/ to your $PATH variable.

5. Next, open your console, browse to the location where you want to create your new Maven project, and execute the following command:

mvn archetype:generate -DgroupId=neo4j.embedded.myserver -DartifactId=MyNeo4jSamples -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

6. It will take some time as Maven will download the required dependencies to create your project. You will see a MyNeo4jSamples directory being created as soon as the project is created.

7. Edit the MyNeo4jSamples/pom.xml file and add the following piece of code under the dependencies section for defining the dependencies of the Neo4j libraries:

<dependencies>
    <dependency>
        <groupId>org.neo4j</groupId>
        <artifactId>neo4j</artifactId>
        <version>2.1.5</version>
    </dependency>
</dependencies>

3. Next, create a Java class by the name neo4j.embedded.myserver.EmbeddedServer.java and add the following piece of code:

public class EmbeddedServer {

    private static void registerShutdownHook(final GraphDatabaseService graphDb) {
        // Registers a shutdown hook for the Neo4j database.
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                //Shutdown the Database
                System.out.println("Server is shutting down");
                graphDb.shutdown();
            }
        });
    }

    public static void main(String[] args) {
        //Create a new Object of Graph Database
        GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase("Location of Storing Neo4j Database Files");
        System.out.println("Server is up and Running");
        //Register a Shutdown Hook
        registerShutdownHook(graphDb);
    }
}

And we are done! We can now use the graph database instance named graphDb to create nodes, relationships, and properties.

To run the preceding code, perform the following steps:

1. Import the following Java packages in your EmbeddedServer.java program:

· org.neo4j.graphdb.GraphDatabaseService;

· org.neo4j.graphdb.factory.GraphDatabaseFactory;

2. Open the console and browse to the location where you have created your Maven project.

3. Now, execute the following command to compile and create a build:

mvn clean compile install

4. Next, to run your embedded server, execute the following Maven command in the console:

mvn exec:java -Dexec.mainClass="neo4j.embedded.myserver.EmbeddedServer" -Dexec.cleanupDaemonThreads=false

Let's understand the different sections of the preceding code:

GraphDatabaseService graphDb = new GraphDatabaseFactory()

.newEmbeddedDatabase("Location of Storing Neo4j Database Files");

The preceding piece of code initializes the Neo4j database and stores the database files in the given directory, which should be the path of the local filesystem.

//Register a Shutdown Hook

registerShutdownHook(graphDb);

The preceding statement calls a function that registers a shutdown hook for the Neo4j database so that it shuts down cleanly when the JVM exits. The hook also ensures a clean shutdown when you press Ctrl + C in the running application.
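The shutdown-hook pattern itself is plain JDK and does not depend on Neo4j. The following is a minimal sketch of it, assuming a hypothetical class named ShutdownHookSketch and using a generic AutoCloseable as a stand-in for the GraphDatabaseService. Unlike the chapter's code, it returns the hook thread so that the hook can be removed again, which is a choice made only for this illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not part of Neo4j: illustrates the JVM shutdown-hook pattern.
final class ShutdownHookSketch {

    // Registers a hook that closes the given resource when the JVM exits
    // (normal exit or Ctrl + C). Returns the hook thread so callers can remove it.
    static Thread registerShutdownHook(final AutoCloseable resource) {
        Thread hook = new Thread() {
            @Override
            public void run() {
                try {
                    resource.close();
                } catch (Exception e) {
                    System.err.println("Shutdown failed: " + e.getMessage());
                }
            }
        };
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        final AtomicBoolean closed = new AtomicBoolean(false);
        // Stand-in for a database whose close() plays the role of graphDb.shutdown()
        AutoCloseable fakeDatabase = new AutoCloseable() {
            @Override
            public void close() {
                closed.set(true);
                System.out.println("Server is shutting down");
            }
        };
        registerShutdownHook(fakeDatabase);
        System.out.println("Server is up and running");
        // When main returns, the JVM runs the hook and closes the resource.
    }
}
```

Because the hook runs on a separate thread during JVM shutdown, it should stay short and should not rely on other application threads still being alive.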

We can also configure the various database configurations using the functions provided in GraphDatabaseBuilder. For example, the database creation code can be rewritten as follows:

GraphDatabaseFactory graphFactory = new GraphDatabaseFactory();

GraphDatabaseBuilder graphBuilder = graphFactory.newEmbeddedDatabaseBuilder("Location of Storing Neo4j Database Files");

graphBuilder.loadPropertiesFromFile("location of file containing the Neo4j database properties");

GraphDatabaseService graphDb = graphBuilder.newGraphDatabase();

Note

We can also configure the Neo4j database properties using the GraphDatabaseBuilder.setConfig(…) method. For example, graphBuilder.setConfig(GraphDatabaseSettings.dump_configuration,"true");. For the available configuration parameters, refer to the attributes defined in org.neo4j.graphdb.factory.GraphDatabaseSettings.java at http://neo4j.com/docs/2.1.5/javadocs/org/neo4j/graphdb/factory/GraphDatabaseSettings.html.
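To make the idea of a properties file more concrete, here is a small JDK-only sketch that reads key/value settings into a map. The class name ConfigLoaderSketch and the cache_type entry are assumptions used only for illustration; dump_configuration is the setting mentioned in the preceding note. A map like this could be applied entry by entry through setConfig, assuming the builder accepts string keys and values.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of Neo4j: reads settings in java.util.Properties format.
final class ConfigLoaderSketch {

    // Reads key=value pairs into a map sorted by key.
    static Map<String, String> loadConfig(Reader source) {
        Properties properties = new Properties();
        try {
            properties.load(source);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read configuration", e);
        }
        Map<String, String> config = new TreeMap<String, String>();
        for (String key : properties.stringPropertyNames()) {
            config.put(key, properties.getProperty(key));
        }
        return config;
    }

    public static void main(String[] args) {
        // Illustrative content; a real file would be opened with a FileReader.
        String fileContent = "# sample settings\n"
                + "dump_configuration=true\n"
                + "cache_type=soft\n";
        Map<String, String> config = loadConfig(new StringReader(fileContent));
        for (Map.Entry<String, String> entry : config.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```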

Before moving to the next section, let's also discuss the process of manually configuring the Java project. Perform the following steps to manually configure your Java project:

1. Download and open your favorite IDE (Eclipse, IntelliJ, NetBeans, and so on), create a new Java project, and name it as MyNeo4jSamples.

2. Add all the JAR files from <$NEO4J_HOME>/lib as dependencies to your new Java project.

3. Add a new package in your Java project by the name neo4j.embedded.myserver.

4. Next, create a Java class neo4j.embedded.myserver.EmbeddedServer.java and add the same code that we discussed in step 3 for Maven-based projects.

5. Next, directly execute code from your IDE and the results will be the same as you see with your Maven-based Java projects.

Note

Using appropriate plugins, you can create and execute Maven projects from the IDE itself. For example, in Eclipse, you can install and use M2E plugins. For more information, refer to http://eclipse.org/m2e/.

Neo4j as a REST-based application

Neo4j exposes various REST endpoints for performing CRUD and search operations over the Neo4j database. In order to work with these REST endpoints, we need to have a running Neo4j server and develop a REST client that can invoke these different services.

Let's understand and perform the following steps to deploy the Neo4j server and develop the REST client for invoking REST APIs exposed by the Neo4j server:

1. Open your console and execute $NEO4J_HOME/bin/neo4j start. Wait for the command to complete successfully, then open your browser and go to http://localhost:7474/. You will see the default Neo4j browser.

Note

In case something goes wrong, then check for errors in $NEO4J_HOME/data/messages.log.

2. Let's extend our MyNeo4jSamples project, which we created in the previous section. Create Java REST clients, and then invoke the REST APIs exposed by the Neo4j server.

3. Open the MyNeo4jSamples/pom.xml file and add the following dependencies under the <dependencies> section:

<dependency>
    <groupId>com.sun.jersey</groupId>
    <artifactId>jersey-client</artifactId>
    <version>1.8</version>
</dependency>
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20140107</version>
</dependency>

4. Add a new package called neo4j.rest.client and a Java class MyRestClient.java, and import the following packages:

import org.json.JSONObject;
import javax.ws.rs.core.MediaType;
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;
import com.sun.jersey.api.client.WebResource;

5. Next, we will define a new method in MyRestClient.java for adding a new node:

public class MyRestClient {
    public void createNode() {
        //Create a REST Client
        Client client = Client.create();

        //Define a resource (REST Endpoint) which needs to be Invoked for creating a Node
        WebResource resource = client.resource("http://localhost:7474").path("/db/data/node");
        //Define properties for the node.
        JSONObject node = new JSONObject();
        //put() stores the value as a plain JSON string
        node.put("Name", "John");

        //Invoke the rest endpoint as JSON request; type() declares the request body as JSON
        ClientResponse res = resource.accept(MediaType.APPLICATION_JSON)
                .type(MediaType.APPLICATION_JSON)
                .entity(node.toString())
                .post(ClientResponse.class);
        //Print the URI of the new Node
        System.out.println("URI of New Node = " + res.getLocation());
    }
}

6. Next, we will define another method in MyRestClient.java, just above the closing brace of the class (}), and add the following code to query the database using Cypher:

public void sendCypher() {
    //Create a REST Client
    Client client = Client.create();
    //Define a resource (REST Endpoint) which needs to be Invoked
    //for executing Cypher Query
    WebResource resource = client.resource("http://localhost:7474").path("/db/data/cypher");
    //Define JSON Object and Cypher Query
    JSONObject cypher = new JSONObject();
    cypher.put("query", "MATCH (n) RETURN n LIMIT 10");

    //Invoke the rest endpoint as JSON request; type() declares the request body as JSON
    ClientResponse res = resource.accept(MediaType.APPLICATION_JSON)
            .type(MediaType.APPLICATION_JSON)
            .entity(cypher.toString())
            .post(ClientResponse.class);
    //Print the response received from the Server
    System.out.println(res.getEntity(String.class));
}

7. Now add a main method to MyRestClient that creates an object of MyRestClient and invokes the createNode() and sendCypher() methods one by one.

8. Now execute the following command to compile and create a build:

$M2_HOME/bin/mvn clean compile install

9. To run your MyRestClient, execute the following Maven command in the console:

$M2_HOME/bin/mvn exec:java -Dexec.mainClass="neo4j.rest.client.MyRestClient" -Dexec.cleanupDaemonThreads=false

The methods createNode() and sendCypher() invoke different endpoints: one for handling operations on nodes and the other for querying data with Cypher. The createNode() method prints the URI of the newly created node, which ends with its ID (/db/data/node/{Node_ID}), and sendCypher() prints the results of the Cypher query on the console in the JSON format.
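Since the node URI ends with the node ID, a client will often want to extract that ID for later requests. The following is a minimal sketch of one way to do so with the JDK only; the class NodeUriSketch and the method nodeIdFromLocation are hypothetical names introduced for this illustration and are not part of Neo4j or Jersey.

```java
import java.net.URI;

// Hypothetical helper: extracts the node ID from a URI of the form /db/data/node/{Node_ID}.
final class NodeUriSketch {

    static long nodeIdFromLocation(URI location) {
        String path = location.getPath();
        String marker = "/db/data/node/";
        int start = path.lastIndexOf(marker);
        if (start < 0) {
            throw new IllegalArgumentException("Not a node URI: " + location);
        }
        // Everything after the marker is expected to be the numeric ID
        return Long.parseLong(path.substring(start + marker.length()));
    }

    public static void main(String[] args) {
        // Example URI in the format printed by createNode(); the ID 42 is made up
        URI location = URI.create("http://localhost:7474/db/data/node/42");
        System.out.println("Node ID = " + nodeIdFromLocation(location));
    }
}
```

In the REST client, the value returned by res.getLocation() could be passed to such a helper directly.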

Similarly, there are other endpoints exposed by the server for performing operations on properties, relationships, labels, and so on.

Note

For a complete list of REST endpoints, use your rest client and execute the URL http://localhost:7474/db/data. For more details, refer to http://neo4j.com/docs/stable/rest-api.html.

Which is best?

It depends!!! Yes, that's true, it all depends.

Both the options have their own advantages and the choice/applicability of the deployment model will largely depend upon your vision, use case, and the environment. Let's discuss each of the options, their advantages and disadvantages, and then we will talk about the best deployment model.

Let's start by describing a few of the advantages of deploying Neo4j as a REST-based server:

· It enables multiple applications (deployed in separate contexts) to perform CRUD operations over a single Neo4j database

· It provides scalability and High Availability (with the Enterprise Edition)

· It allows clients written in more than one language to perform CRUD operations

· It provides an administrator console and a browser to browse data

Wow!!! It's nice but that's not all. Let's discuss some of the disadvantages of the REST-based server:

· It takes a considerable amount of time to develop an application. Lots of code needs to be written for developing small-to-medium-sized applications.

· It is slower than the embedded mode, as every request travels over HTTP.

We have not concluded yet. Let's now talk about the embedded server, which has the following advantages:

· It is robust, and well-defined and well-documented Java APIs are available for performing CRUD operations and traversals over the graph database

· It is easy to manage as it is only a single process that needs to be monitored and managed

· It provides maximum performance, as calls are local and not remote

It's nice too, but it also has its downside. Let's discuss some of the disadvantages of the embedded server:

· Neo4j database cannot be shared or used by multiple clients, which limits the scalability

· There is no admin console available

· It only supports JVM-based languages (such as Java or Scala)

Now, let's move ahead and discuss scenarios where we can think of leveraging either of the deployment models.

Scalability and High Availability are among the critical non-functional requirements (NFRs) for most enterprise systems, and that is sufficient reason to deploy Neo4j as a REST-based server and then develop REST clients to perform CRUD and search operations. This is also one of the most common deployment models for production environments, as it provides a scalable and highly available database system for your enterprise applications.

But there are applications where rapid development is a priority, or where you do not want to deploy dedicated hardware or a separate process because it would be too costly. In all those cases, Neo4j as an embedded server would be the best choice.

Considering the advantages/disadvantages and the scenarios, we would conclude that there is no single deployment model which works the best or is applicable for all scenarios. So based on the needs of your use case, decide upon the appropriate deployment model. It is good to have the requirements and vision of your application drive the selection of deployment models and not the other way round.

Unit testing in Neo4j

Unit testing is an important aspect of any development lifecycle. It not only involves the testing of the expected outcome, but at the same time, it should also test the unexpected conditions/scenarios and the behavior of the system.

In today's fast-paced development environment, delivery models such as Agile (http://en.wikipedia.org/wiki/Agile_software_development) or Extreme Programming (http://en.wikipedia.org/wiki/Extreme_programming) focus on delivering enhancements and features in the shortest possible time. A good set of unit tests helps developers ensure that adding new code does not break the integrity of the system, no matter how familiar they are with the rest of the code base. The goal of unit testing is to isolate each part of the program and show that the individual parts are correct; the same tests also serve as a regression test suite later in the development cycle.

Even in the development models such as Test-driven Development (TDD)—http://en.wikipedia.org/wiki/Test-driven_development, the developer first writes an (initially failing) automated test case that defines a desired improvement or new function, then produces the minimum amount of code to pass that test, and finally, refactors the new code to acceptable standards.

Unit testing has several benefits. We will not discuss them in detail, but a few are worth mentioning:

· Problem solving: Identifying and solving problems early in the development cycle

· Facilitates change: Unit testing allows the programmer to refactor code at a later date

· Simplifies integration: It reduces uncertainty in the units themselves

· Separation of concerns: Interfaces are separated from implementation where each method is tested as a "unit" and the rest of the dependencies are injected by some other means such as mocks

Neo4j data models, as we discussed in Chapter 3, Pattern Matching in Neo4j, evolve over a period of time, so the code that supports these evolving models also changes constantly. In these scenarios, unit testing becomes even more important: every piece of code, either added or modified, should be verified not to break the other pieces of the system.

JUnit (provided by http://junit.org) is one of the most popular frameworks for writing Java-based unit test cases.

Let's perform the following steps to integrate JUnit framework, and enhance our Maven project, MyNeo4jSamples, and then add some basic unit test cases:

1. Open your MyNeo4jSamples/pom.xml file and add the following properties just below the <packaging> tag:

<properties>
    <java.version>1.7</java.version>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
</properties>

2. Add the following dependencies in the <dependencies> section:

<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.hamcrest</groupId>
    <artifactId>hamcrest-all</artifactId>
    <version>1.3</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-kernel</artifactId>
    <version>2.1.5</version>
    <type>test-jar</type>
    <scope>test</scope>
</dependency>

Note

In case any of these dependencies already exist in your pom.xml file, then just update the version of the package defined in <version> tag.

3. Create a package neo4j.tests and a Java class Neo4jTest.java under src/test/java and add the following code:

package neo4j.tests;

import org.junit.After;
import org.junit.Before;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.test.TestGraphDatabaseFactory;

public class Neo4jTest {

    private GraphDatabaseService graphDb;

    @Before
    public void createTestDatabase() throws Exception {
        try {
            System.out.println("Set up Invoked");
            graphDb = new TestGraphDatabaseFactory().newImpermanentDatabase();
        } catch (Exception e) {
            System.err.println("Cannot setup the Test Database....check Logs for Stack Traces");
            throw e;
        }
    }

    @After
    public void cleanupTestDatabase() throws Exception {
        try {
            System.out.println("Clean up Invoked");
            graphDb.shutdown();
        } catch (Exception e) {
            System.err.println("Cannot cleanup Test Database....check Logs for Stack Traces");
            throw e;
        }
    }
}

In the preceding code, we have defined two methods createTestDatabase and cleanupTestDatabase, where the former creates a test database and the latter shuts down the database. These methods are automatically invoked by the JUnit framework and the annotations @Before and @After help JUnit framework to identify these setup and cleanup methods. Also, these methods are invoked before and after execution of each JUnit test case.

Neo4j provides a test database factory, which creates a test database with minimum configuration and helps developers focus on writing unit test cases rather than setting up the database. Here is the piece of code that creates the test database: new TestGraphDatabaseFactory().newImpermanentDatabase();

4. Next, open your console, browse to your project root, that is, MyNeo4jSamples, and execute $M2_HOME/bin/mvn test to run your test cases.

5. Your build should fail, because there are no unit tests to be executed. So now, let's add a unit test case by the name of nodeCreationWithLabel in Neo4jTest.java and add the following code:

@org.junit.Test
public void nodeCreationWithLabel() {
    //Open a Transaction
    try (Transaction transaction = graphDb.beginTx()) {
        String testLabel = "TestNode";
        //Create a Node with Label in the neo4j Database
        graphDb.createNode(DynamicLabel.label(testLabel));
        //Execute Cypher query and retrieve all Nodes from Neo4j Database
        ExecutionEngine engine = new ExecutionEngine(graphDb, StringLogger.SYSTEM);
        String query = "MATCH (n) return n";
        ExecutionResult result = engine.execute(query);
        Iterator<Object> objResult = result.columnAs("n");
        //Count the Nodes while iterating, because the result can be iterated only once
        int nodeCount = 0;
        while (objResult.hasNext()) {
            Node cypherNode = (Node) objResult.next();
            //Check that Label matches with the same Label what was created initially
            Assert.assertTrue(cypherNode.hasLabel(DynamicLabel.label(testLabel)));
            nodeCount++;
        }
        //Check that Database has only 1 Node and not more than 1 node
        Assert.assertEquals(1, nodeCount);
        transaction.success();
    }
}

The test also needs imports for org.junit.Assert, org.neo4j.graphdb.DynamicLabel, org.neo4j.graphdb.Node, org.neo4j.graphdb.Transaction, org.neo4j.cypher.ExecutionEngine, org.neo4j.cypher.ExecutionResult, org.neo4j.kernel.impl.util.StringLogger, and the Iterator type returned by columnAs (scala.collection.Iterator for the org.neo4j.cypher API).

Let's understand the preceding JUnit test:

graphDb.createNode(DynamicLabel.label(testLabel));

The preceding line creates a node in the test database with the provided label.

//Execute Cypher query and retrieve all Nodes from Neo4j Database
ExecutionEngine engine = new ExecutionEngine(graphDb, StringLogger.SYSTEM);
String query = "MATCH (n) return n";
ExecutionResult result = engine.execute(query);
Iterator<Object> objResult = result.columnAs("n");

After creating the node in the test database, the preceding code executes a Cypher query to get the results from the same database.

int nodeCount = 0;
while (objResult.hasNext()) {
    Node cypherNode = (Node) objResult.next();
    //Check that Label matches with the same Label what was created initially
    Assert.assertTrue(cypherNode.hasLabel(DynamicLabel.label(testLabel)));
    nodeCount++;
}

Next, we iterate through the results of the Cypher query and check that each returned node carries the label that we created. The nodes are counted inside the loop because the result iterator can be traversed only once; asking it for its size first would leave nothing to iterate over.

Assert.assertEquals(1, nodeCount);

Finally, the preceding code checks whether the Cypher query has returned only one node. Not more and not less!

6. Now, let's run our JUnit tests once again: execute $M2_HOME/bin/mvn test from the root of your Java project, that is, MyNeo4jSamples, and this time you should see a successful build.

Easy and simple…isn't it?

Yes it is, but the only drawback is that there is a lot of boilerplate code such as iterating through the Cypher result set, which needs to be written for each and every JUnit.

Although we can create some helper methods, they also need to check everything such as nodes, labels, properties, paths, and many more conditions, which would be a time-consuming task.
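As an illustration of such a helper method, the following sketch drains a single-use iterator into a list, so that the results can be counted and inspected as often as needed. It is written against java.util.Iterator as a stand-in; the class ResultHelperSketch is a hypothetical name, and a Cypher result iterator of a different type would need a small adapter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical test helper, not part of Neo4j or JUnit.
final class ResultHelperSketch {

    // Copies every remaining element of the iterator into a list.
    // The iterator is exhausted afterwards, but the list can be reused freely.
    static <T> List<T> toList(Iterator<T> results) {
        List<T> items = new ArrayList<T>();
        while (results.hasNext()) {
            items.add(results.next());
        }
        return items;
    }

    public static void main(String[] args) {
        // Stand-in for the rows returned by a query
        Iterator<String> rows = Arrays.asList("TestNode").iterator();
        List<String> collected = toList(rows);
        System.out.println(collected.size() + " row(s): " + collected);
    }
}
```

Even with helpers like this one, each test still has to spell out what it expects of nodes, labels, and properties, which is the gap that dedicated assertion libraries try to close.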

Testing frameworks for Neo4j

In the previous section, we talked about Neo4j and the process of writing JUnit using the framework provided by http://junit.org.

We also discussed the shortcomings of JUnit because of the boilerplate code that needs to be written, even for a simple JUnit. It not only takes time but is also difficult to maintain.

To overcome these shortcomings, frameworks such as GraphUnit (http://graphaware.com/) and AssertJ (http://joel-costigliola.github.io/assertj/assertj-neo4j.html) have evolved. They provide various assertions that reduce boilerplate code and help developers focus on writing effective and efficient unit tests.

Let's enhance our test suite, that is, neo4j.tests.Neo4jTest.java and see how we can leverage assertions provided by GraphUnit and AssertJ.

Perform the following steps to develop unit tests using GraphUnit:

1. Open your MyNeo4jSamples/pom.xml file and add the following dependencies within the <dependencies> section:

<dependency>
    <groupId>com.graphaware.neo4j</groupId>
    <artifactId>tests</artifactId>
    <version>2.1.5.25</version>
    <scope>test</scope>
</dependency>

2. Add another function by the name of compareNeo4jDatabase() and add the following code:

@org.junit.Test
public void compareNeo4jDatabase() {
    //Open a Transaction
    try (Transaction transaction = graphDb.beginTx()) {
        String testLabel = "TestNode";
        //Create a Node with Label in the neo4j Database
        graphDb.createNode(DynamicLabel.label(testLabel));
        transaction.success();
    }

    GraphUnit.assertSameGraph(graphDb, "create (n:TestNode) return n");
}

The first part of the preceding code, where we create a node in the test Neo4j database, remains the same, but the latter part is replaced with a single assertion provided by GraphUnit, which checks whether the given graph is equivalent to the graph created by the provided Cypher statement. GraphUnit also provides assertions for testing parts of graphs, that is, subgraphs. Refer to http://graphaware.com/site/framework/latest/apidocs/com/graphaware/test/unit/GraphUnit.html for available assertions with GraphUnit.

AssertJ also helps in a similar way. Perform the following steps to use assertions provided by AssertJ:

1. Open your MyNeo4jSamples/pom.xml file and add the following dependencies in the <dependencies> section:

<dependency>
    <groupId>org.assertj</groupId>
    <artifactId>assertj-neo4j</artifactId>
    <version>1.0.0</version>
    <scope>test</scope>
</dependency>

2. Add another function by the name of compareProperties() and add the following code:

//Requires: import static org.assertj.neo4j.api.Assertions.assertThat;
@org.junit.Test
public void compareProperties() {
    //Open a Transaction
    try (Transaction transaction = graphDb.beginTx()) {
        String testLabel = "TestNode";
        String testKey = "key";
        String testValue = "value";
        //Create a Node with Label in the neo4j test Database
        Node node = graphDb.createNode(DynamicLabel.label(testLabel));
        node.setProperty(testKey, testValue);

        //Check the Assertion inside the Transaction, where the node is in scope and readable
        assertThat(node).hasLabel(DynamicLabel.label(testLabel)).hasProperty(testKey, testValue);
        transaction.success();
    }
}

The preceding code first creates a node with properties in the test Neo4j database and next it uses the assertion provided by AssertJ to check whether the node is successfully created with the provided labels and properties.

The following assertions are provided by AssertJ:

· Node assertions

· Path assertions

· Relationship assertions

· PropertyContainer assertions

In this section, we discussed the importance of unit testing while working with Neo4j and the various frameworks available for unit testing the code developed for creating and traversing Neo4j graphs.

Let's move ahead to the next section where we will talk about Java APIs exposed and available in Neo4j.

Java APIs

Neo4j exposes a good number of Java APIs that are packaged and categorized based on their usages. Let's see the different categories that are available and how they are organized within Neo4j:

· Graph database: The APIs within this category contain various classes for dealing with the basic operations of the graph database such as creating database / nodes / labels, and so on. The following is the list of Java packages available in this category:

Java package | Description
org.neo4j.graphdb | This contains the core APIs to work with graphs, such as node / property / label creation
org.neo4j.graphdb.config | This package contains the classes used for modifying or setting the database configurations
org.neo4j.graphdb.event | This package contains the classes used for handling various events such as transaction management or database / kernel events
org.neo4j.graphdb.factory | This contains the factory classes for creating the database
org.neo4j.graphdb.index | This package contains the integrated API for managing legacy indexes on nodes and relationships
org.neo4j.graphdb.schema | This is a new package introduced for creating and managing schema (indexes and constraints) on graphs
org.neo4j.graphdb.traversal | This is a callback-based traversal API for graphs, which provides a choice between traversing breadth-first or depth-first

· Query language: This contains the classes for executing Cypher queries from Java code. It contains two packages: org.neo4j.cypher.export and org.neo4j.cypher.javacompat.

· Graph algorithms: This defines Java packages and classes for defining and invoking various graph algorithms. It contains only one package, org.neo4j.graphalgo.

· Management: As the name suggests, it contains JMX APIs for monitoring the Neo4j database. It contains only one package, org.neo4j.jmx.

Note

The Enterprise Edition of Neo4j comes with a new package, org.neo4j.management, which is used to provide advanced monitoring.

· Tooling: This provides the packages and classes for performing global operations over the graph database. It contains one package, org.neo4j.tooling.

· Imports: This contains the packages and classes for performing batch imports. It contains one package, org.neo4j.unsafe.batchinsert.

· Helpers: This contains the packages for providing helper classes such as common Java utilities or Iterator/Iterable utilities. It also contains only one package, org.neo4j.helpers.collection.

· Graph matching: This contains the packages and classes for pattern matching and filtering. It contains two packages: org.neo4j.graphmatching and org.neo4j.graphmatching.filter.

Note

The Enterprise Edition provides one more package, org.neo4j.backup, for performing various backups such as online, cold, and so on, of the Neo4j database.

Graph traversals

Neo4j provides a callback API for traversing the graph based on certain rules provided by the users. Basically, users can define an approach to search a graph or subgraph, which depends upon certain rules/algorithms such as depth-first or breadth-first.

Let's understand a few concepts of traversing:

· Path expanders: This defines what to traverse in terms of relationship direction and type

· Order of search: This defines whether the graph is searched depth-first or breadth-first

· Evaluator: This decides whether to return, stop, or continue after a certain point in traversal

· Starting nodes: This is the point from where the traversal will begin

· Uniqueness: This defines whether nodes (or relationships and paths) may be visited more than once during a traversal; typically, each one is visited only once
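To see how these concepts fit together, here is a JDK-only sketch that traverses a small in-memory graph. It does not use the Neo4j traversal API: the class TraversalSketch, the adjacency map, and the node names A to F are all assumptions made for this illustration. The adjacency lists play the role of the path expander, the boolean flag selects the order of search, the depth checks act as a simple evaluator, and the set of seen nodes enforces uniqueness.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, JDK-only illustration; it does not use the Neo4j traversal API.
final class TraversalSketch {

    // A node waiting to be visited, together with its distance from the start node.
    private static final class Step {
        final String node;
        final int depth;

        Step(String node, int depth) {
            this.node = node;
            this.depth = depth;
        }
    }

    // Returns the nodes in visiting order, excluding the start node,
    // going at most maxDepth hops away from it.
    static List<String> traverse(Map<String, List<String>> graph, String start,
                                 int maxDepth, boolean depthFirst) {
        List<String> visitedOrder = new ArrayList<String>();
        Set<String> seen = new HashSet<String>();      // uniqueness: visit each node once
        Deque<Step> pending = new ArrayDeque<Step>();
        pending.add(new Step(start, 0));                // starting node

        while (!pending.isEmpty()) {
            Step current = pending.removeFirst();
            if (!seen.add(current.node)) {
                continue;                               // already visited
            }
            if (current.depth > 0) {
                visitedOrder.add(current.node);         // evaluator: exclude the start node
            }
            if (current.depth >= maxDepth) {
                continue;                               // evaluator: do not go deeper
            }
            // Expander: follow outgoing edges only
            List<String> neighbours = graph.containsKey(current.node)
                    ? graph.get(current.node)
                    : Collections.<String>emptyList();
            if (depthFirst) {
                // Push in reverse so that the first neighbour is explored first
                for (int i = neighbours.size() - 1; i >= 0; i--) {
                    pending.addFirst(new Step(neighbours.get(i), current.depth + 1));
                }
            } else {
                for (String neighbour : neighbours) {
                    pending.addLast(new Step(neighbour, current.depth + 1));
                }
            }
        }
        return visitedOrder;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<String, List<String>>();
        graph.put("A", Arrays.asList("B", "C"));
        graph.put("B", Arrays.asList("D"));
        graph.put("C", Arrays.asList("E"));
        graph.put("D", Arrays.asList("F"));

        System.out.println("Depth-first:   " + traverse(graph, "A", 2, true));
        System.out.println("Breadth-first: " + traverse(graph, "A", 2, false));
    }
}
```

With a depth limit of 2, node F (three hops from A) is never reached. The depth-first run visits B, D, C, E, whereas the breadth-first run visits B, C, D, E.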

Let's continue with the Movie dataset, which we created in Chapter 3, Pattern Matching in Neo4j, and create a Java-based traverser for traversing the graph:

import org.neo4j.cypher.ExecutionEngine;

import org.neo4j.cypher.ExecutionResult;

import org.neo4j.graphdb.GraphDatabaseService;

import org.neo4j.graphdb.Node;

import org.neo4j.graphdb.Path;

import org.neo4j.graphdb.RelationshipType;

import org.neo4j.graphdb.Transaction;

import org.neo4j.graphdb.factory.GraphDatabaseFactory;

import org.neo4j.graphdb.traversal.Evaluators;

import org.neo4j.graphdb.traversal.TraversalDescription;

import org.neo4j.graphdb.traversal.Traverser;

import org.neo4j.kernel.impl.util.StringLogger;

import scala.collection.Iterator;

public class Traversals {

//This should contain the path of your Neo4j database, which is

//generally found at <$NEO4J_HOME>/data/graph.db. Java does not expand

//$NEO4J_HOME inside a string, so replace it with the actual directory

private static final String MOVIE_DB = "$NEO4J_HOME/data/graph.db";

private GraphDatabaseService graphDb;

public static void main(String[] args) {

Traversals movies = new Traversals();

movies.startTraversing();

}

private void startTraversing() {

// Initialize Graph Database

graphDb = new GraphDatabaseFactory().newEmbeddedDatabase(MOVIE_DB);

// Start a Transaction

try (Transaction tx = graphDb.beginTx()) {

// Get the TraversalDescription from the instance of the graph database

TraversalDescription trvDesc = graphDb.traversalDescription();

// Specify that the traversal uses the depth-first approach

trvDesc = trvDesc.depthFirst();

// Instructing to exclude the Start Position and include all others

// while Traversing

trvDesc = trvDesc.evaluator(Evaluators.excludeStartPosition());

// Defines the maximum depth of the traversal. The higher the integer,

// the deeper the traversal goes.

// Without this evaluator, the traversal is not limited by depth

trvDesc = trvDesc.evaluator(Evaluators.toDepth(3));

// Get a Traverser from Descriptor

Traverser traverser = trvDesc.traverse(getStartNode());

// Let us get the Paths from Traverser and start iterating or moving

// along the Path

for (Path path : traverser) {

//Print ID of Start Node

System.out.println("Start Node ID = "+ path.startNode().getId());

//Print number of relationships between Start and End Node

System.out.println("No of Relationships = " + path.length());

//Print ID of End Node

System.out.println("End Node ID = " + path.endNode().getId());

}

}

}

private Node getStartNode() {

try (Transaction tx = graphDb.beginTx()) {

ExecutionEngine engine = new ExecutionEngine(graphDb,StringLogger.SYSTEM);

ExecutionResult result = engine

.execute("match (n)-[r]->() where n.Name=\"Sylvester Stallone\" return n as RootNode");

Iterator<Object> iter = result.columnAs("RootNode");

return (Node) iter.next();

}

}

}

Now let's understand the preceding code and discuss the important sections of the code. The code defines three methods:

· public static void main(..): This main method is the entry point of the traversals and defines the flow of code

· private Node getStartNode(): This executes the Cypher query and gets the node which is used as a starting point for our traversals

· private void startTraversing(): This is the core of the code. It defines the traversal and its rules, and then iterates over the resulting paths

Let's deep dive and understand the logic of the startTraversing() method.

TraversalDescription trvDesc = graphDb.traversalDescription();

The preceding line of code fetches an instance of TraversalDescription, with all its default values, from the instance of the graph database. TraversalDescription defines the behavior of a traversal and provides various methods for defining the rules that are applied while traversing the graph. There are two types of traversal descriptions: TraversalDescription and BidirectionalTraversalDescription. Both can be retrieved from the graphDb object.

trvDesc = trvDesc.depthFirst();

trvDesc = trvDesc.evaluator(Evaluators.excludeStartPosition());

trvDesc = trvDesc.evaluator(Evaluators.toDepth(3));

The preceding statements define three of the most common rules that are followed during a traversal:

· depthFirst: This is rule 1, which makes the traverser follow each branch as deep as it can before backtracking to explore the next one. You can use breadthFirst instead to visit all the nodes at one depth before moving on to the next depth.

· excludeStartPosition: This is rule 2, which excludes the start position (the zero-length path that contains only the starting node) from the results and includes all other positions.

· toDepth(3): This is rule 3, which limits the depth of the traversal. Paths with up to three relationships from the starting node are included and nothing beyond that depth is explored. If no depth is specified, the traverser follows every path reachable from the start node, which can mean the complete graph.

You can also provide the uniqueness filter, which ensures uniqueness among the visited nodes/paths. For example, trvDesc.uniqueness(Uniqueness.NODE_GLOBAL) defines that a node should not be visited more than once during the traversing process.
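
The combined effect of these rules can be illustrated with a small plain-Java sketch. It is not Neo4j code: DepthLimitSketch, its methods, and the toy graph are hypothetical names used only here. Under those assumptions, it walks a graph depth-first, skips the zero-length start path (like excludeStartPosition), stops expanding at a maximum depth (like toDepth), and never enters a node twice (like node-global uniqueness).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; none of these names belong to the Neo4j API.
class DepthLimitSketch {

    // Hypothetical toy graph: a chain A -> B -> C -> D -> E, plus B -> A.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("A", Arrays.asList("B"));
        g.put("B", Arrays.asList("A", "C"));
        g.put("C", Arrays.asList("D"));
        g.put("D", Arrays.asList("E"));
        return g;
    }

    // Returns every path found from 'start', each as a list of node names.
    static List<List<String>> paths(Map<String, List<String>> graph,
                                    String start, int maxDepth) {
        List<List<String>> found = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        visited.add(start);
        List<String> path = new ArrayList<>();
        path.add(start);
        walk(graph, path, maxDepth, visited, found);
        return found;
    }

    private static void walk(Map<String, List<String>> graph, List<String> path,
                             int maxDepth, Set<String> visited,
                             List<List<String>> found) {
        int depth = path.size() - 1; // number of relationships in the path
        if (depth > 0) {
            found.add(new ArrayList<>(path)); // the start position is excluded
        }
        if (depth == maxDepth) {
            return; // prune: do not go deeper than maxDepth
        }
        String last = path.get(path.size() - 1);
        for (String next : graph.getOrDefault(last, Collections.<String>emptyList())) {
            if (!visited.add(next)) {
                continue; // node already visited somewhere in this traversal
            }
            path.add(next);
            walk(graph, path, maxDepth, visited, found);
            path.remove(path.size() - 1);
        }
    }

    public static void main(String[] args) {
        for (List<String> p : paths(sampleGraph(), "A", 3)) {
            System.out.println(p + " has " + (p.size() - 1) + " relationship(s)");
        }
    }
}
```

Starting from A with a maximum depth of 3, the sketch returns three paths ending at B, C, and D. Node E lies at depth 4 and is never reached, and the relationship from B back to A is skipped because A has already been visited.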

We can also filter on the relationships and their direction. For example, we can add the following piece of code in Traversals.java for applying filters on relationships.

Add an Enum in Traversals.java:

public enum RelTypes implements RelationshipType {

ACTED_IN, DIRECTED

}

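Then add the following code as the fourth rule for our traversal descriptor. It also needs an import of org.neo4j.graphdb.Direction, and the result has to be assigned back to trvDesc because each of these methods returns a new TraversalDescription instead of modifying the existing one:

trvDesc = trvDesc.relationships(RelTypes.ACTED_IN, Direction.BOTH);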

Note

Refer to org.neo4j.graphdb.traversal.Evaluators at http://neo4j.com/docs/2.1.5/javadocs/org/neo4j/graphdb/traversal/Evaluators.html for understanding the various rules provided by Neo4j.

After defining the rules, we get an instance of Traverser from the TraversalDescription by providing a starting node, and then we start traversing the graph. Traverser instances are lazy: they do not start traversing or scanning until we start iterating over them, so obtaining one is cheap even when the graph contains thousands of nodes.
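
Lazy evaluation of this kind can be sketched in plain Java as follows. This is only an analogy for how a lazy traverser behaves, not Neo4j's implementation; the class LazyWalk and its expanded counter are hypothetical names introduced for this illustration. The iterator does no graph work when it is created and expands exactly one node per call to next().

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Plain-Java illustration only; this is not how Neo4j implements Traverser.
class LazyWalk implements Iterator<String> {

    private final Map<String, List<String>> graph;
    private final Deque<String> pending = new ArrayDeque<>();
    private final Set<String> seen = new HashSet<>();

    // Counts how many nodes have actually been expanded so far.
    int expanded = 0;

    LazyWalk(Map<String, List<String>> graph, String start) {
        this.graph = graph;
        // Only the start node is queued here; nothing is expanded yet.
        pending.add(start);
        seen.add(start);
    }

    @Override
    public boolean hasNext() {
        return !pending.isEmpty();
    }

    @Override
    public String next() {
        if (pending.isEmpty()) {
            throw new NoSuchElementException();
        }
        String node = pending.poll();
        expanded++;
        // The neighbours of a node are looked up only when that node is returned.
        for (String n : graph.getOrDefault(node, Collections.<String>emptyList())) {
            if (seen.add(n)) {
                pending.add(n);
            }
        }
        return node;
    }
}
```

A caller that stops iterating after the first few results therefore pays only for the nodes it actually consumed.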

Apart from traversals, Neo4j also provides a graph algorithm factory, org.neo4j.graphalgo.GraphAlgoFactory, that exposes the following prebuilt graph algorithms:

· GraphAlgoFactory.shortestPath(…): This returns the algorithm that finds the shortest paths, that is, the paths with the fewest relationships, between two given nodes

· GraphAlgoFactory.allPaths(…): This returns the algorithm that finds all the possible paths between two given nodes

· GraphAlgoFactory.allSimplePaths(…): This returns the algorithm that finds all simple paths, that is, paths in which no node occurs more than once, between two given nodes

· GraphAlgoFactory.dijkstra(…): This returns the algorithm that finds the path with the lowest cost between two given nodes

· GraphAlgoFactory.aStar(…): This returns the algorithm that finds the cheapest path between two given nodes by using the A* algorithm, which is guided by an estimate of the remaining cost

· GraphAlgoFactory.pathsWithLength(…): This returns the algorithm that finds all simple paths of a certain length between two given nodes
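
The difference between a shortest path (fewest relationships) and a lowest-cost path (smallest total weight) is easy to miss, so here is a minimal plain-Java sketch of both ideas. It does not call GraphAlgoFactory; the class PathCostSketch, its methods, and the weighted toy graph are hypothetical and serve only to show that the two notions can give different answers on the same graph.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Plain-Java illustration only; none of these names belong to the Neo4j API.
class PathCostSketch {

    // Hypothetical weighted graph: A -> C costs 10, A -> B and B -> C cost 1 each.
    static Map<String, Map<String, Integer>> sampleGraph() {
        Map<String, Map<String, Integer>> g = new HashMap<>();
        Map<String, Integer> fromA = new HashMap<>();
        fromA.put("C", 10);
        fromA.put("B", 1);
        g.put("A", fromA);
        g.put("B", Collections.singletonMap("C", 1));
        return g;
    }

    // Breadth-first search: smallest number of relationships, or -1 if unreachable.
    static int fewestHops(Map<String, Map<String, Integer>> edges, String from, String to) {
        Map<String, Integer> hops = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        hops.put(from, 0);
        queue.add(from);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.equals(to)) {
                return hops.get(node);
            }
            for (String next : edges.getOrDefault(node,
                    Collections.<String, Integer>emptyMap()).keySet()) {
                if (!hops.containsKey(next)) {
                    hops.put(next, hops.get(node) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    // Dijkstra's algorithm: smallest total cost, or -1 if unreachable.
    // Assumes that all costs are non-negative.
    static int lowestCost(Map<String, Map<String, Integer>> edges, String from, String to) {
        Map<String, Integer> settled = new HashMap<>();
        PriorityQueue<Map.Entry<String, Integer>> queue =
                new PriorityQueue<Map.Entry<String, Integer>>(
                        (a, b) -> Integer.compare(a.getValue(), b.getValue()));
        queue.add(new AbstractMap.SimpleEntry<String, Integer>(from, 0));
        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> head = queue.poll();
            String node = head.getKey();
            if (settled.containsKey(node)) {
                continue; // a cheaper route to this node was already settled
            }
            settled.put(node, head.getValue());
            if (node.equals(to)) {
                return head.getValue();
            }
            for (Map.Entry<String, Integer> e : edges.getOrDefault(node,
                    Collections.<String, Integer>emptyMap()).entrySet()) {
                if (!settled.containsKey(e.getKey())) {
                    queue.add(new AbstractMap.SimpleEntry<String, Integer>(
                            e.getKey(), head.getValue() + e.getValue()));
                }
            }
        }
        return -1;
    }
}
```

On the toy graph, the path with the fewest relationships from A to C is the direct one (one relationship, cost 10), whereas the lowest-cost path goes through B (two relationships, total cost 2).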

In this section, you learned about the various Java APIs and methods exposed by Neo4j for traversing and searching Neo4j graphs.

Summary

In this chapter, you have learned about the various deployment models, their applicability, and the tools, utilities, and frameworks available for unit testing in Neo4j. Lastly, we also talked about the different Java APIs exposed by Neo4j and its traversal framework for searching and finding data in Neo4j graphs.

In the next chapter, we will discuss the integration of Spring and Neo4j.