Learn Cypher by Doing and Reading “Graph Data Processing with Cypher”

A Book Review

Sixing Huang
4 min read · Dec 26, 2022
Photo by Thought Catalog on Unsplash

With the fast spread of knowledge graphs, graph databases such as Neo4j are becoming trendy, and Neo4j’s query language Cypher is gaining popularity along with them. Cypher is easy to learn. It is similar to SQL, and its syntax is inspired by ASCII art: arrows represent relationships and parentheses represent nodes. Finally, there are ample learning materials on the internet.
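To make the ASCII-art idea concrete, here is a minimal sketch. The `Person` label, the `KNOWS` relationship type, and the `name` property are hypothetical and serve only as illustration.

```cypher
// Parentheses draw nodes; the arrow with square brackets draws a relationship.
// Hypothetical schema: (:Person)-[:KNOWS]->(:Person)
MATCH (p:Person)-[:KNOWS]->(friend:Person)
WHERE p.name = 'Alice'
RETURN friend.name AS friendName
ORDER BY friendName;
```

The `MATCH ... WHERE ... RETURN` shape plays roughly the role that `SELECT ... FROM ... WHERE` plays in SQL, except that the join is drawn as a pattern instead of being spelled out.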

We can learn Cypher either by doing or by reading. If you go down the first path, your primary goal is to finish various Neo4j projects, and you pick up the language bit by bit along the way. But without a systematic overview, it is hard to know all the intricate details, and you only learn the parts of the language that your projects expose you to. This is similar to how babies learn their native languages: through daily interactions, without diving into grammar books first. That may explain why native speakers often know the ‘how’ but not the ‘why’ of their language.

In contrast, you can build a foundation with introductory books or videos first. This second way gives you a quick overview and lets you know what is possible or not with the language. Afterward, you can turn the readings into knowledge and wisdom by getting your hands dirty.

I began my Cypher/Neo4j journey with projects. Tutorials are abundant on the internet. And I have found that writing my projects up on medium.com, such as 1, 2, 3, and 4, is a good way to document and consolidate my learning. I had a lot of fun, too. But I noticed that I had to google the basics quite often. So I decided to read a book to complement my Cypher education.

Figure 1. Cover of Graph Data Processing with Cypher by Ravindranatha Anthapu. Image by the publisher.

Serendipitously, the publisher Packt asked me to review its new book Graph Data Processing with Cypher by Ravindranatha Anthapu (Figure 1). This is an introductory Cypher book that requires no prior knowledge of the language. And it turns out to be an excellent Cypher manual.

Because the book assumes no prior knowledge, it begins with the absolute basics, such as graph theory, the installation of Neo4j Desktop, and a first look at the Neo4j Browser. Then it works through the Cypher language, from data loading, querying, and tuning to APOC. In the querying part, the author devotes considerable space to syntax, filtering, sorting, aggregations, and two important data types: lists and maps.

But the book is not just a dry learning-by-reading exercise. Anthapu uses data generated by the Synthea project to build a graph on which he demonstrates each concept. Readers who follow all the steps get valuable hands-on experience. Synthea simulates the relationships among patients, healthcare providers, medications, and encounters. The data is realistic but not real, so the author can use it without the constraints that apply to protected health information (PHI) and personally identifiable information (PII). By the way, I have also used Synthea before in my Doctor.ai project.

My read-through has not only reinforced my Cypher knowledge but also filled a lot of gaps in my understanding. For example, the book mentions that the MERGE operation is not thread-safe. That is, running MERGE in parallel can create duplicates of the same node. But the book also points to a way out of this: constraints. It also taught me how to simulate IF conditions with FOREACH or APOC. I knew that indexing and labeling have huge impacts on query performance. But I was surprised to learn that we should create distinct relationship types and, when we query, avoid specifying node labels on any node except the anchor node. Because a distinct relationship type already indicates the label of the node at its other end, writing that label out only adds one extra step to the execution without any benefit. Also, the book advises us to filter on boolean values and to leverage the count stores to get counts quickly.
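The points above can be sketched in a few short statements. This is my own illustration, not code from the book: the `Patient` and `Deceased` labels, the `HAS_ENCOUNTER` relationship type, and the `id`, `deceased`, and `createdAt` properties are all hypothetical, and the constraint syntax assumes Neo4j 4.4 or later.

```cypher
// 1. A uniqueness constraint keeps concurrent MERGE calls from creating duplicates.
CREATE CONSTRAINT patient_id_unique IF NOT EXISTS
FOR (p:Patient) REQUIRE p.id IS UNIQUE;

// With the constraint in place, MERGE on the constrained property is safe to run in parallel.
MERGE (p:Patient {id: 'p-001'})
ON CREATE SET p.createdAt = datetime();

// 2. Simulating IF with FOREACH: the CASE yields a one-element list when the
//    condition holds and an empty list otherwise, so the SET runs once or not at all.
MATCH (p:Patient {id: 'p-001'})
FOREACH (ignored IN CASE WHEN p.deceased = true THEN [1] ELSE [] END |
  SET p:Deceased
);

// 3. Label only the anchor node; the distinct relationship type identifies the rest.
MATCH (p:Patient {id: 'p-001'})-[:HAS_ENCOUNTER]->(e)
RETURN e;

// 4. Simple counts like these can typically be answered from the count stores
//    without scanning the nodes or relationships themselves.
MATCH (p:Patient) RETURN count(p) AS patients;
MATCH ()-[r:HAS_ENCOUNTER]->() RETURN count(r) AS encounters;
```

Prefixing any of these queries with `PROFILE` or `EXPLAIN` in the Neo4j Browser shows the execution plan, which is a convenient way to check whether the extra label check or the count-store lookup actually appears.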

Conclusion

There are many free Cypher handbooks online. The most popular one is probably the official Cypher guide from Neo4j. But for an under-appreciated language like Cypher, there will never be enough good materials. Although it contains some typos, this book brings all the Cypher basics into one place with a reproducible project. It is beginner-friendly. The structure is clear. And the author guides readers smoothly from the basics to advanced topics.

If you need to start learning Cypher, start here.

Sixing Huang

A Neo4j Ninja, German bioinformatician at Gemini Data. I like to try things: Cloud, ML, satellite imagery, Japanese, plants, and travel the world.