First Contact
Explore foundational concepts of graph data structures in Elixir by learning key terms such as nodes and edges. Understand the differences between property graphs and RDF models to prepare for building and querying graphs effectively.
Graphs as networks
Let’s establish some of the terms we’ll use. When we talk about graphs here, we mean networks, not charts. The more widely understood sense of “graph” is a diagram or plot of one quantity against another, but that’s not our concern. We’ll use the term in its other sense: a data structure that models relationships between things. That is, we have a set of things and another set of relationships between those things, and those two sets together constitute a graph. In its most general sense, a graph is just a data model for relating a collection of things.
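The "two sets" idea can be sketched in a few lines of Elixir. This is only a minimal illustration using the standard library's `MapSet`. The module name `TwoSetGraph` and its functions are hypothetical and do not come from any graph package.

```elixir
defmodule TwoSetGraph do
  # Hypothetical helper module for illustration; not part of any library.

  # A graph is a set of things (nodes) plus a set of relationships (edges).
  # Each edge is a {from, to} pair of nodes.
  def new(nodes, edges) do
    %{nodes: MapSet.new(nodes), edges: MapSet.new(edges)}
  end

  # A graph is well formed when every edge relates two known nodes.
  def well_formed?(%{nodes: nodes, edges: edges}) do
    Enum.all?(edges, fn {from, to} ->
      MapSet.member?(nodes, from) and MapSet.member?(nodes, to)
    end)
  end

  # The nodes reachable from `from` by following a single edge.
  def neighbors(%{edges: edges}, from) do
    for {^from, to} <- edges, do: to
  end
end

graph = TwoSetGraph.new([:alice, :bob, :carol], [{:alice, :bob}, {:bob, :carol}])

IO.inspect(TwoSetGraph.well_formed?(graph))
# true
IO.inspect(TwoSetGraph.neighbors(graph, :bob))
# [:carol]
```

Nothing here is specific to any graph model. It only shows that a set of things and a set of pairs is already enough to ask simple questions about a graph.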
Network vs. Graph
Network | Graph |
Network is the term used in network science. | Graph is the preferred term used in graph theory. |
A network is an engineering implementation of a graph and is typically a dynamical system concerned with a flow, or flows, through the structure. | A graph is typically understood holistically as a static construct. |
Vertex/edge—what?
As noted, there are two components in a graph—things and relationships. In practice, we’ll find many different terms for these graph building blocks:
- vertex/edge: These terms are used in graph theory, a formal theory in math.
- node/link: These terms are used in network science.
- node/relationship: These terms are used in the Neo4j graph database.
- node/arc: These terms are used in the RDF graph data model.
- dot/line: These terms are sometimes used for descriptiveness.
- object/arrow: These terms are used in category theory, a foundational theory in math (a category is just a graph with additional structure).
But it really doesn’t matter which terms we use. It’s probably best to keep the vertex/edge pairing when dealing with topics in graph theory and to use the node/link pairing when talking about networks. We, however, are going to use the terms node and edge, although we’ll sometimes use the term vertex for node.
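The choice of vocabulary shows up in real APIs too. As one example, the `:digraph` module in Erlang/OTP's standard library, which is callable from Elixir, uses the vertex/edge pairing. The sketch below is only an illustration of that naming. The wrapper module `VertexEdgeDemo` is hypothetical, and nothing here implies that this is the package used later on.

```elixir
defmodule VertexEdgeDemo do
  # Hypothetical wrapper for illustration; only the :digraph calls are real.
  def summary do
    graph = :digraph.new()

    # Vertices: what we will usually call nodes.
    :digraph.add_vertex(graph, :alice)
    :digraph.add_vertex(graph, :bob)
    :digraph.add_vertex(graph, :carol)

    # Edges are directed, from the first vertex to the second.
    :digraph.add_edge(graph, :alice, :bob)
    :digraph.add_edge(graph, :bob, :carol)

    result = {
      :digraph.no_vertices(graph),
      :digraph.no_edges(graph),
      :digraph.out_neighbours(graph, :alice)
    }

    # :digraph keeps its data in ETS tables, so release it when done.
    :digraph.delete(graph)
    result
  end
end

IO.inspect(VertexEdgeDemo.summary())
# {3, 2, [:bob]}
```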
Graph models
Of course, there is more to all this than just nodes and edges. There are code libraries for modeling graphs and for running graph algorithms, and there are graph databases that implement particular graph models. Graph databases support two main graph models: the property graph model, sometimes referred to as the labeled property graph, and the RDF model. The following table compares the features of the two models directly:
Feature Comparison Between Property and RDF Graph Models
Features | Property Graph | RDF |
Sponsor | Industry | W3C |
Standards | No | Yes |
Field of origin | Database | Documents (web) |
Published | 2007? | 1999 |
Strength | Graph exploration | Data integration |
Query language | Cypher, Gremlin | SPARQL |
Names | System | Global (IRI) |
Node annotations | Attributes | Edges (with string nodes) |
Edge annotations | Attributes | None |
This table is an oversimplification of the current position. For example, although property graphs did not originate in a standards process, there is ongoing work to surface aspects of the model within various standards bodies (for example, the ISO work on the GQL query language). At the same time, there is the development of RDF* (with SPARQL*), also written RDF-star and SPARQL-star, which seeks to close the gap between property graphs and RDF graphs by addressing the edge annotation problem.
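To make the annotation difference concrete, here is a rough sketch of how the same small graph might be held in each style using plain Elixir data. The data shapes, the module name `GraphModelSketch`, and the `example.org` IRIs are assumptions made for illustration. Real property graph and RDF libraries use their own structures, and a real RDF library would distinguish IRIs from literals rather than using bare strings for both.

```elixir
defmodule GraphModelSketch do
  # Hypothetical data shapes for illustration only; real libraries differ.

  # Property graph style: nodes and edges each carry their own attributes,
  # and node identifiers (1, 2) are local to the system.
  def property_graph do
    %{
      nodes: %{
        1 => %{label: "Person", name: "Alice"},
        2 => %{label: "Person", name: "Bob"}
      },
      edges: [
        %{from: 1, to: 2, type: "KNOWS", since: 2019}
      ]
    }
  end

  # RDF style: the graph is a list of {subject, predicate, object} triples.
  # Nodes are named with global IRIs, and a node annotation such as a name
  # is just another edge pointing at a string (literal) node.
  def rdf_triples do
    [
      {"http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"},
      {"http://example.org/bob", "http://xmlns.com/foaf/0.1/name", "Bob"},
      {"http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
       "http://example.org/bob"}
    ]
  end

  # Reading an edge attribute is direct in the property graph sketch.
  def edge_attribute(graph, from, to, key) do
    case Enum.find(graph.edges, fn e -> e.from == from and e.to == to end) do
      nil -> nil
      edge -> Map.get(edge, key)
    end
  end

  # In the triple sketch, everything is found by matching on triples.
  def objects(triples, subject, predicate) do
    for {^subject, ^predicate, object} <- triples, do: object
  end
end

pg = GraphModelSketch.property_graph()
IO.inspect(GraphModelSketch.edge_attribute(pg, 1, 2, :since))
# 2019

triples = GraphModelSketch.rdf_triples()

IO.inspect(
  GraphModelSketch.objects(
    triples,
    "http://example.org/alice",
    "http://xmlns.com/foaf/0.1/name"
  )
)
# ["Alice"]
```

Note that the `since: 2019` attribute on the `KNOWS` edge has no direct counterpart in the plain triple list. A triple has no slot for extra data about itself, which is the edge annotation problem mentioned above.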
We'll look more closely at the different graph models as we explore the actual graph packages.