Loading and saving RDF

Reading RDF files

RDF data can be written in various syntaxes (Turtle, RDF/XML, N3, N-Triples, TriX, JSON-LD, etc.). The simplest is N-Triples, which puts one triple on each line.

Create the file demo.nt in the current directory with these two lines in it:

<http://example.com/drewp> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.com/drewp> <http://example.com/says> "Hello World" .

On line 1, this file says “drewp is a FOAF Person”. On line 2, it says “drewp says ‘Hello World’”.

RDFLib can guess the format from the file ending (“.nt” is commonly used for N-Triples), so you can just call parse() to read the file. If the file had a non-standard ending, you could set the keyword parameter format to either an Internet Media Type or a format name (the RDFLib documentation lists the available parsers).

In an interactive Python interpreter, try this:

from rdflib import Graph

g = Graph()
g.parse("demo.nt")

print(len(g))
# prints: 2

import pprint
for stmt in g:
    pprint.pprint(stmt)
# prints:
# (rdflib.term.URIRef('http://example.com/drewp'),
#  rdflib.term.URIRef('http://example.com/says'),
#  rdflib.term.Literal('Hello World'))
# (rdflib.term.URIRef('http://example.com/drewp'),
#  rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'),
#  rdflib.term.URIRef('http://xmlns.com/foaf/0.1/Person'))

The final lines show how RDFLib represents the two statements in the file: the statements themselves are just length-3 tuples (“triples”) and the subjects, predicates, and objects of the triples are all rdflib types.

Reading remote RDF

Reading graphs from the Internet is easy:

from rdflib import Graph

g = Graph()
g.parse("http://www.w3.org/People/Berners-Lee/card")
print(len(g))
# prints: 86 (the exact count depends on the current contents of the remote document)

rdflib.Graph.parse() can process local files, remote data via a URL, as in this example, or RDF data in a string (using the data parameter).

Saving RDF

To store a graph in a file, use the rdflib.Graph.serialize() method:

from rdflib import Graph

g = Graph()
g.parse("http://www.w3.org/People/Berners-Lee/card")
g.serialize(destination="tbl.ttl")

This parses data from http://www.w3.org/People/Berners-Lee/card and stores it in the file tbl.ttl in the current directory using the Turtle format, which is the default RDF serialization as of rdflib 6.0.0.

To read the same data and save it as an RDF/XML string in the variable v, do this:

from rdflib import Graph

g = Graph()
g.parse("http://www.w3.org/People/Berners-Lee/card")
v = g.serialize(format="xml")

The following table lists the RDF formats you can serialize data to with rdflib, out of the box, and the format=KEYWORD keyword used to reference them within serialize():

RDF Format	Keyword	Notes
Turtle	turtle, ttl or turtle2	turtle2 is just turtle with more spacing & linebreaks
RDF/XML	xml or pretty-xml	Was the default format, rdflib < 6.0.0
JSON-LD	json-ld	There are further options for compact syntax and other JSON-LD variants
N-Triples	ntriples, nt or nt11	nt11 is exactly like nt, only UTF-8 encoded
Notation-3	n3	N3 is a superset of Turtle that also caters for rules and a few other things
TriG	trig	Turtle-like format for RDF triples + context (RDF quads) and thus multiple graphs
TriX	trix	RDF/XML-like format for RDF quads
N-Quads	nquads	N-Triples-like format for RDF quads

Working with multi-graphs

To read and query multi-graphs, that is, context-aware RDF data, you need to use the rdflib.ConjunctiveGraph or rdflib.Dataset class. These are extensions of rdflib.Graph that handle quads (triples plus graph IDs).

Suppose you had this multi-graph data file, demo.trig, in the TriG format, using the new-style PREFIX statement rather than the older @prefix:

PREFIX eg: <http://example.com/person/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

eg:graph-1 {
    eg:drewp a foaf:Person .
    eg:drewp eg:says "Hello World" .
}

eg:graph-2 {
    eg:nick a foaf:Person .
    eg:nick eg:says "Hi World" .
}

You could parse the file and query it like this:

from rdflib import Dataset
from rdflib.namespace import RDF

ds = Dataset()
ds.parse("demo.trig")

# Use distinct names so the loop variable does not rebind the dataset
for s, p, o, graph_name in ds.quads((None, RDF.type, None, None)):
    print(s, graph_name)

This will print out:

http://example.com/person/drewp http://example.com/person/graph-1
http://example.com/person/nick http://example.com/person/graph-2
