Friday, August 12, 2005

Adding context to SeRQL

We're currently in the middle of designing and implementing Sesame 2.0, and one of the big things in this release is context support (or named graph support, if you prefer that term).

There are two sides to such a mechanism: one is the maintenance APIs and the way statement context is physically stored, the other is how to expose contexts in the query language. We've got the first part pretty much figured out, and are currently thinking about how to fit the second into SeRQL.

Like SPARQL, SeRQL will distinguish between a default context and zero or more named contexts. How these contexts relate (for example, whether the default context contains the named contexts or is separate from them) will be a configurable feature of the store, so it is up to the publisher to decide what goes where. By default, the default context will contain the union of all named contexts, and all inferred triples will be part of the default context as well. This setting gives the best 'backward compatible' behaviour for SeRQL, as non-contextualized queries will simply query the entire store.
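
To make the default behaviour concrete, here is a minimal Python sketch. It is not Sesame code: the `ContextStore` class, its method names and the `ex:` data are all made up for illustration. It only assumes what is described above, namely that statements live in named contexts and that the default context is the union of those contexts plus the inferred triples.

```python
from collections import defaultdict


class ContextStore:
    """Toy in-memory store; a hypothetical illustration, not the Sesame API."""

    def __init__(self):
        # context name -> set of (subject, predicate, object) triples
        self.named = defaultdict(set)
        # inferred triples, which belong to the default context only
        self.inferred = set()

    def add(self, triple, context):
        self.named[context].add(triple)

    def add_inferred(self, triple):
        self.inferred.add(triple)

    def default_context(self):
        # Default behaviour: union of all named contexts plus inferred triples
        result = set(self.inferred)
        for triples in self.named.values():
            result |= triples
        return result

    def match(self, pattern, context=None):
        """Return triples matching pattern; None in a position is a wildcard.

        With context=None the default context (the whole store) is queried.
        """
        if context is None:
            source = self.default_context()
        else:
            source = self.named.get(context, set())
        return {t for t in source
                if all(p is None or p == v for p, v in zip(pattern, t))}


store = ContextStore()
store.add(("ex:alice", "foaf:name", "Alice"), context="ex:graph1")
store.add(("ex:bob", "foaf:name", "Bob"), context="ex:graph2")
store.add_inferred(("ex:alice", "rdf:type", "foaf:Person"))

# A non-contextualized query sees every statement in the store
all_names = store.match((None, "foaf:name", None))
# A contextualized query sees only the statements of one named context
graph1_names = store.match((None, "foaf:name", None), context="ex:graph1")
```

In this sketch `all_names` holds both name statements while `graph1_names` holds only the one stored in `ex:graph1`, which is the 'backward compatible' effect: a query that never mentions a context behaves as if contexts did not exist.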

Syntax, then. We are considering several candidate syntaxes. Here is the SPARQL example query from section 8.4 of the SPARQL Working Draft, rendered in each of the SeRQL alternatives:

  1. CONTEXT keyword:
    SELECT name, mbox, date    
    FROM  
      {g} dc:publisher {name};    
          dc:date {date},    
      CONTEXT g ( {person} foaf:name {name};    
                           foaf:mbox {mbox} )
  2. Multiple FROM clauses:
    SELECT name, mbox, date
    FROM
      {g} dc:publisher {name};
          dc:date {date}
    FROM g
      {person} foaf:name {name};
               foaf:mbox {mbox} 
  3. Virtual reification through a built-in 'serql:context' operator:
    SELECT name, mbox, date
    FROM 
      {{person} foaf:name {name};
                foaf:mbox {mbox}} serql:context {g} dc:publisher {name};
                                                    dc:date {date}
Each syntax variant has its own advantages and disadvantages. My current favourite is the multiple-FROM option, since it seems to be the one that is easiest for the user to read. The other syntaxes suffer a bit from rampant bracket-opening-and-closing.
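
Whichever syntax wins, all three variants are meant to describe the same query. As a rough illustration of what that query asks for, the following Python sketch evaluates it by hand over a few made-up statements. The data, the `ex:` names and the `objects` and `run_query` helpers are invented for this example, and the default context is modelled as the union of all named contexts, as in the default behaviour described earlier.

```python
# Each statement is (subject, predicate, object, context); all data is made up.
quads = [
    ("ex:g1", "dc:publisher", "Alice", "ex:meta"),
    ("ex:g1", "dc:date", "2005-08-12", "ex:meta"),
    ("ex:a", "foaf:name", "Alice", "ex:g1"),
    ("ex:a", "foaf:mbox", "mailto:alice@example.org", "ex:g1"),
    # Same name, but stored in a context that has no publisher/date metadata
    ("ex:b", "foaf:name", "Alice", "ex:g2"),
    ("ex:b", "foaf:mbox", "mailto:other@example.org", "ex:g2"),
]


def objects(subject, predicate, context=None):
    """Objects of matching statements; context=None means the default context."""
    return [o for s, p, o, c in quads
            if s == subject and p == predicate
            and (context is None or c == context)]


def run_query():
    rows = []
    # {g} dc:publisher {name}; dc:date {date}   (matched in the default context)
    for g, p, name, _ in quads:
        if p != "dc:publisher":
            continue
        for date in objects(g, "dc:date"):
            # {person} foaf:name {name}; foaf:mbox {mbox}   (matched in context g only)
            for person, p2, n, c in quads:
                if c == g and p2 == "foaf:name" and n == name:
                    for mbox in objects(person, "foaf:mbox", context=g):
                        rows.append((name, mbox, date))
    return rows


result = run_query()
```

Here `result` contains a single row, for the person described in `ex:g1`. The person in `ex:g2` has the same name but is skipped, because nothing in the store gives `ex:g2` a publisher and a date, so `g` is never bound to it.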