A discussion of the design and implementation of an RDF API.
What is a model but a set of RDF triples? This is an important question which I'll revisit later. But basically, an RDFModel is something that supports the basic triple-processing API, and some other bits.
More precisely, RDFNode, RDFResource, RDFLiteral. The latter are normally strings (or rather, their serialisations are all in string format when you look at them) but there's no reason why you can't store Java classes, Dates, Integers, org.falhs.package.MyThings in them.
I avoided the rather odd class hierarchy between these objects in my initial TripleStore implementation; hindsight leads me to believe that this was probably an error. Why? Because support for chaining (path expressions, transitive property types, etc.) expects that you can take a TNode (Terminal Node, the thing on the sharp end of an Arc) and turn it into an SNode (Start Node, the thing on the blunt end) pretty much at whim. I'm still not sure about the most general way to go about this; an RDF-centric approach seems best.
Basic types: Node, Resource, Literal. I've avoided "Property" for the simple reason that pretty much anyone can use any Resource as a Property; there's no programmatic way to distinguish between a Property and a non-Property resource.
Consequently, most methods requiring a statement will have parameters that fit the (Resource, Resource, Node) pattern.
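To make the (Resource, Resource, Node) pattern concrete, here is a minimal sketch of how the three basic types might fit together. The class bodies, the `label()` method and the `SketchModel` class are all assumptions made for illustration; they are not the API's actual signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Node is the common supertype of Resource and Literal.
interface Node {
    String label();
}

// A Resource is identified by a URI string; there is no separate Property type.
final class Resource implements Node {
    private final String uri;

    Resource(String uri) { this.uri = uri; }

    public String label() { return uri; }
}

// A Literal wraps any Java object, not only a String.
final class Literal implements Node {
    private final Object value;

    Literal(Object value) { this.value = value; }

    Object value() { return value; }

    public String label() { return String.valueOf(value); }
}

// Illustrative model: statements are stored as (subject, predicate, object).
final class SketchModel {
    private final List<Node[]> statements = new ArrayList<>();

    // The predicate is typed as a plain Resource, so any Resource may serve as one.
    void add(Resource subject, Resource predicate, Node object) {
        statements.add(new Node[] { subject, predicate, object });
    }

    int size() { return statements.size(); }
}
```

With this shape, a Resource used as a predicate in one statement can be the subject or object of another without any cast.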
A special case of Resource is AnonymousResource. There are some special methods that anonymous nodes require (to let an anonymous resource acquire an identity with respect to a given model); these methods are part of the general Resource API but will generally be unimplemented (or throw exceptions, although that might prove quite costly) for non-anonymous Resources.
The resource/URI distinction goes like this: a resource has an identity. In the case of a non-anonymous resource, this is given by its URI representation; anonymous resources are dealt with below. Yes, one point of view is that a resource may have multiple URI representations. The attitude that this API takes is that it cannot know about those considerations in general (although there may be value-adding layers that support such semantics in specific cases), and so equality tests are punted to the URI object (i.e., based on URI equality). We add to the API the 'folding' call, which allows a global replacement of one URI by another; this, hopefully, will suffice in most cases.
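The equality rule and the 'folding' call can be sketched roughly as follows. The names `UriResource`, `FoldingModel` and `fold` are hypothetical, and the sketch stores only resources (no literals) to keep it short; it shows the idea, not the API's real signature.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource whose identity is delegated entirely to its URI.
final class UriResource {
    private final URI uri;

    UriResource(String uri) { this.uri = URI.create(uri); }

    URI uri() { return uri; }

    @Override
    public boolean equals(Object other) {
        // Equality is punted to java.net.URI.equals.
        return other instanceof UriResource && uri.equals(((UriResource) other).uri);
    }

    @Override
    public int hashCode() { return uri.hashCode(); }
}

// Illustrative model supporting a global replacement of one resource by another.
final class FoldingModel {
    private final List<UriResource[]> triples = new ArrayList<>();

    void add(UriResource s, UriResource p, UriResource o) {
        triples.add(new UriResource[] { s, p, o });
    }

    // Replace every occurrence of 'from' with 'to'; returns the number of slots changed.
    int fold(UriResource from, UriResource to) {
        int replaced = 0;
        for (UriResource[] t : triples) {
            for (int i = 0; i < t.length; i++) {
                if (t[i].equals(from)) {
                    t[i] = to;
                    replaced++;
                }
            }
        }
        return replaced;
    }

    boolean contains(UriResource s, UriResource p, UriResource o) {
        for (UriResource[] t : triples) {
            if (t[0].equals(s) && t[1].equals(p) && t[2].equals(o)) {
                return true;
            }
        }
        return false;
    }
}
```

A layer that knows two URIs denote the same resource would call `fold` once, after which ordinary URI equality gives the expected answers.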
Rationale: convenience. I don't want to be bothered with either casts or cast-method calls to turn a Property into a Resource, or vice versa. (Likely? Yes, but only when dealing with reification of schema work.)
Historical accident. It might be useful (Java doesn't support the return of tuples) but I've avoided it thus far because of the large overhead associated with creating objects in Java. There might be a cunning way to implement a flyweight 'Triple' object but Java's lack of operator overloading leaves me groping for a way to do it cleanly. Too much C++, I guess. Triples do pop up as (almost) first-class citizens - see the TripleIterator. I might add a "Triple" class as a convenience but I'm reasonably convinced that I can support the architecture I want to without creating these objects explicitly all over the place.
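One way to expose triples without allocating an object per triple is a cursor-style iterator whose accessors read the current position. The sketch below is an assumption about how that could look; `TripleCursor` and its flat string storage are hypothetical and are not the TripleIterator interface itself.

```java
import java.util.List;

// Hypothetical cursor over triples stored flat as s, p, o, s, p, o, ...
// No per-triple object is created; the accessors read the current position.
final class TripleCursor {
    private final List<String> flat;
    private int pos = -3; // positioned before the first triple

    TripleCursor(List<String> flat) { this.flat = flat; }

    // Advance to the next triple; returns false once the data is exhausted.
    boolean next() {
        pos += 3;
        return pos + 2 < flat.size();
    }

    // Accessors are only valid after next() has returned true.
    String subject() { return flat.get(pos); }

    String predicate() { return flat.get(pos + 1); }

    String object() { return flat.get(pos + 2); }
}
```

Callers loop with `while (cursor.next())` and read the three components directly, which is roughly what returning a tuple would give them.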
In particular, there is a big problem with current axiomatisations, which all start
there is a set called Resources, ...
there is a set called Resources, and a set called URIs, and a mapping identifies:URIs->Resources.
followed by expressing all its ideas in terms of URIs instead of Resources. I kind of reiterate this idea regarding the distinction between "the reification" and "a reification" below.