A simple test-case manifest format.
Part of the RDFCore work includes the preparation of test cases for parser compliance, inference engines, and so forth.
The intention is that these test cases should be available for download in a simple zipped bundle.
We thus need a way to automate the running of test cases; the simplest option seems to be to include in the zip archive a manifest file that describes the type of test to perform and the inputs that the test takes.
Tests really seem to be boolean functions of a set of inputs; the inputs in question are generally pieces of RDF (as RDF/XML or ntriples). Tests we've got thus far are:
I think Jeremy has some other test types he'd like to include.
Note that not all inputs to a test need to be documents; tests could be parameterised by other literal values too.
Primarily, this manifest should be simple to read and parse; thus, even if it is couched in RDF/XML, care should be taken when producing the document to keep the structure as regular as possible so that test harnesses don't need to have overly complex parsers in order to handle manifest processing.
The test-case descriptions should be extensible, to permit the addition of new test cases and types of tests in the future; test types should be clearly marked so that it is possible, for instance, to simply run the parser tests and ignore entailment tests.
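As a rough illustration of why clearly marked test types matter, here is a minimal Python sketch of a harness selecting only one kind of test. The type labels ("ParserTest", "EntailmentTest"), the input names, and the file names are hypothetical placeholders, not terms from any agreed schema.

```python
from dataclasses import dataclass, field


@dataclass
class ManifestEntry:
    """One manifest entry: a test type plus its named inputs."""
    test_type: str                       # hypothetical type label
    inputs: dict = field(default_factory=dict)


def select(entries, wanted_type):
    """Return only the entries whose type matches wanted_type."""
    return [e for e in entries if e.test_type == wanted_type]


# Hypothetical manifest contents, already read into memory.
manifest = [
    ManifestEntry("ParserTest", {"input": "test001.rdf", "expected": "test001.nt"}),
    ManifestEntry("EntailmentTest", {"premise": "test002.rdf", "conclusion": "test002.nt"}),
    ManifestEntry("ParserTest", {"input": "test003.rdf", "expected": "test003.nt"}),
]

# Run the parser tests and ignore everything else.
parser_tests = select(manifest, "ParserTest")
```

A harness that does not recognise a type label can simply skip those entries, which is what keeps the format extensible.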
Most inputs to tests are small RDF documents (in RDF/XML or ntriples format). We need to be able to describe the document (giving an expected base URI and local address).
Some tests (particularly parser tests) rely on having a known base URI for the test RDF/XML document. However, the document will obviously also be instantiated locally.
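To see why the base URI has to be recorded separately from the local address, consider how a relative reference resolves. The sketch below uses Python's standard `urllib.parse.urljoin`; the base URI and local path are made-up examples, not addresses of real test cases.

```python
from urllib.parse import urljoin

# Hypothetical addresses, for illustration only.
BASE_URI = "http://example.org/tests/dir/test001.rdf"   # where the test "lives"
LOCAL_COPY = "dir/test001.rdf"                          # where the unzipped copy sits

# A fragment reference, such as one a parser builds from an rdf:ID attribute.
against_base = urljoin(BASE_URI, "#frag")
against_local = urljoin(LOCAL_COPY, "#frag")

# A relative reference to a sibling document.
sibling = urljoin(BASE_URI, "other.rdf")

print(against_base)
print(against_local)
print(sibling)
```

A parser that resolved references against the local copy would produce different URIs from the expected ones, so the harness has to hand the parser the recorded base URI explicitly.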
After a discussion on #rdfig, it seems that the simplest method to identify a document would be as a resource that has base URI and local instantiation properties. The next question that arises is a stylistic one: where and how should we store the base URI and the address of the local document copy?
Options are as follows (we reject using file: URIs for local copies as being unnecessarily unwieldy):
<test:RDFDocument rdf:about="absolute-address-of-test-document"> <test:localInstantiation rdf:resource="relative-address-of-test-document" /> </test:RDFDocument>
<test:RDFDocument> <test:baseURI rdf:resource="absolute-address-of-test-document" /> <test:localInstantiation rdf:resource="relative-address-of-test-document" /> </test:RDFDocument>
This is arguably preferable to the first case, since it carefully skirts the issue of whether a URL denotes "the" document that dereferencing it produces.
<test:RDFDocument test:baseURI="absolute-address-of-test-document" test:localInstantiation="relative-address-of-test-document" />
This is arguably preferable to the second case, since it treats the address of a document as being explicitly differentiated from a URI which may denote it. This is particularly important since the lexical content of both the base URI and the local address of the document will be of interest to a parser. See the thoughts I had about this after discussing it with DanC and Aaron.
The downside of this choice is that it may prove confusing (even contentious) to the RDF community. However, the discussion of the distinction (if any) between denotation and dereferencing of a URL has to happen sooner or later.
Less relevantly, it also produces slightly more compact XML.
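If the manifest is kept regular, a harness could read the attribute-based form with a plain XML parser rather than a full RDF parser. The following is a sketch under that assumption, using Python's standard `xml.etree.ElementTree`. The `test:` namespace URI and the document addresses are placeholders, since no namespace has been fixed here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TEST_NS = "http://example.org/test-manifest#"   # hypothetical namespace URI

# A hypothetical manifest fragment using the attribute-based form.
MANIFEST = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:test="{TEST_NS}">
  <test:RDFDocument
      test:baseURI="http://example.org/tests/test001.rdf"
      test:localInstantiation="tests/test001.rdf" />
</rdf:RDF>"""


def read_documents(xml_text):
    """Return (base URI, local address) pairs, one per test:RDFDocument."""
    root = ET.fromstring(xml_text)
    pairs = []
    # ElementTree spells namespaced names as "{namespace-uri}localname".
    for doc in root.iter(f"{{{TEST_NS}}}RDFDocument"):
        base = doc.get(f"{{{TEST_NS}}}baseURI")
        local = doc.get(f"{{{TEST_NS}}}localInstantiation")
        pairs.append((base, local))
    return pairs


documents = read_documents(MANIFEST)
```

Both values arrive as plain strings, which matches the point that the lexical content of each address is what a parser needs.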
Finally, I present a simple example and the corresponding schema for perusal. It just comes down to taking a view on this; we should aim to err on the side of simplicity whilst avoiding committing the usual "document/instantiation confusion" sins.
On #rdfig, DanC pounced on the idea of treating a packaged ZIP file as a cache for an http proxy (his words were "it's a killer").
This is indeed an attractive idea, but to be honest, I think it's what we already have here. The pair of properties that describe a document (original or effective address and the address of the local copy) are exactly what such a cache's index would contain. In this case, however, we intersperse the test information with these locators to make parsing the manifest.rdf simple.
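To make the cache analogy concrete, here is a small sketch in which the pair of properties acts as a cache index over a ZIP bundle. It uses Python's standard `zipfile` and `io` modules; the bundle contents, the URI, and the `fetch` helper are invented for illustration.

```python
import io
import zipfile


def build_bundle():
    """Create a tiny in-memory ZIP standing in for the downloaded bundle."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("tests/test001.rdf", "<rdf:RDF/>")
    buf.seek(0)
    return buf


# Index as the manifest would supply it: effective address -> local copy.
INDEX = {"http://example.org/tests/test001.rdf": "tests/test001.rdf"}


def fetch(bundle, index, uri):
    """Return the bytes of the local copy for uri, or None on a cache miss."""
    local = index.get(uri)
    if local is None:
        return None
    with zipfile.ZipFile(bundle) as zf:
        return zf.read(local)


bundle = build_bundle()
hit = fetch(bundle, INDEX, "http://example.org/tests/test001.rdf")
miss = fetch(bundle, INDEX, "http://example.org/tests/missing.rdf")
```

A real proxy cache would fall back to the network on a miss; a test harness would more likely report the missing document as an error.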
If someone can point me to a simple pre-existing schema for describing such an index, though, I see no reason why we shouldn't adopt such a schema rather than reinventing it.