The Different Places Something Can Go Wrong


This is just a little blog entry about the impact that the ontologies can have on functionality.

The ontologies are a set of vocabularies describing the types of resources stored in Nepomuk, the possible relations between these types, and the possible annotations. For example, we have a type for local files, one for an address book entry, one for a person, one for music content and so on. We also have relations that describe, for instance, that some person is the author of some piece of content.

These ontologies are maintained in the Shared-Desktop-Ontologies project – to my knowledge the only real open-source project developing RDF ontologies.

Now to the actual topic. There once was a bug. Like so many other bugs it talked about file indexing in Nepomuk, and like so many other bugs it said that some file could not be indexed. First it was Nepomuk’s fault, then it was the fault of libstreamanalyzer, but in the end I realized: there was a bug in the ontologies. More specifically in NMM, the Nepomuk MultiMedia ontology. (Granted, this was not really the source of the hang the bug talks about, but it was the reason the file could not be indexed.)

The problem was the domain of the nmm:setSize property. Each property has a domain and a range: the domain defines on which type of resource the property can be set, and the range defines the type of the value. In other words, they define the subject and object types of the triple. The domain is always a resource type (rdfs:Class); the range is a resource type or a literal type (typically one defined in XML Schema). In this case the domain of nmm:setSize was set to nmm:MusicPiece whereas it should have been nmm:MusicAlbum. Thus, Nepomuk rejected the data generated by libstreamanalyzer as invalid, because the property was used on a resource that did not match its declared domain. (Update: Nepomuk treats RDF data in a closed-world fashion. In contrast to the open-world approach that is typical for RDF/S, resource types are not inferred from their relations. In an open-world situation the resource would simply end up being both a nmm:MusicPiece and a nmm:MusicAlbum.)
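To make the difference between the two readings a bit more tangible, here is a small plain-Python sketch. It is not Nepomuk's actual validation code or API. The resource name ex:album1, the assumption that the indexed resource is typed as nmm:MusicAlbum, and all function names are made up for illustration, and class hierarchies (rdfs:subClassOf) are ignored to keep it short.

```python
# Illustrative sketch only: triples are plain (subject, predicate, object) tuples.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

# The faulty ontology statement described above, and the corrected one.
BUGGY_ONTOLOGY = {("nmm:setSize", RDFS_DOMAIN, "nmm:MusicPiece")}
FIXED_ONTOLOGY = {("nmm:setSize", RDFS_DOMAIN, "nmm:MusicAlbum")}

# Hypothetical indexer output: an album resource carrying a set size.
DATA = {
    ("ex:album1", RDF_TYPE, "nmm:MusicAlbum"),
    ("ex:album1", "nmm:setSize", "10"),
}


def domains(ontology):
    """Map each property to its declared rdfs:domain."""
    return {s: o for (s, p, o) in ontology if p == RDFS_DOMAIN}


def types_of(data, subject):
    """Collect the explicitly stated types of a subject."""
    return {o for (s, p, o) in data if s == subject and p == RDF_TYPE}


def closed_world_violations(ontology, data):
    """Closed-world reading: a property may only be used on a subject
    that is already stated to have the property's domain type."""
    dom = domains(ontology)
    return sorted(
        (s, p, o)
        for (s, p, o) in data
        if p in dom and dom[p] not in types_of(data, s)
    )


def open_world_closure(ontology, data):
    """Open-world (RDFS) reading: using a property lets us infer that
    the subject has the property's domain type; nothing is rejected."""
    dom = domains(ontology)
    inferred = {(s, RDF_TYPE, dom[p]) for (s, p, o) in data if p in dom}
    return data | inferred


if __name__ == "__main__":
    print(closed_world_violations(BUGGY_ONTOLOGY, DATA))
    print(closed_world_violations(FIXED_ONTOLOGY, DATA))
    print(sorted(open_world_closure(BUGGY_ONTOLOGY, DATA)))
```

With the faulty domain the closed-world check flags the nmm:setSize triple, while the open-world closure accepts it and adds a nmm:MusicPiece type to the album. With the corrected domain both readings leave the data alone.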

The solution is shared-desktop-ontologies 0.8.1 with the fixed domain. Installing it will make Nepomuk re-parse the changed ontology, and indexing the MP3 files in question will finally work.

Well, this was pretty verbose for a rather small issue. Still, it gave a little introduction to how the ontologies are used in Nepomuk. One more thing to take care of in the “Nepomuk universe”.


9 thoughts on “The Different Places Something Can Go Wrong”

  1. Good to know that another type of issue was identified!
    I dearly hope that strigi can be made to back out in such cases, so it could move on instead of torturing the system.

    I still have to disable file indexing, because after doing the “initial” (!sic) indexing that it does every time it is turned on, virtuoso-t will stick to using 30-50% CPU forever, keeping the fan, disk etc. busy for all time.
    Here is a sample line of output of 25983405983765037650378562805734s spit out when file indexing is running:
    [/usr/bin/nepomukservicestub] “/usr/bin/nepomukservicestub(3054)” Soprano: “SQLExecDirect failed on query ‘sparql select distinct ?r ?reqProp1 (bif:concat(bif:search_excerpt(bif:vector(‘anders’), ?v4))) as ?_n_f_t_m_ex_ where { { ?r ?v2 . ?v2 <http://www.semanticdesktop.org/ontologies/2007/03/22/nco#hasEma' (iODBC Error: [OpenLink][Virtuoso iODBC Driver][Virtuoso Server]SQ074: Line 1: syntax error at '#')"

  2. Not wishing to nitpick, but your description of the problem could lead to misinterpretation of how RDFS works, so just for the record:

    RDFS schemas are purely descriptive, they don’t enforce anything (unlike XML schemas).

    rdfs:domain and rdfs:range allow the inference of another additional triple, but don’t restrict the value of the subject or object of statements. In itself there’s nothing semantically wrong with:

    nmm:setSize rdfs:domain nmm:MusicPiece .
    nmm:setSize “10” .
    rdf:type nmm:MusicAlbum .

    though through an RDFS interpretation of the first two statements you can *also* infer that:

    rdf:type nmm:MusicPiece .

    i.e. is in both classes.

    If however you had stated:

    nmm:MusicPiece owl:disjointWith nmm:MusicAlbum .

    that would make the three statements above inconsistent (under an RDFS+OWL interpretation).

  3. Arrgh, it ate my subjects, here it is again with square rather than angle brackets:

    …In itself there’s nothing semantically wrong with:

    nmm:setSize rdfs:domain nmm:MusicPiece .
    [#thing] nmm:setSize “10” .
    [#thing] rdf:type nmm:MusicAlbum .

    though through an RDFS interpretation of the first two statements you can *also* infer that:

    [#thing] rdf:type nmm:MusicPiece .

    i.e. [#thing] is in both classes.
