What Nepomuk Can Do and How You Should Use It (as a Developer)


Nepomuk has been around for quite a while, but the functionality exposed in KDE 4.3 is still not that impressive. This does not mean that cool stuff does not exist. It only means that there is not enough developer power to get it all stable and perfectly integrated. Let me give you an overview of what already exists in playground and how it can be used (and how you should use it).

The Basics

For starters there is the Nepomuk API in kdelibs, which you should get familiar with. Most importantly (we will use it quite a lot later on) there is Nepomuk::Resource, which gives access to arbitrary resources in Nepomuk.

Nepomuk::Resource file( myFilePath );
file.addTag( Nepomuk::Tag( "Fancy stuff" ) );
QString desc = file.description();
QList<Nepomuk::Tag> allTags = Nepomuk::Tag::allTags();

Resource allows simple manipulation of data in Nepomuk. Using some fancy cmake magic through the new NepomukAddOntologyClasses macro in kdelibs, data manipulation gets even simpler. The second basic thing you should get familiar with is Soprano and SPARQL. As a quickstart, the following code shows how I typically create queries using Soprano:

using namespace Soprano;

Model* model = Nepomuk::ResourceManager::instance()->mainModel();
QString query = QString( "prefix nao:%1 "
                         "select ?r where { "
                         "%2 nao:hasTag ?t . "
                         "?r nao:hasTag ?t . }" )
        .arg(Node::resourceToN3(Vocabulary::NAO::naoNamespace()))
        .arg(Node::resourceToN3(file.resourceUri()));
QueryResultIterator it
        = model->executeQuery( query, Query::QueryLanguageSparql );

As you can see there is always a lot of QString::arg involved to prevent hard-coding of URIs (again Soprano provides some cmake magic for generating Vocabulary namespaces).
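To look at the string-building pattern in isolation, here is a minimal sketch that uses only the C++ standard library, so it compiles without Qt or Soprano. The helper names uriToN3 and sharedTagQuery are hypothetical; uriToN3 only approximates what Node::resourceToN3 does by wrapping the URI in angle brackets. The query selects ?r, the resources that share a tag with the given one.

```cpp
#include <string>

// Hypothetical stand-in for Soprano::Node::resourceToN3(): wraps a URI in
// angle brackets so that it can be pasted into a SPARQL query. The real
// function also takes care of encoding, which this sketch skips.
std::string uriToN3(const std::string& uri)
{
    return "<" + uri + ">";
}

// Builds a query for all resources ?r that share at least one tag ?t with
// the resource identified by resourceUri. No URI is hard-coded here; the
// namespace and the resource are both passed in.
std::string sharedTagQuery(const std::string& naoNamespace,
                           const std::string& resourceUri)
{
    std::string query;
    query += "prefix nao:" + uriToN3(naoNamespace) + " ";
    query += "select ?r where { ";
    query += uriToN3(resourceUri) + " nao:hasTag ?t . ";
    query += "?r nao:hasTag ?t . }";
    return query;
}
```

The resulting string is the kind of text you can also paste into sopranocmd when testing a query by hand.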

These are the basics. Without these basics you cannot use Nepomuk.

Debugging Nepomuk Data

Now before we dive into the unstable, experimental, and really cool stuff let me mention sopranocmd.

sopranocmd is a command line tool that comes with Soprano and lets you perform virtually any operation possible on the Nepomuk RDF database. It has an exhaustive help output and you should use it to debug your data, test your queries and the like (if anyone is interested in creating a graphical version, please step up).

The Nepomuk database (hosting only a single Soprano model called “main”) can be accessed through D-Bus as follows:

sopranocmd --dbus org.kde.NepomukStorage --model main \
      query "select ?r where { ?r ?p ?o . }"

The Good Stuff

There is quite a lot of experimental stuff in the playground but I want to focus on the annotation framework and Scribo.

The central idea of the annotation framework is the annotation suggestion, which is encapsulated in the Annotation class (hint: run “make apidox” in the annotationplugin folder). Instead of the user manually annotating resources (adding tags or relating things to other things), the system proposes annotations which the user then simply acknowledges or discards. These Annotation instances are normally created by AnnotationPlugin instances (although it is perfectly possible to create them some other way), which are triggered through an AnnotationRequest.

Before I continue a short piece of code for the impatient:

Resource res = getResource();

AnnotationPluginWrapper* wrapper = new AnnotationPluginWrapper();
wrapper->setPlugins( AnnotationPluginFactory::instance()
   ->getPluginsSupportingAnnotationForResource( res.resourceUri() ) );
connect( wrapper, SIGNAL(newAnnotation(Nepomuk::Annotation*)),
         this, SLOT(addNewAnnotation(Nepomuk::Annotation*)) );
connect( wrapper, SIGNAL(finished()),
         this, SLOT(slotFinished()) );

AnnotationRequest req;
req.setResource( res );
req.setFilter( filter );
wrapper->getPossibleAnnotations( req );

The AnnotationPluginWrapper is just a convenience class which prevents us from connecting to each plugin separately. It reproduces the same signals the plugins emit.

The interesting part is the AnnotationRequest. The framework is still under development (which also means that your ideas, patches, and even refactoring actions are very welcome), and at the moment the request has three parameters, all of which are optional:

  1. A resource – The resource for which the annotation should be created. This parameter is a bit tricky, as the Annotation::create method allows creating an annotation on an arbitrary resource, but in some cases it makes perfect sense to create annotation suggestions for only one resource.
  2. A filter string – A filter is supposed to be a short string entered by the user which triggers an auto-completion via annotations. Plugins should also take the resource into account if it is set.
  3. A text – An arbitrarily long text which is to be analyzed by plugins. Plugins would typically extract keywords or concepts from it. Plugins should also take resource and filter into account if possible. This is where the Scribo system comes in (more later).

Plugins that I already created include very simple ones like the tag plugin, which matches the filter to existing tag names and also excludes tags already set on the resource. Way more interesting are other plugins like the pimotype plugin, which matches the filter to pimo types and proposes to use that type, or the pimo relation plugin, which allows creating relations via a very simple syntax: “author:trueg”. The latter will match author to existing properties and trueg to a value based on the property range. The geonames annotation plugin goes one step further: it matches the filter or the resource label to cities or countries using the geonames web service. It will then propose to set a location or (in case the resource label was matched) to convert the resource into a city or country linking to the geonames resource.
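The filter handling of the two simplest plugins can be sketched in plain standard C++. This is not the code of the actual plugins: the names suggestTags, RelationFilter and parseRelationFilter are hypothetical, and the case-insensitive prefix match is an assumption, since the text only says that the filter is matched to existing tag names.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Lower-cases an ASCII string; good enough for this sketch.
std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Tag plugin idea: return the existing tags whose name starts with the
// filter (ignoring case, an assumption), skipping tags that are already
// set on the resource.
std::vector<std::string> suggestTags(const std::string& filter,
                                     const std::vector<std::string>& existingTags,
                                     const std::vector<std::string>& tagsOnResource)
{
    std::vector<std::string> suggestions;
    const std::string needle = toLower(filter);
    for (const std::string& tag : existingTags) {
        const bool matches = toLower(tag).compare(0, needle.size(), needle) == 0;
        const bool alreadySet =
            std::find(tagsOnResource.begin(), tagsOnResource.end(), tag) != tagsOnResource.end();
        if (matches && !alreadySet)
            suggestions.push_back(tag);
    }
    return suggestions;
}

// Relation plugin idea: split a filter such as "author:trueg" into the
// property part and the value part.
struct RelationFilter {
    bool valid;            // false if the filter is not of the form "a:b"
    std::string property;  // text before the colon, e.g. "author"
    std::string value;     // text after the colon, e.g. "trueg"
};

RelationFilter parseRelationFilter(const std::string& filter)
{
    const std::string::size_type colon = filter.find(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == filter.size())
        return RelationFilter{false, "", ""};
    return RelationFilter{true, filter.substr(0, colon), filter.substr(colon + 1)};
}
```

A real plugin would then look up the property part among the known properties and resolve the value part against the property range, which this sketch leaves out.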

A picture says more than a thousand words. Thus, here goes:

[Screenshot: annotation suggestions for “Paris”]

What do we see here? The user entered the text Paris in the AnnotationWidget (a class available in the framework) and the framework then created a set of suggested annotations. The most likely one is Paris, the city in France, as suggested by the geonames plugin. The latter also proposes a few not so likely places. The pimotype plugin proposes to create a new type named Paris and the tag plugin proposes to create a new tag named Paris. Here I see room for improvement: if we can relate to the city Paris there is no need for the tag. Thus, some more sophisticated rating and comparison may be in order.
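One possible shape of such a comparison step, sketched in plain standard C++: this is only an illustration of the idea, not something the framework does today. The Suggestion struct, its kind strings and the relevance values are all hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for an annotation suggestion.
struct Suggestion {
    std::string label;  // e.g. "Paris"
    std::string kind;   // e.g. "location", "type" or "tag"
    double relevance;   // higher means more likely
};

// Drops every "tag" suggestion whose label is already covered by a
// "location" suggestion and sorts the rest by relevance, best first.
std::vector<Suggestion> rankSuggestions(const std::vector<Suggestion>& all)
{
    std::vector<Suggestion> kept;
    for (const Suggestion& s : all) {
        bool redundant = false;
        if (s.kind == "tag") {
            for (const Suggestion& other : all) {
                if (other.kind == "location" && other.label == s.label)
                    redundant = true;
            }
        }
        if (!redundant)
            kept.push_back(s);
    }
    std::stable_sort(kept.begin(), kept.end(),
                     [](const Suggestion& a, const Suggestion& b) {
                         return a.relevance > b.relevance;
                     });
    return kept;
}
```

With the Paris example, the new tag would be dropped as soon as a location with the same label is among the suggestions, while the new type would stay.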

Now let us bring Scribo into play. Scribo is another framework in the playground which provides an API for text analysis and keyword extraction. It is tied into the annotation framework through a dedicated plugin which uses the TextAnnotation class to create annotations on specific text positions. The TextAnnotation class is supposed to be used to annotate text documents. It will create a new nfo:TextDocument and make it a nie:isPartOf the main document. Then the new resource is annotated according to the implementation.

The Scribo framework will extract keywords and entities from the text (specified via the AnnotationRequest text field) via plugins which will then be used to create annotation suggestions. There currently exist three plugins for Scribo: the datetime plugin extracts dates and times, the pimo plugin matches words in the text to things in the Nepomuk database, and the OpenCalais plugin will use the OpenCalais webservice to extract entities from the text.
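To give an idea of what extracting entities together with their text position means, here is a small standalone sketch in standard C++. It is not the datetime plugin: the Occurrence struct and the extractIsoDates function are hypothetical, and the sketch only recognizes dates written as YYYY-MM-DD, whereas the formats the real plugin understands are not covered here.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type: what was found and where in the text.
struct Occurrence {
    std::string text;    // the matched text, e.g. "2009-07-14"
    std::size_t offset;  // position of the first character in the input
    std::size_t length;  // number of characters matched
};

// Finds ISO-style dates (YYYY-MM-DD) and reports their text positions.
std::vector<Occurrence> extractIsoDates(const std::string& text)
{
    static const std::regex datePattern("\\b\\d{4}-\\d{2}-\\d{2}\\b");
    std::vector<Occurrence> found;
    for (std::sregex_iterator it(text.begin(), text.end(), datePattern), end;
         it != end; ++it) {
        found.push_back(Occurrence{it->str(),
                                   static_cast<std::size_t>(it->position()),
                                   static_cast<std::size_t>(it->length())});
    }
    return found;
}
```

Offset and length are what an annotation on a specific text position needs in order to point back into the document.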

You can try the Scribo framework by using the scriboshell which can be found in the playground, too:

[Screenshot: scriboshell]

Paste the text to analyze in the left view and press the “Start” button. The right panel will then show all found entities and keywords including the text position and relevance.

The other possibility is to directly use the resourceeditor, which is part of the annotation framework and bundles all GUI elements the latter has to offer in one widget. Call it on a text file and you will get a window similar to the following:

[Screenshot: resourceeditor]

At the top you have the typical things: editable label and description, the rating, and the tags. Below that you have the existing properties and annotations. In the picture these are only properties extracted by Strigi. Then comes the interesting part: the suggestions. Here you can see three different Scribo plugins in action. First the pimo plugin matched the word “Brein” to an event I already had in my Nepomuk database. Then there is the OpenCalais plugin which extracted the “Commission of European Communities” (so far the plugin ignores the additional semantic information provided by OpenCalais) and proposes to tag the text with it.

The last suggested annotation that we can see is “Create Event”. This is a very interesting hack I did. The Scribo plugin detected the mention of a project, a date, and persons, and thus proposes to create an event which has the project as its topic and takes place at the extracted time. Since it is a hack created specifically for a demo, its results will not be very great in many situations. But it shows the direction which I would like to take.

Below the suggestions you can see the AnnotationWidget again, which lets you manually annotate the file.

How to Write an AnnotationPlugin

This is a Howto in three sentences: Derive from AnnotationPlugin and implement doGetPossibleAnnotations. In that method trigger the creation of annotations. Your annotations can be instances of SimpleAnnotation or be based on Annotation and implement at least doCreate, exists, and equals.

class MyAnnotationPlugin : public Nepomuk::AnnotationPlugin
{
public:
    MyAnnotationPlugin(QObject* parent, const QVariantList&);
protected:
    void doGetPossibleAnnotations(const Nepomuk::AnnotationRequest&);
};

void MyAnnotationPlugin::doGetPossibleAnnotations(
      const Nepomuk::AnnotationRequest& request
)
{
    // MyFancyAnnotation can do all sorts of crazy things like creating
    // whole graphs of data or even opening another GUI
    addNewAnnotation(new MyFancyAnnotation(request));

    // SimpleAnnotation can be used to create simple key/value pairs
    Nepomuk::Types::Property property(Soprano::Vocabulary::NAO::prefLabel());
    Nepomuk::SimpleAnnotation* anno = new Nepomuk::SimpleAnnotation();
    anno->setProperty(property);
    anno->setValue("Hello World");
    // currently only the comment is used in the existing GUIs
    anno->setComment("Set label to 'Hello World'");
    addNewAnnotation(anno);

    // tell the framework that we are done. All this could also
    // be async
    emitFinished();
}

And Now?

At the Nepomuk workshop Tom Albers already experimented with integrating the annotation suggestions into Mailody. It is rather simple to do that but the framework still needs polishing. More importantly, however, the created data needs to be presented to the user in a more appealing way. In short: I need help with all this!

Integrate it into your applications, improve it, come up with new ways of presenting the information, write new plugins. Jump on board of the semantic desktop train.

Thanks for reading.

26 thoughts on “What Nepomuk Can Do and How You Should Use It (as a Developer)”

  1. Pingback: What Nepomuk Can do and How You Should Use it (as a Developer) « Fried #

  2. I think the worst part is that all this technology is stuck in playground. The pimo and pimoshel concepts may not be mature enough for general consumption but could be pushed on to distributions as experimental packages. If users (and developers alike) start seeing things going downstream and getting used, the volunteers will show up. I don’t know the real solution to get it started, but for my part I started submitting wishes against Kubuntu to package a few of the things in playground (pimo and peopletag). And no, people just can’t compile from source because of all the dependencies they would have to install beforehand. Anyway, keep up the good work.

  3. We should get this from playground to downstream for testing and developing. Nepomuk is great technology and people are really waiting to see it in KDE. But since 4.1 we have not gained anything that would give it an advantage over Vista search or Spotlight.

    I hope that as many KDE developers as possible will join in to bring the Nepomuk features to 4.3. It is sad that it is feature-frozen. 4.4 is over 6 months away, and by then Windows 7 and Snow Leopard will be out.

    It is just sad that this great technology did not get ready for earlier versions. With later releases (4.4 or 4.5) people will soon start to believe that KDE ripped off the search functionality from Windows. There are already too many thoughts about how KDE tries to be a Windows 7 clone etc.

    I am sorry, but KDE would need to be a little bit more interested in marketing, even though it is open source where volunteers develop the software. We should focus more on important technologies. For example, K3b and Ark are important, but downstream took beta versions of them into use even though they were not ready. The same thing is happening with Nepomuk. We have great technology but we just cannot get it out in time when it is needed.

    It would even be helpful to give normal users information now on how to start tagging, rating and searching the metadata of files. Not just developers but the normal users who would like to test Nepomuk. For example, what we need in order to use it, and how to use it, with which commands.

    • “But since 4.1 we have not gained anything that would give it an advantage over Vista search or Spotlight.”

      I think the main problem is the unfinished migration to Akonadi. The biggest bunches of data for the big semantic web come from the file system and the PIM applications. While Dolphin and Strigi handle the file system part, adoption of Akonadi and Nepomuk for PIM applications is happening only very slowly due to lack of manpower (and because, of course, one has to take extra care of data security in this category of applications).

  4. @Fry13: Ark and K3b are in extragear, not in playground, which makes them look like much more stable apps in the eyes of the packagers than anything in playground. That’s what playground was made for. But this gives anything there a lot less visibility than any other piece of software.

    The solution is for the developers to put forward simple non-destructive example applications that make use of the technology. These must be regarded as technology previews but must also be in very good shape and be extremely easy to package. Only then can we, the users in the community, do our jobs and submit bugs and wishes so that packagers pick them up easily.

    Take an example: the Soprano Virtuoso backend that could ultimately give rise to a unified storage layer between Nepomuk and Akonadi. No distribution is going to package Virtuoso just for the fun of it. They would need someone willing to use it. But as long as Virtuoso is not packaged, there will be no massive development on a Nepomuk backend. For this specific case I would suggest putting forward a usable Virtuoso backend so I could submit bugs and wishes asking for the respective packages.

    As always, I hope I have contributed to this discussion, and I am willing to test whatever packages you throw at me (it is impossible to compile most things on an Asus Eee PC).

    • @Luis

      “Ark and K3b are in extragear, not in playground, which makes them look like much more stable apps in the eyes of the packagers than anything in playground. That’s what playground was made for. But this gives anything there a lot less visibility than any other piece of software.”

      I did not mean that they were in playground, but that the same problem will arise with Nepomuk as arose with K3b and Ark, when distributors shipped unstable versions of them because people needed a good CD burner and archiver. So in the end people ended up with applications that crashed all the time.

      It is the same as when KDE4 got a bad reputation because distributors started to ship it as the default desktop too early, instead of just waiting for the next release, 4.1 or even 4.2. Having 4.0 (or even 4.1) as the default desktop was “suicide” for many KDE users. I suggested to all my friends that they wait for 4.2 before even giving it a try. To a few I suggested moving to Gnome for a while. They have all upgraded or returned to KDE now that 4.2.x has been released, and they are happy with it.

      We (distributors) should push Nepomuk out for testing now. Get those members who like to test new packages right away and submit bugs to do their part of the work. Definitely not as default features, but just in testing repositories that need to be enabled manually. Not all testers like to compile code from SVN/Git every day.

      I always remember how K3b and Ark got a bad name just because they were needed features and were left behind the other technologies. We cannot blame the developers, only the lack of manpower. I believe we could sometimes all focus for a few days, or a week at most, on one application or feature that is needed and discuss its required features; in a few months we would get lots of wanted and needed things done. I like to think that if we got 20-30 more developers, documenters etc. to work on Nepomuk for about one week, we would get it into much better shape. It could be more like the “bug hunting week” that Kubuntu organizes for its users, with simple guides for new users on how to compile etc.

  5. I have been following your work since before 4.0 and I strongly believe that this will be one of the most important features in KDE4, if not the Linux desktop, during the coming years. I understand that it is frustrating for you to see that very little has been done with your excellent work so far (apart from ratings and tags in a few applications).

    I believe that the situation would improve drastically if you were able to show that the technology works reliably and sufficiently fast for simple use cases, most importantly file search. Releasing a very simple search application similar to Kerry could prove this and would a) encourage developers to spend time on implementing the APIs you developed, b) tease packagers so they package and release playground stuff as experimental apps and c) make users believe there is real potential and thus demand broader implementation by developers.

    I strongly believe that not having an – albeit simple – flagship program which does more than ratings really hurts this project.

    • I believe that too. If we could get some kind of working search (not just KRunner) implemented on the desktop, it would really bring more focus to Nepomuk. I would like to see a search bar in the open and save dialogs soon, like KGet has on right-click, and Dolphin. But I think it should be more like Kerry (with its own window, or better, a plugin for KRunner) or its own side panel in Dolphin. Like the “Information” panel next to “Places”, there could be a panel just for tagging, rating, searching and adding all other information, moving the search bar from the toolbar to the side panel, or to its own window on the desktop.

      • The search is being worked on in a GSOC project. I hope that for KDE 4.4 we can achieve more. In the end search was never interesting for me. Maybe that is the problem. I always saw file indexing and desktop search as features I would need to get users happy, but never as something interesting.

        • Maybe the problem is that many KDE software developers do not actually know the potential of (and the need for) Nepomuk on the desktop. I am a little bit tired of searching for files myself, keeping track of them and organizing them. What I always admire is Spotlight. You can just type something you remember from the file and you find it.
          Even though Nepomuk has better features and plans, it cannot yet easily show the files, group them and actually help the user.

          You are doing a great job on the technology, but the other party is somehow lacking interest in it.

          I believe we should bring Nepomuk to the public more, because many KDE users do not know about it at all. They do not know what semantic search is and what Nepomuk has to do with it.

          I do not even know what Akonadi is and how it works. It is just some kind of search for PIM applications, but the technology is totally “dark”.

          It would be nice if at some point someone would explain Nepomuk, Strigi, Akonadi, Soprano etc. in simple terms for normal users, with some diagrams. ;-)

    • I also agree with the statement above. Nepomuk seems a very interesting technology, addressing the semantic desktop in a very broad way.

      Currently I really do miss the spotlight functionality. That is:
      – Ctrl+Space or click in systray -> searchbox opens
      – results from file system, email, pictures, music app icons are shown on the fly while typing.

      Thinking of this, we almost have this in the form of KRunner.

      MacOS also seems to do something interesting: each time a file is saved/created by the user, the spotlight daemon receives a small request to update the index for that file. So no extensive scanning, and yet the latest stuff is always available.

      These two features would make nepomuk a killer feature for me already. Everything else after that would make it rock so hard that I can’t yet imagine how that would be. :)

  6. Strigi/nepomuk is something I’m _really_ longing for. Imagining it in its full glory, I can just think “Wow!”. Reality however is a bit sad. Now in 4.3 even dolphin has this magic search field. Just I’ve no idea what to do with it… all I can do is searching for comments, apparently, not even tags.

    Trying to activate Strigi I only get: “Strigi service failed to initialize, most likely due to an installation problem.” …. eerrrm..yes. I think I’ve installed everything necessary… but then I’m lost…

    Getting basics like fulltext search, search for tags and comments working reliably would be a huge step forward already. Before going on with fancy semantic social whatever desktop stuff and what not.

    This is from a users perspective.

    • @redm

      “Trying to activate Strigi I only get: “Strigi service failed to initialize, most likely due to an installation problem.” …. eerrrm..yes. I think I’ve installed everything necessary… but then I’m lost…”
      I believe you have a common problem. Install Sun Java (I have 1.6) and it starts to work. I have wondered on many computers why Strigi does not work. It installs everything it needs, except Java (which it needs :)

      “Now in 4.3 even dolphin has this magic search field. Just I’ve no idea what to do with it… all I can do is searching for comments, apparently, not even tags.”
      It is sad that in the current version of KDE development the Dolphin search bar has nothing useful to offer. Searching is impossible by any means (although I managed to get it to work once). KRunner works, but I would like to get the Dolphin feature to work. Usually I then lose the search bar entirely if I try to edit the toolbar, and it forces Dolphin to open two windows: one empty with the toolbar panel (not showing it) and one normal without the search bar. Bug entries have been filed about it. Hopefully it will not be released in that state.

  7. I’m just another user looking forward to this technology reaching its maturity.

    I’d appreciate it if you could move all the Nepomuk documentation on your blog to a central location at some point in time. I don’t know much C++ at this point, but learning is on my todo list and this looks like stuff I’d want to play with when I get there. Having the documentation in a central location and not having to search back through years of old blog posts would make that much easier.

  8. This is all very nice and pretty, but UNLESS WE CAN USE YOUR TECHNOLOGY, IT IS USELESS.

    When is the Soprano backend going to be ready to work in Fedora and without Java in general? Until it’s ready — it’s been over a year — this is all navelgazing. The virtuoso backend was dropped in 2.3.0 according to the change logs in the KDE unstable packages for Fedora.
