What We Did Last Summer (And the Rest of 2009) – A Look Back Onto the Nepomuk Development Year With an Obscenely Long Title


2009 is over. Yeah, sure, trueg, we know that, it has been over for a while now! Ok, ok, I am a bit late, but still I would like to get this one out – if only for my archive. So here goes.

Virtuoso

Let’s start with the major topic of 2009 (and also the beginning of 2010): the new Nepomuk database backend, Virtuoso. Everybody who used Nepomuk had the same problems: you either used the sesame2 backend, which depends on Java and steals all of your memory, or you were stuck with Redland, which had the worst performance and lacked some SPARQL features, making important parts of Nepomuk such as queries unusable. So more than a year ago I had the idea to use the one GPL’ed database server out there that supports RDF in a professional manner: OpenLink’s Virtuoso. It has all the features we need, performs very well, and scales up to dimensions we will probably never reach on the desktop (yeah, right, and 64k main memory will be enough forever!). So very early on I started coding the necessary Soprano plugin, which talks to a locally running Virtuoso server through ODBC. But since I ran into tons of small problems (as always) and got sidetracked by other tasks, I did not finish it right away. OpenLink, however, was very interested in the idea of their server being part of every KDE installation (why wouldn’t they be ;)). So they not only introduced a lite mode which makes Virtuoso suitable for the desktop but also helped in debugging all the problems that I had left. Many test runs, patches, and a Virtuoso 5.0.12 release later, I could finally announce the Virtuoso integration as usable.

Then, at the end of last year, I dropped support for sesame2 and Redland. Virtuoso is now the only supported database backend. The reason is simple: Virtuoso is way more powerful than the rest – not only in terms of performance – and it is fully implemented in C(++) without any traces of Java. Maybe even more important is the integrated full text index, which makes the previously used CLucene index unnecessary. Thus, we can finally combine full text and graph queries in one SPARQL query. This results in a cleaner API and much faster search results, since there is no need to combine the results from several queries anymore. A direct result of that is the new Nepomuk Query API, which I will discuss later.

So now the only thing I am waiting for is the first bugfix release of Virtuoso 6, i.e. 6.0.1 which will fix the bugs that make 6.0.0 fail with Nepomuk. Should be out any day now. :)

The Nepomuk Query API

Querying data in Nepomuk pre-KDE-4.4 could be done in one of two ways: 1. Use the very limited capabilities of the ResourceManager to list resources with certain properties or of a certain type; or 2. Write your own SPARQL query using ugly QString::arg replacements.

With the introduction of Virtuoso and its awesome power we can now do pretty much everything in one query. This allowed me to finally create a query API for KDE: Nepomuk::Query::Query and friends. I won’t go into much detail here since I did that before.

All in all you should remember one thing: whenever you think about writing your own SPARQL query in a KDE application – have a look at libnepomukquery. It is very likely that you can avoid the hassle of debugging a query by using the query API.

The first nice effect of the new API (apart from me using it all over the place obviously) is the new query interface in Dolphin. Internally it simply combines a bunch of Nepomuk::Query::Term objects into a Nepomuk::Query::AndTerm. All very readable and no ugly query strings.

Dolphin Search Panel in KDE SC 4.4
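
To make the idea of combining terms into one query more concrete, here is a minimal, self-contained sketch. It does not use the real Nepomuk::Query classes: `Term`, `TypeTerm`, `PropertyTerm`, `AndTerm` and `toSparql` below are hypothetical stand-ins written only for illustration, the `urn:example:` URIs in the usage note are made up, and the real API generates considerably more elaborate queries. The sketch only shows the principle: each term knows how to render itself as a graph pattern, and an "and" term simply concatenates the patterns of its sub-terms.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a query term; NOT the real Nepomuk::Query::Term.
struct Term {
    virtual ~Term() = default;
    // Render this term as a SPARQL graph pattern constraining ?r.
    virtual std::string toPattern() const = 0;
};

// Matches resources of a given RDF type.
struct TypeTerm : Term {
    std::string typeUri;
    explicit TypeTerm(std::string uri) : typeUri(std::move(uri)) {}
    std::string toPattern() const override {
        return "?r a <" + typeUri + "> . ";
    }
};

// Matches resources whose property has the given plain literal value.
// Simplification: the value is not escaped, so it must not contain quotes.
struct PropertyTerm : Term {
    std::string propertyUri;
    std::string value;
    PropertyTerm(std::string property, std::string literal)
        : propertyUri(std::move(property)), value(std::move(literal)) {}
    std::string toPattern() const override {
        return "?r <" + propertyUri + "> \"" + value + "\" . ";
    }
};

// Matches resources that satisfy all sub-terms.
struct AndTerm : Term {
    std::vector<std::unique_ptr<Term>> subTerms;
    void add(std::unique_ptr<Term> term) {
        assert(term != nullptr);
        subTerms.push_back(std::move(term));
    }
    std::string toPattern() const override {
        std::string pattern;
        for (const auto& term : subTerms) {
            pattern += term->toPattern();
        }
        return pattern;
    }
};

// Wrap the pattern of any term into a complete SELECT query.
std::string toSparql(const Term& term) {
    return "SELECT DISTINCT ?r WHERE { " + term.toPattern() + "}";
}
```

With these stand-ins, adding a `TypeTerm("urn:example:Document")` and a `PropertyTerm("urn:example:title", "report")` to an `AndTerm` and passing it to `toSparql` yields a single query string with both patterns inside one `WHERE` block, and the calling code never touches a query string itself.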

Shared Desktop Ontologies

An important part of the Nepomuk research project was the creation of a set of ontologies for describing desktop resources and their metadata. After the Xesam project under the umbrella of freedesktop.org had been convinced to use RDF for describing file metadata, they developed their own ontology. Thanks to Evgeny (phreedom) Egorochkin and Antoni Mylka, the Xesam ontology and the Nepomuk Information Elements Ontology were already very close in design. Thus, it was relatively easy to merge the two and be left with only one ontology to support. Since then not only KDE but also Strigi and Tracker have been using the Nepomuk ontologies.

At the Gran Canaria Desktop Summit I met some of the guys from Tracker and we tried to come up with a plan to create a joint project to maintain the ontologies. This got off to a rough start as nobody really felt responsible. So I simply took the initiative and released the shared-desktop-ontologies version 0.1 in November 2009. The result was a s***-load of hate mail and bug reports because I had broken the KDE build. But in the end it was worth it. Now the package is established and other projects can start to pick it up to create data compatible with the Nepomuk system and Tracker.

Today the ontologies (and the shared-desktop-ontologies package) are maintained in the OSCAF project at SourceForge. The situation is far from perfect but it is a good start. If you need specific properties in the ontologies or are thinking about creating an ontology for your own application – come and join us in the bug tracker.

Timeline KIO Slave

It was at the Akonadi meeting that Will Stephenson and I got to talking about mimicking some Zeitgeist functionality through Nepomuk. Basically it meant gathering some data when opening and when saving files. We quickly came up with a hacky patch for KIO and KFileDialog which covered most cases and allowed us to track when a file was modified and by which application. This little experiment did not leave that state though (it will, however, this year) but another one did: Zeitgeist also provides a FUSE filesystem which allows browsing files by modification date. Well, whatever FUSE can do, KIO can do as well. Introducing the timeline:/ KIO slave, which gives a calendar view onto your files.

Tips And Tricks

Well, I thought I would mention the Tips And Tricks section I wrote for TechBase. It might not be a big deal but I think it contains some valuable information in case you are using Nepomuk as a developer.

Google Summer Of Code 2009

This time around I had the privilege to mentor two students in the Google Summer of Code. Alessandro Sivieri and Adam Kidder did outstanding work on Improved Virtual Folders and the Smart File Dialog.

Adam’s work led me to make some heavy improvements to the Nepomuk KIO slaves myself, which I only finished this week (more details on that coming up). Alessandro continued his work on faceted file browsing in KDE and created:

Sembrowser

Alessandro is following up on his work to make faceted file browsing a reality in 2010 (and KDE SC 4.5). Since it was too late to get faceted browsing into KDE SC 4.4, he is working on Sembrowser, a stand-alone faceted file browser which will be the testing ground for experiments until the code is merged into Dolphin.

Faceted Browsing in KDE with Sembrowser

Nepomuk Workshops

In 2009 I organized the first Nepomuk workshop in Freiburg, Germany. And also the second one. While I reported properly on the first one I still owe a summary for the second one. I will get around to that – sooner or later. ;)

CMake Magic

Soprano gives us a command line tool to create a C++ namespace from an ontology file: onto2vocabularyclass. It produces convenience namespaces like Soprano::Vocabulary::NAO. Nepomuk adds another tool named nepomuk-rcgen. Both were a bit clumsy to use before. Now we have nice CMake macros which make it very simple to use both.

See the techbase article on how to use the new macros.

Bangarang

Without my knowledge (imagine that!) Andrew Lake created an amazing new media player named Bangarang, a Jamaican word for noise, chaos or disorder. This player is Nepomuk-enabled in the sense that it has a media library which lets you browse your media files based on the Nepomuk data. It remembers the number of times a song or a video has been played and when it was played last. It lets you add details such as the TV series name, season, episode number, or actors that are in the video – all through Nepomuk (I hope we will soon get tvdb integration).

Edit metadata directly in Bangarang

Dolphin showing TV episode metadata created by Bangarang

And of course searching for it works, too...

And it is pretty, too...

I am especially excited about this since finally applications not written or mentored by me start contributing Nepomuk data.

Gran Canaria Desktop Summit

2009 was also the year of the first joint GNOME-KDE conference. Let me mention it here for completeness and refer to my previous blog post reporting on my experiences on the island.

Well, that was by far not all I did in 2009 but I think I covered most of the important topics. And after all it is “just a blog entry” – there is no need for completeness. Thanks for reading.

21 thoughts on “What We Did Last Summer (And the Rest of 2009) – A Look Back Onto the Nepomuk Development Year With an Obscenely Long Title”

  1. Yay! Nice to see all the things that happened with Nepomuk last year. No wonder you’ve been so busy!

    Thanks for mentioning some of the nice side effects of Bangarang using nepomuk! Nepomuk is so nice to develop with (and we’re only just scratching the surface).

  2. A great thing would be integrating the Save dialog with a special FUSE application for archiving files (something like Apple’s Time Machine).

  3. I am running KDE SC 4.3.95 right now but I cannot seem to find the search thing on Dolphin anywhere. How do I enable nepomuk based search in Dolphin? The timeline:/ kio slave works though.

    Btw, I am running 4.3.95 on openSUSE 11.2.

  4. Nice read. Nothing was really new to me, but browsing through this overview of what happened in Nepomuk land in 2009 is really astonishing.

    I am writing this from one of the two laptops running the second KDE SC 4.4 release candidate and I have to say that with its enormous increases in performance, reliability and accuracy, using Nepomuk has become a real pleasure. To me, it seems as if Nepomuk has reached the stage in which it can take off and become pervasive within KDE.

    Your work on this really is invaluable. I hope you get sufficient pleasure from it, despite the hiccups here and there (hate mail, etc.).

    To a happy 2010!

  5. I am also using the RC2 release of KDE SC 4.4 and must say that Nepomuk is just fast.

    I just have small hiccups whose cause I cannot pinpoint: the data is not indexed in real time. For example, if I save a file and annotate it, or annotate a web page, I cannot query them right away. But then I am using Mandriva Cooker etc.

    This was a very good overview of last year’s Nepomuk progress. And now in KDE SC 4.4 it works well enough that we can show it to others.

    • Yes, Nepomuk has become extremely careful to not interfere with the user’s usage of the computer. In 4.3 it would grab lots of CPU and I/O and sometimes stall some apps completely. In 4.4 RC2, the user experience has become a breeze compared to what we had. But this remedy may have taken it just a little bit too far? I guess it depends on the system that is used and what the user expects from Nepomuk.

      My observation has been that in 4.4 RC2 the Strigi indexer is only started when I am not doing anything which leads to CPU or hard drive usage. As soon as I stop scrolling, opening files, etc., the indexer kicks in. If everything else is indexed already, a short time of inactivity should get those files into Nepomuk.

      Maybe a button or option “thorough indexing”/“force indexing”(?) in the Nepomuk systray applet should make it possible to have the user decide that Nepomuk can temporarily grab CPU cycles and I/O bandwidth so that new or modified files get added quickly?

      • That was a quick change which got pushed into rc2. I already changed that again and now strigi is not suspended on user interaction but slowed down.
        KDE SC 4.5 will have config options for that behaviour.

          • To extend that question, a database update about the file being opened/saved is neither executed right away nor queued for later but has to be picked up by Strigi separately later on? This would not seem like the optimal way to do it to me.

  6. Hey Sebastian,

    according to Planet KDE, you have 543 comments now on this article. That is pretty amazing.

    What did you do, give out free pizzas? ;)

    PS: Good article, I was just kidding (but the bit about the comments is true).

    Regards, Mark.
