Semantic Save – Prototype

Before I go to bed let me quickly present the first prototype of the semantic save dialog:

I tried to take into account the great ideas you all gave me, the mockups provided by some of you (very nice work). Of course my current QWidget-based design does not allow to easily create some of the fancier things that were suggested but that can be done later on once the functionality is all there.

There is still a lot of work to be done but the first version is working and I can go to bed now.

A Million Ways To Do It Wrong

It is a sad truth: when it comes to creating data for the Nepomuk semantic desktop there are a million ways to do it wrong and basically only one way to get it right. Typically people will choose from the first set of ways. While that is of course bad they are not to blame. Who wants to read page after page of documentation and reference guide? Who wants to dive into the depth of RDF and all that ontology stuff when they just need to store a note (yes, this blog was inspired by a real problem). Nobody – that’s who! Thus, the Nepomuk API should do most of the work. Sadly it does not. It basically allows you to do everything. Resource::setProperty will happily use classes or even invalid URLs as properties without giving any feedback to the developer. Why is that? Well, I suppose there are at least three reasons: 1. Back in the day I figured people would almost always use the resource-generator to create their own convenience classes which handle the types and properties properly, 2. The Resource class is probably the oldest part of the whole Nepomuk stack, and 3. basic lack of time, drive and development power.

So what can we do about this situation? Vishesh and me have been discussing the idea of a central DBus API for Nepomuk data management a million times (as you can see today “a million” is my goto expression when I want to say “a lot”). So far, however, we could not come up with a good API that solves all problems, is future-proof (to a certain extend), and performs well. That did not change. I still do not know the solution. But I have some ideas as to what the API should do for the user in terms of data integrity.

  1. Ensure that only valid existing properties are used and provide a good error message in case a class or an invalid URL or something non-existing is used instead. This would also mean that one could only use ontologies that have been imported into Nepomuk. But since the ontology loader already supports fetching ontologies from the internet this should not be a big problem.
  2. Ensure that the ranges of the properties are honoured. This is pretty straight-forward for literal ranges. In that case we could also do some fancy auto-conversion to simplify the usage but in essence it is easy. The case of a non-literal ranges is a bit more tricky. Do we want to force proper types or do we assume that the object resource has the required type? I suppose flags would be of use:
    • ClosedWorld – It is required that the object resource has the type of the range. If it has not the call fails.
    • OpenWorld – The object resource will simply get the range type. This is not problem since resources can be of several types.

    This would also mean that each property needs to have a properly defined range. AFAIK this is currently not the case for all NIE ontologies. I think it is time for ontology unit tests!

  3. Automatically handle pimo:Things to a certain extend: Here I could imagine that trying to add a PIMO property on a resource would automatically add it to the related pimo:Thing instead.

Moving this from the client library into a service would have other benefits, too.

  • The service could be used from other languages than C++ or even from applications not using KDE.
  • The service could perform optimizations when it comes to storing triples, updating resources, caching, you name it.
  • The service could provide change notifications which are much more useful than the Soprano::Model signals which are pretty useless.
  • The service could perform any number of integrity tests before executing the actual commands on the database, thus improving the quality of the data in Nepomuk altogether.

This blog entry is not about presenting the one solution to solve all our problems. It is merely a brain-dump, trying to share some of the random thoughts that go through my head when taking a walk in the woods. Nonetheless this is an issue that needs tackling at one point or another. In any case my ideas are saved for the ages. :)

What happens if Mr. Nepomuk meets a bunch of Telepathyans?

A fun and very productive weekend is what happens!

Yesterday evening I came back from Cambridge where I attended the Telepathy-KDE sprint (note to self: never again fly with easyJet) which was smoothly organized by George Goldberg. A lot has already been said about the work at the sprint: Daniele “drdanz” Domenichelli provided us with nice pictures (I am looking really weird in the group photo), George Kiagiadakis gave a nice overview, and George G. himself spammed identi.ca with tons of comments on the sprint. Thus, obviously I will focus on the Nepomuk parts of the sprint.

Since George G. and, thus, Telepathy-KDE is one of the most fearless (as in: does not fear to try all the broken Nepomuk features and then ask me to fix them) Nepomuk users/developers he had a list of topics for me to look at. There was the issue that the query service did not scale since it created a separate thread for each query. I quickly fixed that using QThreadPool and a predefined number of query threads which made the contact list populate correctly.

Apart from that George has his own extensions to NCO which provide everything Telepathy-KDE needs to store all (!) its data in Nepomuk. He wanted me to review them again along with the data libktelepathy creates in Nepomuk. Most of it looks very good – but as I said George has been spending quite a lot of time understanding Nepomuk – not only at the two sprints. Thus, we should merge those extensions into NCO as soon as possible.

But his biggest problem was a missing GUI tool for debugging the data in Nepomuk. Thus, I sat down and fixed Nepomukshell. And as soon as I decided that it should be a developer tool and not an end user targeted application things got very easy. Nepomukshell is now “NepSaK – The Nepomuk Swiss Army Knife” (with a capital K because sometimes I like to be old-school) and has three modes: resource browsing, SPARQL querying, and resource editing. The code can still be found in playground – I will try to release that soon – and it depends only on the nepomukextras library. Since we all like looking at pictures I will show you some instead of explaining the features in detail.

It will eat your cat!

Browse resources via the class tree

Quick and dirty SPARQL querying - sopranocmd in a GUI

Edit resources without any clutter (or convenience for that matter)

This little tool should make life for us a bit easier. And it will probably grow over time providing all kinds of debugging and maintenance features.

Nepomuk Data Layout

I am happy to announce that I just finished writing an article about the data layout in Nepomuk. I think it is another must-read if you want to work with Nepomuk data, either querying it or creating data yourself. The article is another one in the series of techbase tutorials about Nepomuk. Well, this was short and fairly dry…

Reblog this post [with Zemanta]

The First Nepomuk Workshop – It’s a Wrap

The first Nepomuk workshop and the first KDE workshop held in Freiburg ever is over. It was great but short. I could have worked on with these guys for much longer. It was a lot of fun to explain the Nepomuk ideas directly and having people not only listening but also understanding and realizing them.

On Friday we started out slowly. Due to different travel times and also some stupidity from travel agencies and German bus drivers we were only complete at around sixish. To get in the mood I had everyone explain what they wanted to achieve over the weekend or what they thought could be interesting to work on with Nepomuk. The beginning was not easy, at least I feared that we would have trouble to actually getting to work. After all, you do not start to work with Nepomuk just like that. It is too confusing and different for that. But the ideas were very good and the people very interested and eager.

So on Saturday my fears of me not being able to handle it were vanished. I explained about PIMO (just to confuse everyone for real) and showed what I had done with respect to NLP in the Scribo project (Tom already mentioned it in his blog although he confused it with PIMO. No big deal, there are way too many project and technology names to get mixed up). Sebastian Faubel showed his very interesting work on a replacement for the Gnome open and save file dialog. He also uses RDF to store meta data and then based on that decides on a location for the documents in a fixed (not really fixed but based on a template) folder tree. After that coding began.

What did we do?

Well, I did not really code anything. There was no time for that. I was too busy discussing with and helping the others. And they did do cool stuff. Let me start by mentioning my hero of the weekend: Tobias König aka tokoe. He wanted to improve the performance of the Akonadi Nepomuk feeder agents which export contact and email meta data from Akonadi to Nepomuk. He did that by introducing a new fast mode into the nepomuk resource generator. Now he would not be tokoe if he would not have been shocked by the hack that is rcgen. So over the weekend he cleaned it up. He cursed, he sweated, he nearly went mad, but he did it! And since we defined it as a bug fix it is even in 4.3. Great work, Tobias.

Then there was Tom Albers. Now he already blogged about what he did himself. But I will summarize it anyway: he actually dared and integrated the Scribo-based annotation suggestions into Mailody. I will blog about details on that later. But the idea is that the email body is analysed and a plugin system generates possible annotations such as dates, cities, persons, and also possible events that are mentioned in the email. Not only did he integrate it into Mailody, he also found two bugs that would have been showstoppers for the Mandriva Scribo demo yesterday. So thanks a lot, Tom.

Raptor. Now Raptor is a cool project. Raptor sets out to replace or provide an alternative to Kickoff. And their idea for Nepomuk integration is to remember the launches of applications. For starters “only” when an application was launched. This then allows to show more frequently used applications with bigger icons. But it will not stop there. Application launches can be linked to the current context (or the current Plasma activity). I might use KPresenter at work all the time but never at home. Also it would be possible to link files to the application launch that was used to open them. And so on. Alessandro, Francesco, and Lukas quickly understood how to create and ontology and use it in KDE. They now have their dedicated application launch ontology and use it via the Nepomuk libs in Raptor. I hope that the ontology can at some point be made into somewhat of a standard in the desktop ontology project.

Daniel, while normally being an Amarok developer (he did the Nepomuk integration in the GSoC last year), was very eager on making Nepomuk really useful on the basic level. So we discussed handling of removable storage and nicer resource URI design a lot. In the end we decided:

  1. All files will have a random URI that never changes.
  2. All file systems will be represented in Nepomuk with their mount point and their mount status (Daniel already started working on the service that handles that. Also the tracker guys are working on something similar. Matching the ontologies should be fairly easy as the concepts are the same.)
  3. All file URLs will be relative.
  4. All files will have a link to their storing file system.

This helps in solving a bunch of problems. Files on removable storages like USB sticks which can be mounted in different places are handled the exact same way as files on local file systems. Moving a file in most cases only means to update one property: the relative URL. Only if the file is moved to another file system that link has to be changed, too. Through the mount state flag on the file system in Nepomuk it is very simple to see if a file is currently available or not. A search client can simply tell the user that they have to mount the file system in question to access the file. I think this is a fine solution and since I tried to design everything without relying on the file resource’s URI being the file’s URL the transition should be fairly simply. A goal for 4.4.

George already blogged about his Nepomuk integration work with Telepathy. What is so great about his work is that he actually uses PIMO in a productive way: one PIMO Person represents one person that can be contacted via Telepathy. And this one pimo:Person has a set of occurrences being the actual contacts like jabber accounts and so on. Thus, if you want to chat with a person you simply click their icon and Telepathy will open one of the available systems, depending on their online status. One could even think of email as a fallback. I find this especially interesting since it so clearly uses the two different layers of information defined via the PIMO ontology: on the lower level we have all the desktop resources like files and emails and jabber accounts and so on. And on a higher level we have all the real world entities like the actual person or a project or a city represented by PIMO concepts. I hope that we will see more integration like this is the future. (I know, I know, I need to write better documentation on this.)

Marcel worked on the Nepomuk integration in Digikam. Since the Digikam team does not want to entirely rely on Nepomuk yet (with it being optional and all) he created a Nepomuk service that keeps the Digikam database and Nepomuk in sync. So rating and tagging your images in Digikam would directly be reflected in Nepomuk and the other way around. Very nice. I hope to see this hitting a stable release soon.

I had hoped to have more time to work with Peter on the meta data display in Dolphin. But sadly that dropped under the table a bit. But I was at least able to show him my crappy formatting rule system I drafted a while back and we cleaned up the display a bit: nicer labels and less useless properties shown.

What did we look like?

What about the future?

I hope that we can do this again soon. I think it was really worth it and am very happy to have done it. Thanks again to all of you. You made this a successful event.

PS: I wanted to blog earlier but first I had to sleep for two days straight. I am too old for this shit! ;)