About Sebastian Trüg

Working hard on Nepomuk, the Semantic Desktop for KDE, and on the CD/DVD burner K3b. This blog, however, is mostly about Nepomuk.

TPAC 2012 – Who Am I (On the Web)?

Last week I attended my first TPAC ever – in Lyon, France. Coming from the open-source world and events like Fosdem or the ever brilliant Akademy, I was not sure what to expect. Should I pack a suit? On arrival all my fears were blown away by an incredibly well-organized event with a lot of nice people. I felt very welcome as a newbie; there was even a breakfast for the first-timers with some short presentations giving an overview of the W3C’s work in general and the existing working groups. So before getting into any details: I would love for this to become a regular thing (not sure it will though, seeing that next year the TPAC will be in China).

My main reason for going to the TPAC was identity on the Web, or WebID for short. OpenLink Software is a strong supporter of the WebID identification and authentication system. Thus, it was important to be present for the meeting of the WebID community group.

The meeting, with roughly 15 people attending, sparked some interesting discussions. The most heatedly debated topic was that of splitting the WebID protocol into two parts: 1. identification and 2. authentication. The reason for this is not technical at all but political. The WebID protocol, which uses public keys embedded in RDF profiles and X.509 certificates containing a personal profile URL, has always had trouble being accepted by several working groups and people. So in order to lower the barrier for acceptance and to level the playing field, the idea was to split the part which is indisputable (at least in the semantic web world) from the part that people really have a problem with (TLS).

This led to a very simple definition of a WebID, which I will repeat in my own words since it is not written in stone yet (or rather “written in spec”):

A WebID is a dereferenceable URI which denotes an agent (person, organization, or software). It resolves to an RDF profile document uniquely identifying the agent.

Here “uniquely identify” simply means that the profile contains some relation of the WebID to another identifier. This identifier can be an email address (foaf:mbox), it can be a Twitter account, an OpenID, or, to restore the connection to the WebID protocol, a public key.
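To make this concrete, here is a minimal sketch of what such a profile could contain, written as a SPARQL INSERT DATA statement (the WebID, name, and key values are made up for illustration; cert: is the W3C certificate ontology used by the WebID protocol):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX cert: <http://www.w3.org/ns/auth/cert#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

INSERT DATA {
  # The WebID denotes a person...
  <https://example.org/people/alice#me> a foaf:Person ;
    foaf:name "Alice" ;
    # ...related to another identifier, here an email address...
    foaf:mbox <mailto:alice@example.org> ;
    # ...and, restoring the connection to the WebID protocol,
    # to a public key.
    cert:key [
      a cert:RSAPublicKey ;
      cert:modulus "00cb24ed85d64d794b"^^xsd:hexBinary ;
      cert:exponent 65537
    ] .
}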

The nice thing about this separation of identity and authentication is that the WebID is now compatible with any of the authentication systems out there. It can be used with WebID-Auth (this is what I call the X.509 certificate + public key in agent profile system formerly known as WebID), but also with OpenID or even with OAuth. Imagine a service provider like Google returning a WebID as part of the OAuth authentication result. In the case of OpenID, the OpenID itself could be the WebID, or another WebID could be returned after successful authentication. The client could then dereference it to get additional information.

This is especially interesting when it comes to WebACLs. We could now imagine defining WebACLs on WebIDs from any source. Using mutual owl:sameAs relations these WebIDs could be declared to denote the same person, which the authorizing service could then use to build a list of identifiers matching the one used in the ACL rule.
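As a sketch of how an authorizing service could evaluate such a rule (acl: is the W3C Web Access Control vocabulary; the URIs and the idea of expanding the owl:sameAs closure into a VALUES list are my illustration, not part of any spec):

PREFIX acl: <http://www.w3.org/ns/auth/acl#>

ASK {
  # A rule granting read access to some document...
  ?rule acl:accessTo <https://example.org/docs/report> ;
        acl:mode acl:Read ;
        acl:agent ?agent .
  # ...matches if its agent is the authenticated WebID or any
  # WebID connected to it via mutual owl:sameAs relations.
  VALUES ?agent {
    <https://example.org/people/alice#me>
    <https://openid.example/alice>
  }
}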

In any case this is a definition that should pose no problems to working groups such as the one on the Linked Data Platform. Even the OpenID or OAuth communities should see the benefits of identifying people via URIs. In the end the Web is a Web of URIs…

And Now For Something Completely Different: Resizable Bootstrap Modals

(Update: Now with better overflow handling.)

I have been doing more CSS + JS than I would have liked these past months. I am still a newbie, but today I did something which I am happy enough with to turn into a short blog post.

Twitter’s Bootstrap CSS framework is a nice basis to get your web page going, and it comes with modal windows. jQuery UI has the resizable function, which allows you to add resize handles to any div. In order to make this work with a Bootstrap modal I did the following things:

1. Slightly change the positioning of the jQuery UI resize handles so they do not force scrollbars on the Bootstrap modal and do not add a weird border:

.ui-resizable-s {
  bottom: 0;
}
.ui-resizable-e {
  right: 0;
}

2. A little bit of JS magic to properly reposition the modal once it has been resized:

$(".modal").on("resize", function(event, ui) {
    ui.element.css("margin-left", -ui.size.width/2);
    ui.element.css("margin-top", -ui.size.height/2);
    ui.element.css("top", "50%");
    ui.element.css("left", "50%");

    $(ui.element).find(".modal-body").each(function() {
      $(this).css("max-height", 400 + ui.size.height - ui.originalSize.height);
    });
});

Just look at the Bootstrap modal’s margin in Firebug or something similar to understand the first two statements. The top and left values are to fix the positioning in Opera. The last statement is to also resize the modal body for proper overflow handling.

Now just enable resize on all modals or on whatever modal you want:

$(".modal").resizable();

Digitally Sign Emails With Your X.509 Certificate in Evolution

Digitally signing emails is always a good idea. People can verify that you actually sent the mail and they can encrypt emails in return. A while ago Kingsley showed how to sign emails in Thunderbird. I will now follow up with a short post on how to do the same in Evolution.

The process begins with actually getting an X.509 certificate including an embedded WebID. There are a few services out there that can help with this, most notably OpenLink’s own YouID and ODS. The former allows you to create a new certificate based on existing social service accounts. The latter requires you to create an ODS account and then create a new certificate via Profile edit -> Security -> Certificate Generator. In any case make sure to use the same email address for the certificate that you will be using for email sending.

The certificate will actually be created by the web browser, making sure that the private key never leaves your machine.

If you are a Google Chrome user you can skip the next step since Evolution shares its key storage with Chrome (and several other applications). If you are a user of Firefox you need to perform one extra step: go to the Firefox preferences, into the advanced section, click the “Certificates” button, choose the previously created certificate, and export it to a .p12 file.

Back in Evolution’s settings you can now import this file:

To actually sign emails with your shiny new certificate stay in the Evolution settings, choose to edit the Mail Account in question, select the certificate in the Secure MIME (S/MIME) section, and check “Digitally sign outgoing messages (by default)”:

The nice thing about Evolution here is that in contrast to Thunderbird there is no need to manually import the root certificate which was used to sign your certificate (in our case the one from OpenLink). Evolution will simply ask you to trust that certificate the first time you try to send a signed email:

That’s it. Email signing in Evolution is easy.

Virtuoso 6.1.6 and KDE 4.9

Shortly after KDE 4.9 hits the net, Virtuoso 6.1.6 follows. Virtuoso 6.1.6 comes with a ton of fixes, improvements, and optimizations, and updating is highly recommended for the best Nepomuk experience.

Virtuoso 6.1.6 has been tested by the Nepomuk team in cooperation with OpenLink Software before its release. It is the recommended release for Nepomuk. This is not only true for KDE 4.9 but for any version before it.

Get the sources while they are hot and build your packages.

Debugging Nepomuk/Virtuoso’s CPU usage

Rabauke wrote a very good blog post about debugging Nepomuk and Virtuoso query performance on OpenSuse.

Also, David Faure posted a Virtuoso patch on the Nepomuk mailing list which makes the Virtuoso status() command output the full queries instead of truncated ones. I will try to get the latter into Virtuoso upstream, maybe with an additional parameter to the function.

Nepomuk Tasks: KActivityManager Crash

After a little silence, during which I was occupied with Easter and OpenLink-related work, I bring you news about the second Nepomuk task: the KActivityManager crash.

Ivan Cukic already “fixed” the bug by simply not using Nepomuk but an SQLite backend (at least that is how I understood it, correct me if I am wrong). However, I wanted to fix the root of the original problem.

Soprano provides the communication channel between Nepomuk and its clients. It is based on a very simple custom protocol going through a local socket. So far QLocalSocket, i.e. Qt’s implementation, was used. The problem with QLocalSocket is that it is a QObject. Thus, it cannot live in two threads at the same time. The hacky solution was to maintain one socket per thread. Sadly that resulted in complicated maintenance code which was impossible to get right. Hence crashes like #269573 or #283451 (basically any crash involving Soprano::ClientConnection) were never fixed.

A few days ago I finally gave up and decided to get rid of QLocalSocket and replace it with my own implementation. The only problem is that in order to keep Windows compatibility I had to keep the old implementation around by adding quite a lot of #ifdefs.

And now I could use some testers for a Soprano client library that creates only a single connection to the server instead of one per thread. I already pushed the new code into Soprano’s git master. So all you need to do is run KDE on top of that.

Oh, and while at it I finally fixed the problem with clients re-connecting. So now a restart of Nepomuk will no longer leave the clients with dangling connections, unable to perform queries. That fix, however, is in kdelibs.

Well, the day was long, I am tired, and this blog post feels a little boring. So before it also gets too long, I will stop.

Nepomuk Tasks: Let The Virtuoso Inferencing Begin

Only four days ago I started the experiment of funding specific Nepomuk tasks through donations. As with last year’s fundraiser I was uncertain whether it was a good idea. That, however, changed when, only a few hours later, two tasks had already reached their donation goal. Again it became obvious that the work done here is appreciated and that the “open” in Open-Source is understood for what it actually is.

So despite my wife not being overly happy about it I used the weekend to work on one of the tasks: Virtuoso inferencing.

Inference?

As a quick reminder: the inferencer automatically derives new information from the data in the database. While Virtuoso can handle pretty much any inference rule you throw at it, we stick to the basics for now: if resource R1 is of type B and B derives from A, then R1 is also of type A. And: if R1 has property P1 with value “foobar” and P1 is derived from P2, then R1 also has property P2 with value “foobar”.
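To see the class rule in action, here is a tiny sketch using two classes from the real Nepomuk ontologies (the resource URI is made up):

# Stored in the database:
#   <nepomuk:/res/R1>  a  nco:PersonContact .
#   nco:PersonContact  rdfs:subClassOf  nco:Contact .
# An inferencer makes this query match R1 as well, even though
# no nco:Contact triple was ever written:
select ?r where {
  ?r a nco:Contact .
}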

Crappy Inference

This is already very useful and even mandatory in many cases. Until now we used what we called “Crappy Inferencing 1 & 2”. Crappy Inferencer 1 was based on work done in the original Nepomuk project: it simply inserted triples for all sub-class and sub-property relations. That way we could simulate real inference by querying for something like

select * where {
  ?r ?p "foobar" . 
  ?p rdfs:subPropertyOf rdfs:label .
}

and catch all sub-properties of rdfs:label like nao:prefLabel or nie:title. While this works, it means bad performance, additional storage, and additional maintenance.

Crappy Inferencer 2 was even worse. It inserted rdf:type triples for all super-classes. This means that it would look at every added and removed triple to check if it was an rdf:type triple. If so, it would add or remove the appropriate rdf:type triples for the super-types. That way we could do fast type queries without relying on Crappy Inferencer 1 and its rdfs:subClassOf method. But this meant even more maintenance and even more storage space wasted.

Introducing: Virtuoso Inference

So now we simply rely on Virtuoso to do all that and it does such a wonderful job. Thanks to Virtuoso graph groups we can keep our clean ontology separation (each ontology has its own graph) and still stick to a very simple extension of the queries:

DEFINE input:inference <nepomuk:/ontographgroup>
select * where {
  ?r rdfs:label "foobar" .
}

Brilliant. Of course there are still situations in which you do not want to use the inferencer. Imagine for example the listing of resource properties in the UI. This is what it would look like with inference:

We do not want that. Inference is intended for machines, not for humans, at least not like this. So, since back in the day I did not think of adding query flags to Soprano, I simply introduced a new virtual query language: SparqlNoInference.

Resource Visibility

While at it I also improved the resource visibility support by simplifying it. We do not need any additional processing anymore. This again means less work on startup and with every triple manipulation command. Again we save space and increase performance. But it also means that resource visibility filtering no longer works as before. Nepoogle, for example, will need adjustment to the new way of filtering. Instead of

?r nao:userVisible 1 .

we now need

FILTER EXISTS { ?r a [ nao:userVisible "true"^^xsd:boolean ] }
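In a complete query this could look as follows (a sketch, assuming visibility is now derived from the resource’s type as described above):

select ?r where {
  ?r rdfs:label "foobar" .
  # only match resources whose type is marked as user-visible
  FILTER EXISTS { ?r a [ nao:userVisible "true"^^xsd:boolean ] }
}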

Testing

The implementation is done. All that remains are the tests. I am already running all the patches, but I still need to adjust some unit tests and maybe write new ones.

You can also test it. The code changes are, as always, spread over Soprano, kdelibs and kde-runtime. Both kdelibs and kde-runtime now contain a branch “nepomuk/virtuosoInference”. For Soprano you need git master.

Look for regressions of any kind so we can merge this as soon as possible. The goal is KDE 4.9.

Akonadi, Nepomuk, and A Lot Of CPU

One bug has been driving people crazy. This is more than understandable, seeing that the bug was endless high CPU usage by Virtuoso, the database used in Nepomuk. Kolab Systems, the Free Software groupware company behind Kolab and a driving force behind Akonadi, sponsored me to look into that issue.

Finding the issue turned out to be a bit harder than I thought, coming up with a fix even more so. In the process I ended up improving the Akonadi Nepomuk Email indexer/feeder in several places. This, however useful and worthwhile, turned out to be unrelated to the high CPU usage. Virtuoso was not to blame either. In the end the real issue was solved by a little SPARQL query optimization.

Application developers building on Akonadi and Nepomuk might want to keep that in mind: the way you build your queries will have a dramatic impact on the performance of the whole system. So this is also where optimizations are likely to have a lot of impact in case people want to help improve things further. Discussing query design with the Nepomuk team or on the Virtuoso mailing list can go a long way here.
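To give a made-up example of the kind of difference query design can make (this is not the actual Akonadi query; nmo:messageSubject is only used for illustration): an unselective pattern plus a FILTER forces the database to touch huge parts of the store, while a selective triple pattern lets it use its indexes.

# Slow sketch: matches every triple, then filters on a regex.
select ?msg where {
  ?msg ?p ?o .
  FILTER(regex(str(?o), "^Re: foobar"))
}

# Fast sketch: a selective pattern the database can index.
select ?msg where {
  ?msg nmo:messageSubject "Re: foobar" .
}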

So thanks to the support from Kolab Systems, Virtuoso is no longer chewing so much CPU, and Akonadi Email indexing will work a lot smoother with KDE 4.8.2.