After a little silence during which I was occupied with Eastern and OpenLink related work I bring you news about the second Nepomuk task: the KActivityManager crash.
Ivan Cukic already “fixed” the bug by simply not using Nepomuk but an SQLite backend (at least that is how I understood it, correct me if I am wrong). However, I wanted to fix the root of the original problem.
Soprano provides the communication channel between Nepomuk and its clients. It is based on a very simple custom protocol going through a local socket. So far QLocalSocket, ie. Qt’s implementation was used. The problem with QLocalSocket is that it is a QObject. Thus, it cannot live in two threads at the same time. The hacky solution was to maintain one socket per thread. Sadly that resulted in complicated maintenance code which was impossible to get right. Hence crashes like #269573 or #283451 (basically any crash involving The Soprano::ClientConnection) were never fixed.
A few days ago I finally gave up and decided to get rid of QLocalSocket and replace it with my own implementation. The only problem is that in order to keep Windows compatibility I had to keep the old implementation around by adding quite a lot of #ifdefs.
And now I could use some testers for a Soprano client library that does only create a single connection to the server instead of one per thread. I already pushed the new code into Soprano’s git master. So all you need to do is run KDE on top of that.
Oh, and while at it I finally fixed the problem with re-connecting of clients. So now a restart of Nepomuk will no longer leave the clients with dangling connections, unable to perform queries. That fix, however, is in kdelibs.
Well, the day was long, I am tired, and this blog post feels a little boring. So before in addition to that it gets too long I will stop.
One Bug has been driving people crazy. This is more than understandable seeing that the bug was an endless high CPU usage by Virtuoso, the database used in Nepomuk. Kolab Systems, the Free Software groupware company behind Kolab, a driving force behind Akonadi, sponsored me to look into that issue.
Finding the issue turned out to be a bit harder than I thought, coming up with a fix even more so. In the process I ended up improving the Akonadi Nepomuk Email indexer/feeder in several places. This, however useful and worthwhile, turned out to be unrelated to the high CPU usage. Virtuoso was not to blame either. In the end the real issue was solved by a little SPARQL query optimization.
Application developers against Akonadi and Nepomuk might want to keep that in mind: The way you build your queries will have dramatic impact on the performance of the whole system. So this is also where opimizations are likely to have a lot of impact in case people want to help improve things further. Discussing query design with the Nepomuk team or on the Virtuoso mailing list can go a long way here.
So thanks to the support from Kolab Systems, Virtuoso is no longer chewing so much CPU, and Akonadi Email indexing will work a lot smoother with KDE 4.8.2.
This is just a little blog entry about the impact that the ontologies can have on functionality.
The ontologies are a set of vocabularies describing the types of resources stored in Nepomuk, the possible relations between these types, and the possible annotations. We have for example a type for local files, one for an address book entry, one for a person, one for music content and so on. We also have relations that describe that some person is the author or some piece of content and so on.
These ontologies are maintained in the Shared-Desktop-Ontologies project – to my knowledge the only real open-source project developing RDF ontologies.
Now to the actual topic. There once was a bug. Like so many other bugs it talked about file indexing in Nepomuk and like so many other bugs it said that some file could not be indexed. First it was Nepomuk’s fault, then it was the fault of libstreamanalyzer, but in the end I realized: there was a bug in the ontologies. More specificly in NMM – the Nepomuk MultiMedia ontology. (Granted this was not really the source of the hang the bug talks about but it was the reason the file could not be indexed.)
The problem was the domain of the nmm:setSize property. Each property has a domain and a range – the domain defines on which type of resource the property can be set, the range defines the type of the value. In other words they are defining the subject and object type of the triple. The domain is always a resource type (rdfs:Class), the range a resource or a literal type (typically one defined in the XML schema). In this case the domain of nmm:setSIze was set as nmm:MusicPiece whereas it should have been nmm:MusicAlbum. Thus, Nepomuk rejected the data generated by libstreamanalyzer as being invalid due to using an invalid domain. (Update: Nepomuk treats RDF data in a closed-world fashion. In comparison to the open-world approach which is typical for RDF/S resource types are not inferred from their relations. In an open-world situation the resource would simply end up being both a nmm:MusicPiece and a nmm:MusicAlbum.)
The solution is shared-desktop-ontologies 0.8.1 with the fixed domain. Installing it will make Nepomuk re-parse the changed ontology and indexing the mp3 files in question will finally work.
Well, this was pretty verbose for a rather small issue. Still it gave a little introduction into how the ontologies are used in Nepomuk. One more thing to take care of in the “Nepomuk universe”.
And as always:
There was not that much activity last week. That is simply because I took a few days off to spend time with my family on a farm – mostly so that my daughter could ride a pony every day.
Still, there are a few words I can say regarding Nepomuk bugs: 4 crashes have been reported for KDE 4.7.3. One of them I already fixed, one has a patch which is awaiting testing, one is very confusing and made me contact the mighty Dario for help, and the last one is the akonadi feeder which still has a memory leak. Apart from that blogging about my inotify usage had a very nice side-effect: Marcel Wiesweg from Digikam wants to use KInotfy and stumbled upon a serious bug which for some stupid reason I never caught. That bug means that files with umlauts and friends in their paths never keep their meta-data when moved. Urgh!
So KDE 4.7.4 will again contain a bunch of fixes. The hunt is not over yet. At least now all bugs are nicely categorized which makes triaging much easier.
And in other news: thank you so much for the amazing fundraiser. Only about 800€ to the goal which I feared was reaching for the stars. This is given me the energy to go on and try to find the required funding. No one stepped up so far but I am still hopeful as some are still “in discussion”. Keep your fingers crossed (and send ideas my way).
Now that KDE 4.7.3 has been released let me look back onto the work that I put into it over the last weeks.
- I fixed four actual crashes in 4.7.3. On first glance this might not sound like much but these four crash fixes entail 38 duplicates.
- I finally managed to close the memory leak in the file watcher service.
- I significantly improved the file indexer:
- Exclusion filters are now also correctly taken into account for folders.
- .xsession-errors is now always excluded from indexing.
- Rapidly changing files are only indexed once closed. This results in a lot less IO.
- The previous change also results in torrent downloads being indexed after finished.
- Files that are written over and over (like IRC logs for example) are only re-indexed once every five seconds.
- Nepomuk now always extracts the plain text from PDF files via pdftotext. This is a hack to make sure that we can at least search all PDFs by content. The next step will be to extract meta-data like title and author via poppler. (This is required since the PDF analyzer in libstreamanalyzer/Strigi is not powerful enough yet.)
- Symbolic links now have the correct mime type which means better search results.
- In case the indexer gets stuck (runs forever on one file) it is killed after a period of time.
- Jos van den Oever fixed a bunch of issues in libstreamanalyzer (Strigi) which results in less crashes and less endless PDF indexing. Stay tuned for Strigi 0.7.7.
- With Soprano 2.7.3 Nepomuk will now restart the storage if Virtuoso goes down due to a crash or a third-party kill.
- A running Virtuoso instance which was not shut down due to a crashed or killed Nepomuk will now gracefully be shut down before starting a new instance. This solves some startup issues.
- A small query performance improvement based on a pointless UNION.
- Smit Shah backported his patch which gets rid of the flickering Nepomuk indexer icon in the system tray. It now only becomes active if the indexer has been working for a certain period of time.
All in all 15 bugs are marked with “FIXED-IN: 4.7.3”. This does not include the fixes and improvements I made which did not have matching reports.
Today the next round of Nepomuk stability and performance begins. If all goes well KDE 4.7.4 should be rock-solid when it comes to Nepomuk. Thanks a lot for your continued support. I am still hopeful that I will find a more permanent solution soon:
… but it would not work. Not on my main system, not even in the VirtualBox which I only set up for that purpose. That is frustrating. There is a set of crash reports in bugs.kde.org for which I attached patches. But I cannot test them. Nepomuk simply does not want to crash for me!
Thus, I need your help. Anyone who manages to crash Nepomuk and is able to test a patch please have a look at those patches, bug 257176 even suggests that it might be prudent to apply more than one at the same time.
Anyway, I am now down to 111 open crash reports for which I did not request additional info yet (this is either: “does it still crash in 4.7.1” or “please test this patch”). That means a total of 140 crash reports have been closed since 17.09.2011. Most of them are in fact duplicates but that is tedious work also. ;)
As promised I will not stop until I closed all Nepomuk crashes on bugs.kde.org. And I do not mean closing as in “WONTFIX/INVALID”. I mean really close as in “FIXED” or “DUPLICATE”.
This is the first status report on that effort: 106 bugs closed thus far. Sure, 96 of those are duplicates, but nonetheless it is a good start. In addition to that I posted patches for 6 bugs which still need testing.
More to come soon.
Yes, sometimes it is. And sometimes it is a good thing that David Faure does not answer your pings because it makes you write test cases. And sometimes these test cases actually reveale the bug you have been hunting for months. And sometimes searching for the bug makes you refactor and simplify code in the process. This is exactly what happend with the annoying “reload bug” of the Nepomuk query KIO slave. It was responsible for results sometimes not showing up before hitting F5 a few times. Well, that is history. The present brings a better design using a QWaitCondition instead of a local event loop (which was ugly anyway and I have no idea what made me using it in the first place) which as a side effect also fixes the bug and simlifies the code. (And I mean “simplify”, not “making it simple”. The code is still far from simple.)
That’s already it. Just wanted to share that. More search goodies when I blog about Adam’s GSoC work.
This is how Soprano (and with it Nepomuk) goes down when run against Redland 1.0.9:
symbol lookup error: /usr/lib64/redland/librdf_storage_sqlite.so: undefined symbol: librdf_storage_register_factory
Redland 1.0.8 works without problems. This is mainly for informational purposes and to propose to distributors to downgrade. I will also post a bug report with Redland and try to figure out what the problem is. But in the meantime it might make sense to not live on the edge for once. ;)