Nepomuk Projects in GSoC 2011

After three weeks of parental leave following the birth of my second daughter (this should get me some comments even if the technical crap will not) and two weeks of catching up and preparing for the KDE 4.7 feature freeze, I finally found the peace of mind to blog. Another year, another Google Summer of Code, and again Nepomuk got three students: Phaneendra Hegde, Smit Shah, and Swair Shah. (I suspected it with Vishesh, now I know: India loves Nepomuk!)

The three of them have great projects and I am very excited to see their results:

Phaneendra Hegde – Fancy Bookmarking

Phaneendra’s project proposal outlines the development of a bookmarking tool for Konqueror and Rekonq which allows users to link web pages to projects, tasks, people, files, etc., and thereby provides an effective bookmark management system. This is something I have wanted for quite some time. Instead of organizing your bookmarks in folders and sub-folders which grow impossibly big, you relate the pages to things like people and projects, tag them, rate them, add comments, and so on. This allows you not only to search for web pages through the standard Nepomuk query facilities but also to browse your way to the wanted pages by means of tools like Ginkgo (more about that next time).

Smit Shah (Who) – Metadata Writeback

So far Nepomuk indexes all your files and lets you search or display that data. But changing the data in Nepomuk does not trigger an update of the corresponding data in the file. Smit sets out to change that with his Metadata Writeback project. The idea is rather simple: he will create a new service that hosts a new system of plugins that can write certain types of metadata. His first plugin will be based on TagLib, allowing music file metadata to be updated directly from Nepomuk. This project should finally make it possible to use Nepomuk as a full-featured backend for applications like Amarok and Digikam.

Swair Shah – Nepomuk Project Integration

Swair will be working on Project Integration. Behind this simple title hides the idea of converting your desktop into a project management tool. The two main components are a project manager, which lets you create and modify project resources that are stored in Nepomuk, and a service that maintains the currently active project. The latter makes it easy to relate the resources you are currently working with to the project and, the other way around, to access the resources related to the project. This integrates nicely with the already existing Nepomuk context service which maintains a single arbitrary resource as the current one.

Those are the projects for this year. Last year we also had three students, but this year is very different: Vishesh Handa steps in as second mentor and not only makes my life easier but also managed to get the three students interested in the first place. I cannot state this enough: Vishesh is a blessing for the Nepomuk project.

Nepo… (sorry) Google Summer Of Code 2011

Last year’s Summer of Code was a big success. Apart from two great projects Nepomuk gained the help of Vishesh Handa, the guy who made working on Nepomuk so much simpler. Hoping to repeat last year’s success I would like to remind students that there is another possibility to take part in Google’s Summer of Code this year: KDE has been accepted as a mentoring organization once again and Nepomuk is part of the project ideas line-up.

So far there are “only” four ideas but we will add more. Of course you can also come up with your own ideas for the Nepomuk semantic desktop. Please find us on the nepomuk@kde.org mailing list or on freenode IRC in the #nepomuk-kde chat room to discuss your ideas or to get feedback on questions.

But please be warned that there is already a bunch of people interested in the query parser idea. Thus, it might be in your interest (and mine) to have a look at one of the others or come up with one of your own.

Hope to see your contributions soon.

A Summer 2010 Full of Nepomuk Code

Another year, another Google Summer of Code, another 4 (yes, four!) semantic desktop projects. It is amazing. After two very successful projects in 2009 we now take it one step further with three Nepomuk projects and one Strigi project. Without further ado I give you the Nepomuk Google Summer of Code 2010 projects:

Metadata Backup, Sync and Sharing by Vishesh Handa

Ever since we started to create meta data on the desktop (by this I mean tags, ratings, and relations between resources that cannot be recreated easily) we also had the need for backup and syncing of this data. So far this area is lacking in Nepomuk. Vishesh sets out to change this situation and develop ways to sync meta data between different clients (imagine syncing your laptop with the desktop computer or the phone) or simply to back it up. This does not simply mean coding some backup GUI – it actually includes changes on the ontology (the data-) level. When syncing data between two clients (or syncing data between a client and a backup – the principle is the same) the two most complicated matters are: 1. identifying the resources which need to be merged on both ends and 2. deciding which data needs to be removed and which needs to be added.
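To make the second of these two matters a bit more concrete, here is a minimal sketch in Python. It is not Vishesh's design: it assumes that meta data can be modelled as a set of (resource, property, value) statements and that the first matter is already solved, meaning both sides use the same identifier for the same resource. The function names and the sample data are made up for illustration.

```python
def diff_statements(source, target):
    """Return (to_add, to_remove): the changes that turn target into a copy of source."""
    source, target = set(source), set(target)
    to_add = source - target      # statements only the source has
    to_remove = target - source   # statements only the target has
    return to_add, to_remove


def apply_diff(target, to_add, to_remove):
    """Apply a previously computed diff to the target statements."""
    return (set(target) - to_remove) | to_add


# Hypothetical sample data: the rating was changed on the laptop.
laptop = {
    ("file:///home/me/report.odt", "tag", "work"),
    ("file:///home/me/report.odt", "rating", 8),
}
desktop = {
    ("file:///home/me/report.odt", "tag", "work"),
    ("file:///home/me/report.odt", "rating", 5),
}

to_add, to_remove = diff_statements(laptop, desktop)
synced = apply_diff(desktop, to_add, to_remove)
```

Note that this is a one-way mirror from laptop to desktop. A real two-way sync also has to decide which side wins when both changed the same property, which is exactly where the project gets hard.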

Well, suffice it to say that Vishesh has an ambitious project ahead of him. But looking at his enthusiasm and his early involvement in KDE (he is already committing one patch after the other) I am very confident that he will succeed.

Web Metadata Extractor Framework and Service by Artem Serebriyskiy

In Nepomuk we use the Strigi system to extract meta data from files and store it in the Nepomuk database, allowing the user to search files based on their meta data. This is very useful. However, there are certain types of files that provide little or no meta data at all. Typical examples are video files. It would be very interesting to be able to search for video files by title, actors, directors, or release year. All this information is available on the Internet. So why not make use of it?

This is exactly what Artem’s project is about: extract meta data from the web and associate it with local files. Of course he will implement this as a Nepomuk service that provides a plugin system allowing for different types of extractors and that handles uncertainties and duplicate information as smoothly as possible. Look out for more cool information at your fingertips.

Nepomuk Dedicated Desktop Search GUI by Oszkar Ambrus

Let’s face it: today desktop search is still the number one use case for Nepomuk (although it was not the original motivation, but that is another story). So having a good and convenient user interface is essential for the success of the system. We have several interfaces in KDE including the search bar in Dolphin and the search runner. But all are lacking in at least two main areas: 1. the query building: so far one has to know a lot about the underlying data structures to write powerful queries; and 2. the presentation of the search results: currently the results are presented like any other folder, leaving out interesting information like a hit score or details on why the result was returned. (Actually there is a number three which I hope Oszkar will have the time to attack: since we have more than file results we need a good way to open and present these resources.)

Oszkar sets out to improve this situation and create reusable components to let the user create powerful queries without much knowledge of the data and to present the results in a convenient way. An important project that will undoubtedly yield great results.

Strigi: Stream Analyzer based on Data Structure Descriptions

Jos was kind enough to write a paragraph on the Strigi project:

Yet another project has been granted. Yulia Medvedeva will work on a new type of file analyzer for Strigi. The goal of the project is to write the structure of files down in a grammar file and generate code from the grammar or parse the grammar at runtime. Writing analyzers usually involves quite a bit of repetitive error-prone code. It also requires knowledge of C++. By writing the format in a grammar language, coding errors are avoided. In addition to that, the independence of the programming language allows the grammars to be shared with other projects.
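To give a rough feeling for the "describe the structure, do not hand-code the parser" idea, here is a small Python sketch. It is not Strigi's grammar language and not Yulia's design. The file format, the field names, and the `analyze` function are all invented for illustration, and the "grammar" is reduced to a flat list of fixed-size fields interpreted at runtime.

```python
import struct

# Hypothetical description of a tiny binary header: (field name, struct format code).
HEADER_DESCRIPTION = [
    ("magic", "4s"),    # 4 raw bytes
    ("version", "H"),   # unsigned 16 bit integer
    ("width", "I"),     # unsigned 32 bit integer
    ("height", "I"),    # unsigned 32 bit integer
]


def analyze(data, description, byte_order="<"):
    """Extract the described fields from the start of data, driven only by the description."""
    fmt = byte_order + "".join(code for _, code in description)
    size = struct.calcsize(fmt)
    if len(data) < size:
        raise ValueError("data is shorter than the described structure")
    values = struct.unpack(fmt, data[:size])
    return dict(zip((name for name, _ in description), values))


# Build a sample header and run the generic analyzer on it.
sample = struct.pack("<4sHII", b"DEMO", 2, 640, 480)
fields = analyze(sample, HEADER_DESCRIPTION)
```

Supporting a new format then means writing a new description rather than new parsing code, which is the point Jos makes about avoiding repetitive, error-prone code.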

Well, that is it for the four projects that should give Nepomuk a good push forward. I am very happy about the selection and have to say thank you to Google and the rest of the KDE mentor team for giving us this much support. It will be legendary!

GSoC Wrap-up Part 2

Last time I presented the work Adam Kidder did on Nepomuk virtual folders in the GSoC. Today the story continues with the work by Alessandro Sivieri, my second GSoC student.

Whenever we handle files on the computer we need to bother with folder structures and file names. We need to come up with good naming schemes which allow us to find our files. We need to decide several times a day in which folder a file should go – should it go into folder A or B or should I create a subfolder? In the end there is always a little bit of chaos, even with the most structured minds. Alessandro tried a different approach in his project: save and load documents based on meta data and annotations rather than file and folder names.

This is not an easy task but I dare say that he succeeded. Alessandro created two new dialogs for saving and loading documents (we do not talk about files anymore – way too technical). The saving dialog lets you create arbitrary annotations for the document using the Nepomuk annotation plugin system, which also brings in Scribo text analysis features. The loading dialog, on the other hand, uses a fancy filter system to narrow down the list of documents to open.

Saving Documents

We start by looking at the document saving dialog. Our example is KWord from which we want to save a fancy little text document. (No, it is not a test document, I really wrote this, this is real data, I assure you! … Yeah, OK, I admit it, just random words…) Hitting the save button opens up the new smart save dialog as can be seen in the screenshot below.

Smart Saving of a KWord document

The first thing we notice is that there is no filename and no folder selection. Name and folder are selected by Nepomuk. However, we get to give the document a name (it makes things much easier for us later on) and a description (in a future version applications will be able to prefill these fields with some meaningful defaults). But the interesting part is the meta data. The dialog suggests certain possible annotations which we can approve. Below the recently used annotations we have the possibility to add any annotation we want through the existing Nepomuk annotation system. Last but not least we can give the document a type. This type does not identify the document on a mime-type level but is much more real-life oriented. The idea is that users either define their own types based on pimo:Document or use ontologies that provide them. Typical examples include invoices or letters or project descriptions. This way documents are saved on a much higher abstraction level than with the classical file chooser: instead of a text file we save an invoice.

Once we have specified the meta data we want to apply to the new document and hit the save button, the smart save dialog generates a folder and file name and saves the document. We do not need to care about the location.

(Hint: there are certainly situations in which we want to use the classic file chooser. That is why the smart save dialog lets you switch over to the old ways with the simple click of a button.)

Loading Documents

But if documents are saved in some random folder which we do not know, how do we find them again? Well, that is the real beauty of the new approach. The idea is that you tell the open dialog what you want to open by specifying some details that you remember.

Let us have a look at the smart open dialog as it opens from within Okular.

smartopen-okular1

We see two main views: on the left hand side we see a list of filters and on the right hand side we see a long list of files/documents. This might look overwhelming in the beginning but wait until we specify the first detail about the document we want to open: we tell the dialog that the document has mime type image/png (yes, in the future this will look less technical) and the file view changes to show only PNG images.

smartopen-okular2

These are still way too many to search for the one we need, so we give more detail. We remember that we accessed the document sometime this week:

smartopen-okular3

Again the list of files is changed and now after only choosing two filters we are down to seven documents to choose from. Although this would be enough we do one better just to show that the filter system obviously also includes manual annotations such as tags:

smartopen-okular4

And after activating the tag filter we are down to a single document. Nice, isn’t it?

A Few Technical Details

There are a few technical aspects worth mentioning about Alessandro’s work.

First of all: he makes direct use of Adam’s work on the virtual folders. The file list on the right is a simple KDirModel listing a nepomuksearch:/?sparql=… query. I find this very nice as my two students shared knowledge and discussed their work to find good solutions to their problems.

The second thing I find important is the creation of the filter list. The list of filters is created dynamically based on the existing annotations of the files in the current selection. In essence the idea is to only show filters that would actually change the list of available files (as you can see in the last screenshot this does not work 100% yet but we are close).
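The rule "only show filters that would actually change the list" can be sketched in a few lines of Python. This is not Alessandro's code: it assumes each document carries a flat dictionary of annotations, and the function names and sample documents are invented for illustration.

```python
def useful_filters(documents):
    """Return the (key, value) filters that keep some, but not all, of the documents."""
    candidates = set()
    for annotations in documents.values():
        candidates.update(annotations.items())

    useful = []
    for key, value in sorted(candidates):
        matching = [name for name, ann in documents.items() if ann.get(key) == value]
        # A filter matching every document would not change the list, so skip it.
        if 0 < len(matching) < len(documents):
            useful.append((key, value))
    return useful


def apply_filter(documents, key, value):
    """Narrow the current selection down to the documents matching one filter."""
    return {name: ann for name, ann in documents.items() if ann.get(key) == value}


# Hypothetical selection of three documents.
documents = {
    "a.png": {"mimetype": "image/png", "tag": "holiday"},
    "b.png": {"mimetype": "image/png", "tag": "work"},
    "c.pdf": {"mimetype": "application/pdf", "tag": "work"},
}

png_only = apply_filter(documents, "mimetype", "image/png")
remaining_filters = useful_filters(png_only)
```

After choosing the PNG filter, the mime type filters disappear from the list because they no longer narrow anything down, and only the two tag filters remain.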

The GUI is obviously a prototype and we hope that you will give feedback and ideas to improve its usability. Like Adam, Alessandro will continue working on KDE and Nepomuk and the smart file dialog will evolve until KDE 4.4.

Try It

To test the smart file dialog you need three things:

  1. My kdelibs patch which makes the KFileDialog pluggable. This is actually a very simple one as the file dialog already loads the backend from a separate lib. While you are at it, please review the patch so it can get into KDE 4.4.
  2. The Nepomuk-KDE playground module which also contains the smart save dialog. I recommend installing the whole module as the smart save dialog makes use of pretty much every Nepomuk lib available.
  3. Tell KFileDialog to load the smartfilemodule instead of the default by adding “file module=smartfilemodule” into the “KFileDialog Settings” group of kdeglobals.
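
Going by the group and key named in step 3, the resulting entry in kdeglobals should look roughly like this (where your kdeglobals file lives depends on your installation):

```
[KFileDialog Settings]
file module=smartfilemodule
```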

Obviously Nepomuk needs to be enabled for it to work. Have fun.

GSoC Wrap-up Part 1

This year’s Google Summer of Code has ended. And it was a great success!

This year I had the pleasure to mentor two outstanding students: Adam Kidder and Alessandro Sivieri. Working with them was fun and rewarding. Both quickly understood what Nepomuk was all about and provided high quality work. I am very happy about that. Even more so since both of them plan to continue working on KDE and Nepomuk. Thus, I can only repeat myself: a great success.

Enough of the euphoria. Let us dive into the good stuff and start with Adam’s project:

Improved Virtual Folders

We have had the virtual folder KIO slave in KDE for quite some time now. But it was one big hack I threw together and it always had its hiccups, not to mention the lack of features. Adam took on the project of improving the situation by making it more stable, introducing new features such as negated terms and relative dates, and providing a GUI for query creation. I can assure you that this was no easy task. Diving into the messy code I produced both for the Nepomuk query service and the search KIO slave, Adam needed nerves of steel. But he proved himself by understanding and sorting out the mess and introducing a bunch of nice features.

Relative Dates

One of the nicest things Adam implemented is the support for relative dates in queries. By relative dates I mean for example yesterday as you can see in the following screenshot:

Virtual Folder using a relative date
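
For the curious, resolving a relative date boils down to turning a phrase into a concrete date relative to today. The following Python sketch shows the idea. It is not Adam's implementation, and the set of accepted phrases (today, yesterday, "N days ago", "N weeks ago") is an assumption made for this example.

```python
import re
from datetime import date, timedelta

# Length of each supported unit in days (an assumption of this sketch).
UNIT_DAYS = {"day": 1, "week": 7}


def resolve_relative_date(text, today):
    """Turn a phrase like 'yesterday' or 'a week ago' into a concrete date."""
    text = text.strip().lower()
    if text == "today":
        return today
    if text == "yesterday":
        return today - timedelta(days=1)

    match = re.fullmatch(r"(a|an|\d+) (day|week)s? ago", text)
    if match is None:
        raise ValueError(f"not a relative date: {text!r}")
    count = 1 if match.group(1) in ("a", "an") else int(match.group(1))
    return today - timedelta(days=count * UNIT_DAYS[match.group(2)])


# Passing "today" in explicitly keeps the function easy to test.
today = date(2009, 9, 1)
yesterday = resolve_relative_date("yesterday", today)
last_week = resolve_relative_date("a week ago", today)
```

The resolved date can then be used in the query like any absolute date.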

Another possible relative date is “a week ago” which can of course also be combined with other query terms:

gsoc-virtfolders-last-week

Apart from relative dates Adam implemented

Negated Query Terms

Using a minus sign as the negation prefix we can exclude certain query terms:

Querying for one tag

Excluding another tag

Very useful and mandatory for any search engine.
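
As a toy illustration of the minus prefix, the following Python sketch splits a query into required and excluded terms and checks a set of tags against them. It is not the Nepomuk query parser: treating every whitespace-separated token as a plain tag is a simplification, and both function names are made up.

```python
def parse_query(query):
    """Split a query into (included, excluded) terms using '-' as negation prefix."""
    included, excluded = [], []
    for token in query.split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            included.append(token)
    return included, excluded


def matches(tags, included, excluded):
    """A resource matches if it has all included tags and none of the excluded ones."""
    tags = set(tags)
    return all(term in tags for term in included) and not any(
        term in tags for term in excluded
    )


included, excluded = parse_query("nepomuk -kde")
```

With this query a resource tagged only "nepomuk" matches, while one tagged both "nepomuk" and "kde" is excluded.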

One thing I personally find very important is the possibility to use

Sparql Queries in the KIO slave

This makes it possible to use the KIO slave to list arbitrary query results (as long as they are only resources) in Dolphin, or even to use a KDirModel to list resources in any application.

Listing Nepomuk Tasks via the Search KIO slave

Now let us have a look at the

GUI

Due to the complexity of the code in Adam’s project he did not get as far with the GUI as he would have liked. But as mentioned already he will continue to work on it and integrate it into Dolphin nicely. Anyway, so far we have a small query creator which lets you save queries that are then displayed in the nepomuksearch:/ main folder.

Editing a query in the simple query editor

Try it

If you want to test Adam’s new features before they are merged into trunk you need to install his work branch, which replaces a few files installed by kdebase-runtime. The query editor is still part of the Nepomuk playground module. It is not enabled in the build system of the whole module, so it needs to be built independently.

That’s it for now. Next up: Alessandro’s smart file dialog.

Fixing Bugs is Fun

Yes, sometimes it is. And sometimes it is a good thing that David Faure does not answer your pings because it makes you write test cases. And sometimes these test cases actually reveal the bug you have been hunting for months. And sometimes searching for the bug makes you refactor and simplify code in the process. This is exactly what happened with the annoying “reload bug” of the Nepomuk query KIO slave. It was responsible for results sometimes not showing up before hitting F5 a few times. Well, that is history. The present brings a better design using a QWaitCondition instead of a local event loop (which was ugly anyway and I have no idea what made me use it in the first place), which as a side effect also fixes the bug and simplifies the code. (And I mean “simplify”, not “making it simple”. The code is still far from simple.)
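
For those who have not used wait conditions: the pattern is that one thread blocks until another thread signals that the data is ready, with no event loop involved. The sketch below shows the pattern in Python, where `threading.Condition` plays the role that QWaitCondition plus a mutex plays in Qt. It is not the KIO slave code, and the class and function names are invented for illustration.

```python
import threading


class ResultCollector:
    """Collects results from a worker thread and lets a caller block until all are in."""

    def __init__(self):
        self._condition = threading.Condition()
        self._results = []
        self._finished = False

    def add_result(self, result):
        with self._condition:
            self._results.append(result)

    def finish(self):
        # Mark the listing as complete and wake up every waiting thread.
        with self._condition:
            self._finished = True
            self._condition.notify_all()

    def wait_for_results(self, timeout=None):
        # Block until finish() was called; the predicate guards against
        # spurious wakeups and against finish() having run before we wait.
        with self._condition:
            if not self._condition.wait_for(lambda: self._finished, timeout=timeout):
                raise TimeoutError("query did not finish in time")
            return list(self._results)


def run_query(collector):
    """Stand-in for the query service delivering results asynchronously."""
    for name in ("a.txt", "b.txt"):
        collector.add_result(name)
    collector.finish()


collector = ResultCollector()
worker = threading.Thread(target=run_query, args=(collector,))
worker.start()
results = collector.wait_for_results(timeout=5)
worker.join()
```

Because the waiting side checks a flag under the same lock the signalling side uses, it cannot miss the "finished" signal, whichever thread gets there first.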

That’s already it. Just wanted to share that. More search goodies when I blog about Adam’s GSoC work.
