Just another way of browsing your files

October 26, 2009 15 comments

The Zeitgeist guys created a FUSE file system called zeitgeistfs. It is basically a calendar in which each date contains the files accessed on that date. So at the Akonadi meeting last weekend, having two hours to kill, I thought that should be doable with KIO. Two hours later (most of that time was spent twiddling with UDS entries) the timeline:/ KIO slave was up and running:

The code that actually does something is minimal: a bit of UDS entry creation for dates and a simple SPARQL query to forward to the nepomuksearch KIO slave. Yes, it is as easy as that since we can simply set the UDS_URL property of an item to a nepomuksearch URL and KIO will take care of the rest. Smooth. Thanks a lot David Faure. Once again you paved the way.
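To make the forwarding idea concrete, here is a rough sketch of how a search URL for one day could be put together. It is not the code of the KIO slave (which is C++). The query text, the use of nie:lastModified as the date property, and the function names are assumptions made for illustration, and the prefixes are left unexpanded for brevity. Only the nepomuksearch:/?sparql= URL form is taken from the posts on this page.

```python
from datetime import date, timedelta
from urllib.parse import quote


def timeline_query(day: date) -> str:
    """Build a SPARQL query for files dated `day`.

    Assumption: nie:lastModified is the property that carries the date.
    """
    start = day.isoformat() + "T00:00:00Z"
    end = (day + timedelta(days=1)).isoformat() + "T00:00:00Z"
    return (
        "select distinct ?r where { "
        "?r a nfo:FileDataObject . "
        "?r nie:lastModified ?date . "
        f'FILTER(?date >= "{start}"^^xsd:dateTime && '
        f'?date < "{end}"^^xsd:dateTime) . }}'
    )


def search_url(day: date) -> str:
    """Wrap the query in a nepomuksearch URL, percent-encoding the query text."""
    return "nepomuksearch:/?sparql=" + quote(timeline_query(day), safe="")


print(search_url(date(2009, 10, 26)))
```

A URL like this is what would go into the UDS_URL property of a date entry, so that opening the entry lists the search results.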

OK then. Try it if you like. This is just another example of what can be done with Nepomuk (no real semantics here though). The code is in the playground as always and is based on the current kdebase trunk. So with KDE 4.3 this baby won’t work. And of course as this is based on file meta data, the Nepomuk Strigi integration has to be enabled.

SPARQL is weird…

October 26, 2009 9 comments

It really is. The following query is the only way I found to exclude folders when looking for files:

select ?r where {
   ?r a nfo:FileDataObject .
   OPTIONAL { ?r2 a nfo:Folder . FILTER(?r = ?r2) . } .
   FILTER( !BOUND(?r2) ) .
}

Now to me this looks really weird. And maybe I am simply not seeing the wood for all the trees…
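For readers puzzled by the pattern: OPTIONAL combined with FILTER(!BOUND(...)) is the usual way to express "does not match" in this version of SPARQL. The toy emulation below walks through the same three steps over a handful of made-up type statements. The data and the function name are invented for illustration, and it is assumed that folders are typed as both nfo:FileDataObject and nfo:Folder.

```python
# Toy data: (resource, type) pairs. The folder carries both types here,
# which is assumed to be why folders need to be excluded explicitly.
TYPES = {
    ("file:///home/a.txt", "nfo:FileDataObject"),
    ("file:///home/docs", "nfo:FileDataObject"),
    ("file:///home/docs", "nfo:Folder"),
}


def files_without_folders(types):
    results = []
    for r, t in sorted(types):
        # ?r a nfo:FileDataObject .
        if t != "nfo:FileDataObject":
            continue
        # OPTIONAL { ?r2 a nfo:Folder . FILTER(?r = ?r2) . }
        # ?r2 only gets a value if the same resource is also a folder.
        r2 = r if (r, "nfo:Folder") in types else None
        # FILTER( !BOUND(?r2) ) keeps the rows where that did not happen.
        if r2 is None:
            results.append(r)
    return results


print(files_without_folders(TYPES))
# ['file:///home/a.txt']
```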

Virtuoso – Once More With Feeling

October 22, 2009 16 comments

The Virtuoso backend for Soprano and, thus, Nepomuk can be seen as rather stable now. So now the big tests can begin as the goal is to make it the standard in KDE 4.4. Let me summarize the important information again:

Step 1

Get Virtuoso 5.0.12 from the SourceForge download page. Virtuoso 6 is NOT supported (not yet, anyway).

Step 2

Hints for packagers: Soprano only needs two files: the virtuoso-t binary and the virtodbc_r(.so) ODBC driver. Everything else is optional. (For the self-compiling folks out there: --disable-all-vads is your friend.)

Step 3

Install libiodbc, which is what the Soprano build will look for. (Virtuoso itself is simply a run-time dependency.)

Step 4

Rebuild Soprano from current svn trunk (Remember: Redland is still mandatory. Its memory storage is used all over Nepomuk!)

Step 5

Edit ${KDEHOME}/share/config/nepomukserverrc with your favorite editor. In the “[Basic Settings]” section add “Soprano Backend=virtuosobackend”. Do not touch the main repository settings!
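For those who would rather script the change than open an editor, the following sketch sets one key inside one group of a KDE-style config text and leaves every other line alone. The helper name set_config_key is hypothetical and the sample file contents are made up. Only the group and key names from this step are real.

```python
def set_config_key(text, group, key, value):
    """Return `text` with `key=value` set inside `[group]`.

    Works on plain text so that unrelated groups and keys stay untouched.
    """
    header = f"[{group}]"
    entry = f"{key}={value}"
    out = []
    in_group = False
    done = False

    def append_entry():
        # Place the entry before any blank lines that end the group.
        idx = len(out)
        while idx > 0 and out[idx - 1].strip() == "":
            idx -= 1
        out.insert(idx, entry)

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Leaving the target group without having seen the key: add it.
            if in_group and not done:
                append_entry()
                done = True
            in_group = stripped == header
        elif in_group and stripped.split("=", 1)[0].strip() == key:
            # The key already exists in the target group: overwrite it.
            line = entry
            done = True
        out.append(line)

    if not done:
        if not in_group:
            out.append(header)
        append_entry()
    return "\n".join(out) + "\n"


# Made-up sample content, only for demonstration.
sample = "[Basic Settings]\nStart Nepomuk=true\n\n[main Settings]\nStorage Dir=x\n"
print(set_config_key(sample, "Basic Settings", "Soprano Backend", "virtuosobackend"))
```

The result would still have to be written back to nepomukserverrc while Nepomuk is not running.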

Step 6

Restart Nepomuk. I propose the following procedure to gather debugging information in case something goes wrong:
Shut down Nepomuk completely:

 # qdbus org.kde.NepomukServer /nepomukserver org.kde.NepomukServer.quit

Restart it, redirecting its error output into a temporary file (bash syntax):

 # nepomukserver 2> /tmp/nepomuk.stderr

Step 7

Wait for Nepomuk to convert your data. If you are running KDE trunk you even get a nice progress bar in the notification area (BTW: does anyone know why it won’t show the title?)

And Now?

That is already it. Now you can enjoy the new Virtuoso backend.

The development has taken a long time. But I want to thank OpenLink and especially Patrick van Kleef who helped a lot by fixing the last little tidbits in Virtuoso 5 for my unit tests to pass. Next step is Virtuoso 6.

And Yet Another Post About Virtuoso

October 14, 2009 6 comments

Today nearly all problems are solved. OpenLink provided a patch that makes inserting very large literals (more than 1 megabyte in size) lightning fast, even with a very low buffer count. Also I worked around the issue of URI encoding. Now the Soprano Virtuoso backend simply percent-encodes all non-unreserved characters, and all reserved characters that are not used in their special meaning, in URIs used in queries. Man, that is a mouthful. Well, it seems to work fine although I can always use more testing with weird file URLs (weird meaning that they contain characters like brackets and the like). I also fixed some error handling bugs.
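The encoding rule is easier to see on an example. The sketch below is not the backend's code (Soprano is C++). It only illustrates the rule with Python's standard urllib, and the function name encode_file_uri is made up. The "/" keeps its special meaning as path separator, and everything else outside the unreserved set is percent-encoded.

```python
from urllib.parse import quote


def encode_file_uri(path: str) -> str:
    """Percent-encode a local path for use as a file:// URI in a query.

    "/" is left alone because it is used in its special meaning; all other
    characters outside the unreserved set (letters, digits, "-._~") are
    percent-encoded.
    """
    return "file://" + quote(path, safe="/")


print(encode_file_uri("/home/user/weird [file] (1).txt"))
# file:///home/user/weird%20%5Bfile%5D%20%281%29.txt
```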

So what is left? Well, there are a few hacks in the Virtuoso backend which are rather ugly. One example is the detection of query result types. To determine if the result is boolean, bindings, or a graph it actually checks the name and number of result columns. Urgh! It would be nicer to check for the type of the result. Seems like graph results are BLOBs.

Anyway, enough for tonight. I am tired. Here is the patch to make Virtuoso not hang when Strigi adds nie:PlainTextContent literals of big files:

Index: sqlrcomp.c
===================================================================
RCS file: virtuoso-opensource/libsrc/Wi/sqlrcomp.c,v
retrieving revision 1.9
diff -u -r1.9 sqlrcomp.c
--- sqlrcomp.c  20 Aug 2009 17:47:22 -0000      1.9
+++ sqlrcomp.c  13 Oct 2009 16:11:49 -0000
@@ -65,7 +65,7 @@
 {
 va_list list;
 char temp[2000];
-  int ret;
+  int ret, rest_sz, copybytes;
 va_start (list, string);
 ret = vsnprintf (temp, sizeof (temp), string, list);
 #ifndef NDEBUG
@@ -75,11 +75,16 @@
 va_end (list);
 #ifndef NDEBUG
 if (*fill + strlen (temp) > len - 1)
-    GPF_T1 ("overflow in strncpy");
+    GPF_T1 ("overflow in memcpy");
 #endif
-  strncpy (&text[*fill], temp, len - *fill - 1);
+  rest_sz = (len - fill[0]);
+  if (ret >= rest_sz)
+    copybytes = ((rest_sz > 0) ? rest_sz : 0);
+  else
+    copybytes = ret+1;
+  memcpy (text+fill[0], temp, copybytes);
 text[len - 1] = 0;
-  *fill += (int) strlen (temp);
+  fill[0] += ret;
 }

Virtuoso – for real!

October 9, 2009 24 comments

[Screenshot: Nepomuk using the Virtuoso backend]

Soprano 2.3.63 – that is the magic version number you need to look out for.

And then once you have updated your kdebase copy to the latest trunk you run your favorite text editor on ~/.kde/share/config/nepomukserverrc. In there you set Soprano Backend=virtuosobackend in the [Basic Settings] section. After that you simply restart Nepomuk as described in the corresponding howto. You can also log out and log back in again but then you won’t be able to provide bug reports that are as nice.

Once done Nepomuk will convert your database. This can take a loooong time if Strigi is enabled. But it will finish. :)

BTW: You need a recent snapshot of Virtuoso 5.0.12 for this to work.

Nepomuk Development – You Should Get Into It

October 8, 2009 11 comments

So far I have had trouble getting people on board Nepomuk development. I have been told that it has to do with my lack of communicating the problems and the TODOs. So now I am trying to change that. The Nepomuk project page is not new but for some reason I have failed to blog about it properly yet. Well, here goes: Check out the Nepomuk project page with its TODO and ideas list to give you an idea of how to start with Nepomuk development.

And once again the overstated version:

Get into Semantic Desktop Development today and help shape the future of the desktop in general!

Thank you.

Virtuoso

October 7, 2009 22 comments

We are nearly there:

[Screenshot: notification showing the conversion from sesame2 to Virtuoso]

Aggregating Nepomuk

October 5, 2009 9 comments

Recently there have been some posts on Nepomuk in KDE. Tobias König blogged about how to Pimp my Nepomuk. He explains that for many users Redland is still the default backend and how to change that. He gives the most important pointers on how to enable the Java-based sesame2 Soprano backend. Thomas McGuire gives a very good introduction to what Soprano, Nepomuk, Strigi, and Akonadi are and how they relate. This was a much needed post. Thank you for that, Thomas! And finally mat69 gives his ideas on how to improve the desktop search experience with Nepomuk. He has some good ideas that should really be implemented.

Can somebody please tell me how to get 40 hours out of the work-day? That would really help! ;)

A Bit Of Nepomuk Goodness On Your Developer Fingertips

September 22, 2009 Leave a comment

And yet another technical blog entry. This time it concerns the latest improvements in sopranocmd (Soprano >= 2.3.61). For starters there is an improved NRLModel which provides automatic query prefix expansion. What does that mean? Well, it means that debugging Nepomuk data is simpler now as you can simply use all ontologies stored in the Nepomuk database without defining their prefixes. With sopranocmd (I am using nepomukcmd, a little alias I introduce in the Nepomuk Tips and Tricks) this feature is enabled via the --nrl parameter. Thus, querying all tags becomes:

nepomukcmd --nrl query "select ?r where { ?r a nao:Tag . }"
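As an illustration of what prefix expansion amounts to, here is a small sketch that prepends PREFIX declarations for the prefixes a query uses. This is one plausible way to do it, not necessarily what NRLModel does internally. The prefix table is hard-coded here, whereas the real thing takes the prefixes from the ontologies stored in the database, and the function name is made up.

```python
import re

# Hard-coded for the sketch (an assumption); NRLModel reads the prefixes
# from the ontologies stored in the Nepomuk database instead.
PREFIXES = {
    "nao": "http://www.semanticdesktop.org/ontologies/2007/08/15/nao#",
    "nfo": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#",
}


def expand_prefixes(query: str, prefixes=PREFIXES) -> str:
    """Prepend a PREFIX declaration for every known prefix used in `query`."""
    found = re.findall(r"\b([A-Za-z]\w*):\w", query)
    used = sorted({name for name in found if name in prefixes})
    declarations = [f"PREFIX {name}: <{prefixes[name]}>" for name in used]
    return "\n".join(declarations + [query])


print(expand_prefixes("select ?r where { ?r a nao:Tag . }"))
```

Without such an expansion the query above would be rejected because the nao prefix is undefined.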

The second new thing is an improved import command. Again enabled with the --nrl parameter it creates a new named graph of type nrl:KnowledgeBase and puts all new statements (which are not in a graph yet) into it. As described in Nepomuk Data Layout it also adds a metadata graph and the creation date. It actually makes use of NRLModel::createGraph.

The reason I did this was to be able to migrate the tmo:Task instances I had created on the laptop to my desktop machine. Just as an example I will show the procedure here:

First I export all the tasks and their properties on the laptop:

nepomukcmd --nrl export "describe ?r where { ?r a tmo:Task . }" \
     /tmp/task-dump.n4

And then on the desktop I simply import them into Nepomuk:

nepomukcmd --nrl import /tmp/task-dump.n4

Update: You can also use query prefixes for statement listing now. Thus the following is now possible:

nepomukcmd --nrl list "" a tmo:Task

(Even the “a” keyword now maps to rdf:type.)

GSoC Wrap-up Part 2

August 28, 2009 53 comments

Last time I presented the work Adam Kidder did on Nepomuk virtual folders in the GSoC. Today the story continues with the work by Alessandro Sivieri, my second GSoC student.

Whenever we handle files on the computer we need to bother with folder structures and file names. We need to come up with good naming schemes which allow us to find our files. We need to decide several times a day in which folder a file should go – should it go into folder A or B or should I create a subfolder? In the end there is always a little bit of chaos, even with the most structured minds. Alessandro tried a different approach in his project: save and load documents based on meta data and annotations rather than file and folder names.

This is not an easy task but I dare say that he succeeded. Alessandro created two new dialogs for saving and loading documents (we do not talk about files anymore, way too technical). The saving dialog lets you create arbitrary annotations for the document using the Nepomuk annotation plugin system which also brings in Scribo text analysis features. The loading dialog on the other hand uses a fancy filter system to narrow down the list of documents to open.

Saving Documents

We start by looking at the document saving dialog. Our example is KWord from which we want to save a fancy little text document. (No, it is not a test document, I really wrote this, this is real data, I assure you! … Yeah, OK, I admit it, just random words…) Hitting the save button opens up the new smart save dialog as can be seen in the screenshot below.

Smart Saving of a KWord document


The first thing we notice is that there is no filename and no folder selection. Name and folder are selected by Nepomuk. However, we get to give the document a name (it makes things much easier for us later on) and a description (in a future version applications will be able to prefill these fields with some meaningful defaults). But the interesting part is the meta data. The dialog suggests certain possible annotations which we can approve. Below the recently used annotations we have the possibility to add any annotation we want through the existing Nepomuk annotation system. Last but not least we can give the document a type. This type does not identify the document on a mime-type level but in a much more real-life oriented way. The idea is that users either define their own types based on pimo:Document or use ontologies that provide them. Typical examples include invoices or letters or project descriptions. This way documents are saved on a much higher abstraction level than with the classical file chooser: instead of a text file we save an invoice.

Once we have specified the meta data we want to apply to the new document and hit the save button, the smart save dialog generates a folder and file name and saves the document. We do not need to care about the location.

(Hint: there are certainly situations in which we want to use the classic file chooser. That is why the smart save dialog lets us switch over to the old ways with the simple click of a button.)

Loading Documents

But if documents are saved in some random folder which we do not know, how do we find them again? Well, that is the real beauty of the new approach. The idea is that you tell the open dialog what you want to open by specifying some details that you remember.

Let us have a look at the smart open dialog as it opens from within Okular.

[Screenshot: the smart open dialog in Okular]

We see two main views: on the left hand side we see a list of filters and on the right hand side we see a long list of files/documents. This might look overwhelming in the beginning but wait until we specify the first detail about the document we want to open: we tell the dialog that the document has mime type image/png (yes, in the future this will look less technical) and the file view changes to show only PNG images.

[Screenshot: the smart open dialog filtered by mime type image/png]

These are still way too many to search for the one we need, so we give more detail. We remember that we accessed the document sometime this week:

[Screenshot: the smart open dialog additionally filtered by access date]

Again the list of files is changed and now after only choosing two filters we are down to seven documents to choose from. Although this would be enough we do one better just to show that the filter system obviously also includes manual annotations such as tags:

[Screenshot: the smart open dialog additionally filtered by tag]

And after activating the tag filter we are down to a single document. Nice, isn’t it?

A Few Technical Details

There are a few technical aspects worth mentioning about Alessandro’s work.

First of all: he makes direct use of Adam’s work on the virtual folders. The file list on the right is a simple KDirModel listing a nepomuksearch:/?sparql=… query. I find this very nice as my two students shared knowledge and discussed their work to find good solutions to their problems.

The second thing I find important is the creation of the filter list. The list of filters is created dynamically based on the existing annotations of the files in the current selection. In essence the idea is to only show filters that would actually change the list of available files (as you can see in the last screenshot this does not work 100% yet but we are close).
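The rule "only show filters that would actually change the list" can be sketched in a few lines. This is not Alessandro's implementation. The function name, the way annotations are written as strings, and the sample documents are all invented for illustration. A filter counts as useful when it matches at least one document of the current selection but not all of them.

```python
def useful_filters(documents):
    """Return the annotations that would narrow down `documents`.

    `documents` maps a document name to its set of annotations. A filter is
    useful if it matches at least one document but not all of them.
    """
    total = len(documents)
    counts = {}
    for annotations in documents.values():
        for annotation in annotations:
            counts[annotation] = counts.get(annotation, 0) + 1
    return sorted(a for a, n in counts.items() if 0 < n < total)


# Made-up selection: every document has the tag, so the tag is no useful filter.
selection = {
    "invoice.pdf": {"type:application/pdf", "tag:work"},
    "logo.png": {"type:image/png", "tag:work"},
    "holiday.png": {"type:image/png", "tag:work"},
}
print(useful_filters(selection))
# ['type:application/pdf', 'type:image/png']
```

After a filter has been applied, the same function would be run again on the smaller selection, which is why the filter list shrinks along with the file list.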

The GUI is obviously a prototype and we hope that you will give feedback and ideas to improve its usability. Like Adam, Alessandro will continue working on KDE and Nepomuk and the smart file dialog will evolve until KDE 4.4.

Try It

To test the smart file dialog you need three things:

  1. My kdelibs patch which makes the KFileDialog pluggable. This is actually a very simple one as the file dialog already loads the backend from a separate lib. While you are at it, please review the patch so it can get into KDE 4.4.
  2. The Nepomuk-KDE playground module which also contains the smart save dialog. I recommend installing the whole module as the smart save dialog makes use of pretty much every Nepomuk lib available.
  3. Tell KFileDialog to load the smartfilemodule instead of the default by adding “file module=smartfilemodule” into the “KFileDialog Settings” group of kdeglobals.

Obviously Nepomuk needs to be enabled for it to work. Have fun.