Something Way Less Dry: TV Shows

After my rather boring blog post about change notifications I will now write about something that I have wanted ever since I started developing Nepomuk. Only now has Nepomuk reached a point where it provides all the necessary pieces. I am talking about TV Show management – obviously I mean the rips from the DVD boxes I own.

So what about it? Well, I wrote a little tool called nepomuktvnamer (inspired by the great Python tool tvnamer) which works a bit like our nepomukindexer except that it does not extract meta-data from the file but tries to fetch information about TV Shows from thetvdb.com. You can run the tool on a single file or recursively on a whole directory. It will then use a set of regular expressions (based on the ones from tvnamer) to analyze the file names and extract the show title, season and episode numbers.
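To give an idea of what this kind of file name matching looks like, here is a minimal sketch in standard C++. The two patterns and the names EpisodeInfo and parseEpisodeFileName are made up for illustration; they are not the expressions that tvnamer or nepomuktvnamer actually use, which cover many more naming schemes.

```cpp
#include <regex>
#include <string>

// Hypothetical result type, not part of nepomuktvnamer.
struct EpisodeInfo {
    std::string show;
    int season;
    int episode;
};

// Tries two common naming schemes, "Show.Name.S01E02.avi" and
// "Show Name - 1x02.avi". Returns false if neither matches.
bool parseEpisodeFileName(const std::string& fileName, EpisodeInfo& info)
{
    static const std::regex patterns[] = {
        std::regex(R"(^(.+?)[ ._-]+[Ss](\d{1,2})[Ee](\d{1,2}).*$)"),
        std::regex(R"(^(.+?)[ ._-]+(\d{1,2})[xX](\d{1,2}).*$)"),
    };
    for (const std::regex& pattern : patterns) {
        std::smatch match;
        if (std::regex_match(fileName, match, pattern)) {
            std::string show = match[1].str();
            // Dots and underscores in file names usually stand in for spaces.
            for (char& c : show) {
                if (c == '.' || c == '_') {
                    c = ' ';
                }
            }
            info = EpisodeInfo{show, std::stoi(match[2].str()), std::stoi(match[3].str())};
            return true;
        }
    }
    return false;
}
```

The title extracted this way is only a search term; the actual show still has to be looked up at thetvdb.com, which is where ambiguous matches can come from.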

nepomuktvnamer will ask the user if multiple matches have been found and cannot be filtered according to season and episode numbers.

It will then save that information into Nepomuk through our powerful Data Management API. The code looks roughly as follows, leaving out the parts that store actors, banners, and the like.

const Tvdb::Series series = getSeriesForName(name);
Nepomuk::NMM::TVSeries seriesRes;
seriesRes.setTitle(series.name());
seriesRes.addDescription(series.overview());

Nepomuk::NMM::TVShow episodeRes(url);
episodeRes.setEpisodeNumber(episode);
episodeRes.setSeason(season);
episodeRes.setTitle(series[season][episode].name());
episodeRes.setSynopsis(series[season][episode].overview());
episodeRes.setReleaseDate(QDateTime(series[season][episode].firstAired(), QTime(), Qt::UTC));
episodeRes.setGenres(series.genres());

seriesRes.addEpisode(episodeRes.uri());
episodeRes.setSeries(seriesRes.uri());

Nepomuk::SimpleResourceGraph graph;
graph << episodeRes << seriesRes;
Nepomuk::storeResources(graph, Nepomuk::IdentifyNew, Nepomuk::OverwriteProperties);

(This code uses my very own LibTvdb which is essentially a Qt’ish wrapper around the thetvdb.com API.)

The result of this can be seen in Dolphin:

Here we see the actors, the series, the synopsis and so on. Clicking on an actor will bring up all they played in, clicking on the series will bring up all the episodes from that series, and so on.

Now let us have a look at the series itself using my beefed up version of the Nepomuk KIO slave:

As we can see, nepomuktvnamer also fetched a banner, which is stored as nie:depiction. (This is one reason why you need the git master version of shared-desktop-ontologies to compile nepomuktvnamer. Oh, and nepomuktvnamer is also linked against libnepomukcore from nepomuk-core instead of libnepomuk. So you either have to install nepomuk-core, which can be a bit tricky, or quickly change the CMakeLists.txt to link to libnepomuk instead.)

We can of course also query the newly created information. Simple queries in Dolphin could be “series:Sherlock” or “sherlock season=1”. Well, things to play with.

I also created the smallest Nepomuk service to date: the nepomuktvnamerservice uses the ResourceWatcher to listen for newly created nfo:Video resources and simply calls the nepomuktvnamer on the related file.

Last but not least the git repository contains a python script which checks for each existing series if a new episode has been aired. The output looks a bit like this:

White Collar - New episode "Withdrawal" (02x01) first aired 13 July 2010.
Freaks and Geeks - No new episode found.
The Mentalist - Upcoming episode "Red is the New Black" (04x13) will air 02 February 2012.
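
The check behind that output boils down to comparing the last episode already known for a series with the episode listing from thetvdb.com. The following is only a sketch of that idea in standard C++, not a port of the script: the Episode record, the function names, and the use of ISO date strings are all assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified episode record.
struct Episode {
    int season;
    int number;
    std::string title;
    std::string firstAired; // ISO date "YYYY-MM-DD", so string comparison orders dates
};

enum class EpisodeStatus { NoNewEpisode, NewEpisode, UpcomingEpisode };

// True if a comes before b in broadcast order.
bool isEarlier(const Episode& a, const Episode& b)
{
    return a.season != b.season ? a.season < b.season : a.number < b.number;
}

// Finds the first episode in listing that follows lastKnown and classifies it
// by its air date. On success the episode is written to next.
EpisodeStatus checkForNewEpisode(const std::vector<Episode>& listing,
                                 const Episode& lastKnown,
                                 const std::string& today,
                                 Episode& next)
{
    bool found = false;
    for (const Episode& candidate : listing) {
        if (isEarlier(lastKnown, candidate) && (!found || isEarlier(candidate, next))) {
            next = candidate;
            found = true;
        }
    }
    if (!found) {
        return EpisodeStatus::NoNewEpisode;
    }
    return next.firstAired <= today ? EpisodeStatus::NewEpisode
                                    : EpisodeStatus::UpcomingEpisode;
}
```

The three status values correspond to the three kinds of lines in the output above.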

Now obviously this is more a task for a Plasma applet. So if anyone out there is interested in doing that – please go ahead. I think it could be a cool thing. One basically only has to update whenever a new nmm:TVShow is created or when the new day dawns.

And the cherry on top is of course Bangarang:

Nepomuk – What Comes Next

After a very generous start to my fundraiser (thank you so much for your support) it is time I get into more detail about what you are actually supporting. Originally I wanted to do that by updating nepomuk.kde.org. I will still do that but it will take a little more time than anticipated. Thus, I will simply start with another blog post.

Well then, apart from cleaning out the bug database at bugs.kde.org (this will be a hard one), continuing to support app developers with Nepomuk integration, and maintaining the whole Nepomuk stack, Soprano, the Shared-desktop-ontologies, and some smaller Nepomuk-based applications, there are some very specific tasks I want to work on in the near future (in this case the near future roughly spans the next half year).

Semantic Saving and Loading of Documents

Pretty much forever we have managed documents in a very nerdy manner: the way they are stored on the local file system. We navigate physical folders, create complex hierarchies, get lost in them, recreate parts of them, never find our files again, and still keep on doing it.

The vision I have is that we do not think about folders at all any more since for me they are a restriction of the 3-dimensional world that has no place in a computer. A document in the real world can only be archived in a single folder. On the computer there is no such restriction. We can do whatever we want. Thus, the idea is to organize documents closer to the way our brain organizes information: based on context and familiar topics and relations.

This vision, however, is not feasible in the near future. There is simply too much legacy data and too many applications relying on the classical folder structure. Thus, the idea would be a hybrid approach which combines classical local folders with advanced semantic relations and meta-data. This is a development which I already started with fantastic input from the community.

The next steps include finishing the prototype and creating its counterpart, the file open dialog. This will be a very tough one for which I will ask for your support again since that worked out so well with the save dialog.

Excerpts

A typical use case is bookmarking pages or copying specific parts of a document into some collage of snippets. However, as always we lose the relation to the source. This is where Nepomuk will shine: instead of copying the part of the document we simply define the excerpt, the portion the user is interested in, as a resource in Nepomuk which we can annotate like any other resource. An excerpt can be a marked section, a range from a specific position in the document up to its end, or part of an image. This means that we can relate it to topics, people, projects, files, other snippets, and web pages, comment on it, and so on, all the while keeping the relation to the original document.

This allows for nice things like automatic collages (think of selecting all snippets which mention a certain topic or relate to a certain project and were created before some date and merging them all into one view), simpler quoting of things you read before (since the relation to the original document is intact you have easy access to the details required for the quote – very interesting for academic workers), and a simple listing of all interesting quotes from documents by some person you like (an example query).
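
To make the collage idea a bit more concrete, here is a small sketch in standard C++. None of this exists yet: the Excerpt structure, its fields, and the collage function are purely hypothetical, and in Nepomuk the selection would be a query against the database rather than a loop over a vector.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of an excerpt: it never copies content, it only points
// into the original document and carries its own annotations.
struct Excerpt {
    std::string source;           // URI of the original document
    std::size_t begin;            // start of the marked range (assumed character offsets)
    std::size_t end;              // end of the marked range
    std::set<std::string> topics; // annotations, here reduced to topic names
    std::string created;          // ISO date "YYYY-MM-DD"
};

// Selects all excerpts that mention a topic and were created before a date.
std::vector<Excerpt> collage(const std::vector<Excerpt>& all,
                             const std::string& topic,
                             const std::string& before)
{
    std::vector<Excerpt> result;
    for (const Excerpt& excerpt : all) {
        if (excerpt.topics.count(topic) > 0 && excerpt.created < before) {
            result.push_back(excerpt);
        }
    }
    return result;
}
```

Since every selected excerpt still carries its source, the view built from the result can always link back to the original documents.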

Sharing Nepomuk Data – Step 1

Whenever we create information we want to share it with others. Vishesh Handa already started a very ambitious project to support several types of data sharing through a plugin system. What I want to do first is much less but nonetheless interesting: sharing bits of Nepomuk data manually.

This means that you define the information you want to share and then simply export it into a file which you can then send to someone else. They in turn can import this information into their own Nepomuk system. For starters there will be no tracking of the origin of the data or anything like keeping two ratings at the same time. That is for later.

This is a very simple first step to sharing which should be fairly easy to implement, the GUI being the only actually hard part. The Data Management Service already takes care of export and import for us.

Once this works, adding the same to email sending or Telepathy communications will be very simple. In fact the Telepathy-KDE guys (namely Daniele E. Domenichelli aka Dr. Danz) have been interested in that for a long time. (I wish I were with you guys at Cambridge now!)

To this end I will probably finally get to work on Ginkgo, the generic Nepomuk resource management tool developed by Mandriva’s Stephane Lauriere.

For App Developers: Resource Watcher

For the longest time the only way of getting notified of changes in the Nepomuk database was through the very generic Soprano signals Model::statementsAdded and Model::statementsRemoved. Checking for specific changes meant checking each statement which was added or removed, or doing a pull each time one of those signals was emitted. An ugly and not very fast solution.

With the introduction of the Data Management Service this will finally change. We already have a draft API for the Nepomuk::ResourceWatcher which allows clients to opt in to change notifications of different kinds: changes on specific resources, new resources of specific types, changes to specific properties.

The initial API is there and partially integrated with the Data Management Service already. However, I would like to add some more nice features like only watching for non-indexed data or excluding changes made by a specific application (useful for an app which makes changes itself and does not want to bother with that data). Also, integration into the DMS needs to be finished as not all features exposed in the API are supported yet.
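
The filtering idea can be sketched in a few lines of standard C++. To be clear, this is not the Nepomuk::ResourceWatcher API: the Change record, the SketchWatcher class, and all of its members are invented here to illustrate opting in to certain kinds of changes and excluding those made by one application.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical description of a single change in the database.
struct Change {
    std::string resource;    // URI of the changed resource
    std::string type;        // type of that resource, e.g. "nfo:Video"
    std::string property;    // changed property, e.g. "nie:title"
    std::string application; // application that made the change
};

// Hypothetical watcher: an empty filter set means "match anything".
class SketchWatcher {
public:
    std::set<std::string> resources;
    std::set<std::string> types;
    std::set<std::string> properties;
    std::set<std::string> ignoredApplications;
    std::function<void(const Change&)> callback;

    bool matches(const Change& change) const
    {
        if (ignoredApplications.count(change.application) > 0) {
            return false;
        }
        return accepts(resources, change.resource)
            && accepts(types, change.type)
            && accepts(properties, change.property);
    }

    // Runs the callback if the change passes all filters.
    // Returns whether the callback was invoked.
    bool notify(const Change& change) const
    {
        if (!matches(change) || !callback) {
            return false;
        }
        callback(change);
        return true;
    }

private:
    static bool accepts(const std::set<std::string>& filter, const std::string& value)
    {
        return filter.empty() || filter.count(value) > 0;
    }
};
```

Compared to reacting to every added or removed statement, the client only ever sees the changes it asked for.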

The technical aspect: KDE frameworks

With KDE 5.0 kdelibs and kde-runtime will be split into smaller parts to make it simpler for application developers to depend on our powerful technologies. This also means a split for Nepomuk. I already started this split but a lot more work needs to be done to make Nepomuk an independent part in the KDE frameworks family.

Part of this also involves getting rid of deprecated legacy API and improving API where we were previously restricted by binary-compatibility issues.

So this is it for now. Reading over it again I get the feeling that it might be too much already – especially since I am fairly certain that new things will pop up all over the place. Nonetheless I will try to stay the course for once. ;)

Thanks again for your support.

Randa and Ontologies and whatnot…

So I am back from Randa, the small town in the Alps where roughly 60 KDE hackers (and 2 Gnome people) met to achieve great things. Before I start raving on about Nepomuk and ontologies and all that jazz let me thank Mario Fux and his family for the great organization and the fabulous food.

Now that the pink fluffy bunny part of the blog post is done I can get to the real thing: Nepomuk. The guys (Daniele E. Domenichelli from Telepathy-KDE, GSoC students Smit Shah and Martin Klapetek, Ivan Cukic, and Seif Lotfy and Trever Fischer from Zeitgeist) and I met in Africa (for the unaware: the rooms in the house in Randa are named after continents). I gave a few short tutorials on Nepomuk, its design, the basics like RDF and how it is used in Nepomuk, and the new Data Management Service (which I still have not blogged about – but it will come, give it time). Then I answered questions and helped with the individual projects.

Apart from the discussions on KDEPIM and Telepathy integration two topics stand out: 1. The new and improved Documentation for the Shared-Desktop-Ontologies and 2. The integration between the KDE Activity Manager, Nepomuk, and Zeitgeist.

New and Improved SDO Documentation

One thing I wanted to do while in Randa was improving the documentation for Nepomuk. Well, we did not really look much into Nepomuk’s own documentation but Daniele E. Domenichelli (Telepathy-KDE hacker and former GSoC student) and myself finished what I started a while back. We finished the script which extracts all ontology entities from the sources and generates nice docbook references. We split the manually created docbook documentation into subfiles which are combined into one docbook chapter per ontology. Then the chapters are merged into one big file which is transformed into html via xslt. All is streamlined through the cmake build system’s “docs” target. Thus, everyone can easily create their own ontology documentation at home. If you do not want to do that but want to check it out anyway without waiting for an update on semanticdesktop.org just use the oscaf project’s page which now contains this documentation.

Integration Between KDE Activity Manager, Nepomuk and Zeitgeist

Zeitgeist is an interesting project which in its topic is closely related to Nepomuk: Zeitgeist tracks events on the desktop and stores them in a database which can then be queried by clients. This makes it possible, for example, to see which files you touched in the last week and to track the history of a single file (here “history” refers to the modification events, not the content). Lately Zeitgeist developers have moved closer towards KDE and shown a lot of interest in working with us. At Randa we finally came up with a good plan to do so.

The problem is that KDE already has a database to store all kinds of information including events: Nepomuk. And KDE has the KDE activity manager whose API is supposed to be used by KDE applications to inform about opened/modified/closed files. Thus, we had three things to bring together: Zeitgeist and its application plugins which inform Zeitgeist about events, the KDE activity manager which does the same thing for KDE applications, and Nepomuk as the one semantic database in KDE. Thus, Seif Lotfy (Zeitgeist founder), Trever Fischer (GSoC student doing a Zeitgeist GUI for KDE), Ivan Cukic (Author of the KDE activity manager), and myself discussed the different issues heatedly and actually came to a conclusion.

The basic design will be like this: KDE applications will use the KDE Activity Manager Daemon (KAMD) API to inform about opened, modified, and closed files. The KAMD will then send these events to Zeitgeist which will apply all its magic (blacklisting, geo location attachment, and so on). Finally Zeitgeist will store the events in Nepomuk. If Zeitgeist is not available KAMD will push the events to Nepomuk itself.

Integration between KAMD, Zeitgeist, and Nepomuk

This way we benefit from Zeitgeist’s additional event processing and its integration into applications like OpenOffice or even vim.
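
The routing with its fallback can be sketched in standard C++ as follows. All types here are stand-ins invented for illustration (Store for Nepomuk, Processor for Zeitgeist, reportEvent for what KAMD does); the real components talk to each other over their own APIs, and blacklisting is only one of the things Zeitgeist does to an event.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical event as reported by an application.
struct FileEvent {
    std::string kind; // "opened", "modified" or "closed"
    std::string url;
};

// Stand-in for Nepomuk: simply collects the events it is given.
struct Store {
    std::vector<FileEvent> events;
};

// Stand-in for Zeitgeist: drops blacklisted URLs, then stores the event.
struct Processor {
    std::set<std::string> blacklist;

    void handle(const FileEvent& event, Store& store) const
    {
        if (blacklist.count(event.url) == 0) {
            store.events.push_back(event);
        }
    }
};

// Stand-in for KAMD: hands the event to the processor if one is available,
// otherwise pushes it to the store directly.
void reportEvent(const FileEvent& event, const Processor* processor, Store& store)
{
    if (processor != nullptr) {
        processor->handle(event, store);
    } else {
        store.events.push_back(event);
    }
}
```

Note what the fallback costs: without the processor the events still arrive, but nothing is filtered or enriched on the way.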

Update: Ivan told me that he has a much nicer diagram. And in fact he has. His looks like a mutated bunny with one Gnome and one KDE ear:

Ivan is the master of simple diagrams (mine are always too complex)

We also had to merge ontologies. Zeitgeist has its own event ontology which is built upon the Nepomuk ontologies. We tried to merge as much of it into SDO as made sense. The result can be seen as a separate branch in the SDO git repository. We decided to store usage events as follows:

Usage Events in Nepomuk

Taking the example of a file being opened, NUAO models one main event which stretches from the time the file is opened to the time it is closed. This is described via nuao:UsageEvent and the properties nuao:start and nuao:end. During the time span of this event the file can be in the focus of the user or not. This is modelled via nuao:FocusEvent instances that describe when a file was in focus.

Since the action of modifying a resource is a very foggy concept and hard to fit into an event with start and end time it was decided to instead only store a timestamp for each modification of the resource. This is described via nie:modified and nie:contentModified. For file modifications one would typically use nie:modified.

Storing all focus events in Nepomuk will most likely result in an ugly amount of data which does not provide any really useful information besides the total time a resource was in focus. Thus, we introduced nuao:totalFocusDuration which allows Zeitgeist to compress all focus events and attach this resulting duration to the enclosing nuao:UsageEvent.
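
The compression itself is simple arithmetic, sketched here in standard C++. The structs only mirror the ontology terms for illustration; the names FocusInterval and compressFocusEvents, the use of plain seconds as timestamps, and the assumption that focus intervals do not overlap are mine and not part of NUAO or Zeitgeist.

```cpp
#include <algorithm>
#include <vector>

// One period during which the resource was in focus (seconds since the epoch).
struct FocusInterval {
    long start;
    long end;
};

// The enclosing usage event with the summed-up focus time attached.
struct UsageEvent {
    long start;
    long end;
    long totalFocusDuration; // seconds
};

// Replaces a list of focus intervals by their total duration.
// Intervals are clamped to the usage event and assumed not to overlap.
UsageEvent compressFocusEvents(long start, long end, const std::vector<FocusInterval>& focus)
{
    long total = 0;
    for (const FocusInterval& interval : focus) {
        const long from = std::max(interval.start, start);
        const long to = std::min(interval.end, end);
        if (to > from) {
            total += to - from;
        }
    }
    return UsageEvent{start, end, total};
}
```

However many focus changes happen while a file is open, only one number ends up in the database.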

Well, that is it for today. Aaron urged me to write a dot article about Randa so I will probably try to do that (he says vaguely).