Files on Removable Media – Step 1


So far, Nepomuk has only handled annotations for local files and did not care about mount points and the like. With KDE SC 4.4, that is about to change. (What I present here today is the first step in supporting removable media like USB keys or external hard drives. The next step will have to wait until 4.5.)

As always, this post has two parts: the user-visible one and the technical one, which discusses implementation details.

Imagine you had a USB key with a file on it which you annotated, say with a rating of 6:

As long as the key is mounted, searching for files with a rating of 6 will always return our image the way we know it:

Now we unmount the key, so the file is no longer accessible. It still shows up in the search, though:

However, there is a slight change in the name: now it contains a hint to the USB key (this hint did not make it into 4.4). Opening the file still works since the key is automatically mounted.

But what happens if we remove the key completely, so that auto-mounting is no longer an option? The search still returns the file, but trying to open it gives us an error. Gwenview does not handle KIO errors properly, so we do not actually get an error message there. In theory, though, we would get the message “Please insert the removable medium ‘1,9 GiB Removable Media’ to access this file.” Just to show that I am not lying, let me present the Okular error dialog (which could use some improvement, too):

Here you see the whole ugly Nepomuk query URL and, at the very bottom, the unformatted error message. Hopefully we will have worked out these problems by KDE SC 4.5.

So if we saw this error message, we would know where to find the file if we wanted to access it (well, giving proper names to USB keys would help, too).

This is already pretty nice. Step 2 will then be to export the annotations to the removable storage and sync them again as soon as the medium is mounted. (Remember how I blogged about my experiments with that already? Well, as you can see, I have not finished that yet.)

The Technical Part

OK, now we know how it looks to the user (or how it should look). Let us have a look under the hood. Basically, three players are involved in this process:

The removable storage service

The removable storage service (code in kdebase) uses the great power of Solid to act on newly inserted, mounted, and unmounted removable storage devices. The simple part is that it tells the Strigi service to index the files on the device on mount.

More interesting, however, is what happens on unmount. The service converts all absolute URLs of files on the unmounted device to relative ones. These relative URLs use the filex:/ scheme, which I made up, and consist of two parts: the UUID of the storage device and the path relative to the mount point. In our example above the URL is filex://fc30-3da9/thepic.JPG. In addition, a new nfo:Filesystem resource is created which stores the description and the UUID of the unmounted device. The files are then related to this new nfo:Filesystem resource via nie:isPartOf.
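To make the URL layout concrete, here is a minimal sketch of the conversion in both directions. It is not the code from the service: it uses plain std::string instead of KUrl, needs C++17 for std::optional, and the names FilexUrl, toFilexUrl, and parseFilexUrl are made up for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper type for this sketch; not part of Nepomuk.
struct FilexUrl {
    std::string uuid;          // file system UUID (the "host" part of the URL)
    std::string relativePath;  // path relative to the mount point, no leading '/'
};

// Build "filex://<uuid>/<relative path>" from an absolute local path.
// Returns std::nullopt if absPath does not lie below mountPoint.
std::optional<std::string> toFilexUrl( const std::string& mountPoint,
                                       const std::string& uuid,
                                       const std::string& absPath )
{
    assert( !uuid.empty() );

    // Make sure the prefix ends with '/', so that "/media/usb" does not
    // accidentally match "/media/usb2/...".
    std::string prefix = mountPoint;
    if ( prefix.empty() || prefix.back() != '/' )
        prefix += '/';

    if ( absPath.compare( 0, prefix.size(), prefix ) != 0 )
        return std::nullopt;

    return "filex://" + uuid + "/" + absPath.substr( prefix.size() );
}

// Split a filex:// URL into the device UUID and the relative path.
// Returns std::nullopt for other schemes, an empty UUID, or a missing path.
std::optional<FilexUrl> parseFilexUrl( const std::string& url )
{
    const std::string scheme = "filex://";
    if ( url.compare( 0, scheme.size(), scheme ) != 0 )
        return std::nullopt;

    const std::string::size_type slash = url.find( '/', scheme.size() );
    if ( slash == std::string::npos || slash == scheme.size() )
        return std::nullopt;

    return FilexUrl{ url.substr( scheme.size(), slash - scheme.size() ),
                     url.substr( slash + 1 ) };
}
```

Assuming the key was mounted at /media/usb (a made-up mount point), toFilexUrl( "/media/usb", "fc30-3da9", "/media/usb/thepic.JPG" ) yields filex://fc30-3da9/thepic.JPG, and parseFilexUrl splits that URL back into the UUID and the relative path.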

libnepomuk

Nepomuk::Resource can now transparently handle relative filex:/ URLs. Thus, annotating the file from the example after remounting will store the annotations with the same resource, even though that resource uses a filex:/ URL.

The nepomuk:/ KIO slave

The nepomuk:/ KIO slave (code in kdebase) does the rest of the work. The nepomuksearch:/ KIO slave creates the virtual query folders but uses the nepomuk:/ KIO slave to stat all resources (at least the ones with a nepomuk:/ scheme URI).

So as soon as a relative filex:/ URL is encountered, it is converted to a local URL if possible:

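// Look up the storage volume with the given file system UUID via Solid.
// Returns 0 if no matching device is found.
Solid::StorageAccess* storageFromUUID( const QString& uuid ) {
    QString solidQuery = QString::fromLatin1( "[ StorageVolume.usage=='FileSystem' AND StorageVolume.uuid=='%1' ]" ).arg( uuid.toLower() );
    QList<Solid::Device> devices = Solid::Device::listFromQuery( solidQuery );
    if ( !devices.isEmpty() )
        return devices.first().as<Solid::StorageAccess>();
    else
        return 0;
}

// Convert a relative filex://<uuid>/<path> URL into a local file URL.
// Returns an empty KUrl if the device is missing, or if it is not mounted
// and could not (or should not) be mounted.
KUrl convertRemovableMediaFileUrl( const KUrl& url, bool evenMountIfNecessary = false ) {
    // The host part of the filex:/ URL carries the device UUID.
    Solid::StorageAccess* storage = storageFromUUID( url.host() );
    if ( storage &&
         ( storage->isAccessible() ||
           ( evenMountIfNecessary && mountAndWait( storage ) ) ) ) {
        // Prepend the current mount point to the relative path.
        return storage->filePath() + QLatin1String( "/" ) + url.path();
    }
    else {
        return KUrl();
    }
}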

And here you can already see the auto-mounting code being called. (I do not show it here since this is enough to read already. If you are interested, have a look at the full source code.) The converted URL is then simply passed to KIO::ForwardingSlaveBase, which handles the rest. In case the URL cannot be converted (the medium is not mounted and auto-mounting is not used), all information is read from the Nepomuk database to create a proper KIO::UDSEntry.

26 thoughts on “Files on Removable Media – Step 1”

  1. Isn’t there still time to fix these error messages for KDE SC 4.4? As far as I know there are still a few days left until the release, and these definitely look like a bug to me, so they fall under RC bugfixing.

    • Well, the issue is that each application has to handle KIO errors itself. Thus, fixing this involves changes in multiple applications. I had a quick look at Gwenview’s code. To me it looks like a bigger thing to fix that. In case of Okular you are right. That could be simpler. I am not sure about other applications though.

      • Sounds like an error handler/visualizer API (ideally using the notification system or something) is needed right in KIO (I’m actually surprised it isn’t done this way already), so that all apps can be moved to that API instead of each having to implement a KIO error handler on their own.

  2. Really useful, but are you thinking about exporting all this information to the USB key itself?

    Consider the scenario where you tag all your pictures, music and videos on an external storage device and then plug it into another computer.

    An XML or text file in the root, visible or hidden, with an easy export/import method could be a good solution. I doubt that Virtuoso supports tables in different places. And, in the far future, integration with ownCloud would be cool :).

  3. All of this obviously means that there are Nepomuk builds newer than 2.3.70. Is the release of 2.3.80 near? I ask because Nepomuk queries in Dolphin are broken with 2.3.70, and AFAIK that was solved in Mandriva with newer snapshots (~2.3.71).

  4. Sorry, but I don’t get the whole idea:

    USB sticks are mainly used for data transfer, not archiving, and are usually used on more than one computer. So chances are high that Nepomuk may still store metadata for files that are long gone. How am I ever going to get rid of those entries? They’ll constantly clutter my searches. As long as the file system UUID stays the same, it could automatically remove the entries the next time I plug in the same stick, but what if I reformat it? That happens a lot to those devices.

    And Nepomuk will also start indexing all the “foreign” USB sticks I only plug in once in a lifetime, e.g. those from my friends.

    • You are correct. The system is not perfect yet. That is also why indexing removable devices is disabled by default. You have to manually enable it. Thus, only manual annotations like tags and such are stored by default.
      Like I said, this is the first step of the system. Once the metadata is stored on the removable device as well, the whole system gets cleaner. Then it would also be possible to remove all metadata on unmounting if that is desired. Once the key is mounted again, the data could be re-imported.

      • I think letting users say whether they want to store the metadata on the computer, on the removable device, on both, or not index it at all would be the best bet. Being able to set a default as well as a different option on a per-device basis would be ideal. For instance, I probably would not want to index my camera or my backup hard drive at all; for a secure encrypted drive I would want the index stored only on the drive; and for a network drive I would want it stored both locally and on the other drive. So having a per-device option would be very important in my opinion.

        Speaking of network drives: if you have a shared folder, would it be possible to keep a local Nepomuk index in that folder that is visible to other computers, and have Nepomuk automatically keep it synced with the computer’s general index? So, for instance, if you move a file to that folder, its Nepomuk information immediately becomes visible to users browsing the folder over the network, and if someone who has write access on the network changes the Nepomuk information, it also changes in your local system.

  5. You have said that you would like to see big database using projects like Akonadi, Amarok and DigiKam sharing nepomuk’s database.

    Akonadi appears to already be heading in that direction, DigiKam seems to be keen to move once performance equivalence can be established and Amarok seems perhaps less keen.

    Is there a plan to establish what (if anything) is still blocking the data/semantic integration across the desktop? There is clearly a lot of desire to make it happen but still quite a lot of uncertainty.

    • I personally did not hear anything regarding that issue from the Digikam developers nor the Amarok people. The contact with the Akonadi team, however, is very good. They are indeed working towards using Virtuoso. A lot should happen in 2010.

      • One way to share with Digikam or Amarok without them doing anything is by enabling Nepomuk to pull data from (and write data to) IPTC image tags and ID3 audio tags. That would provide a convenient way of exchanging data and allowing it to stay with the file when moved.

        • Tracker provides a writeback service based on GStreamer. Maybe someone could have a look into it to see if it can be used for Nepomuk.
          However, I personally do not see why Nepomuk needs writeback support. I see it this way: Amarok or Digikam write the new data to the file and Nepomuk updates its index with the new info.

  6. Hello,

    some time ago there was a discussion about how meta information can stick to a file if the connection is not implemented at the file system level. E.g., when moving files with a tool (cp) which is not aware of the metadata, how can the coherence of data and metadata be guaranteed? I have never heard about a solution to this, but I am sure one has been found by our excellent free software engineers (just to make it clear: this is _not_ meant sarcastically). Can someone please tell me the solution or point me somewhere I can read about it?

    Thank you. Wolfgang

    P.s.: Handling removable devices in a sensible way is great news!

    • I have a patch using inotify which works nicely but depends on the max user watch value (/proc/sys/fs/inotify/max_user_watches) being raised to a high number (like 524288). Apparently some distributions already do that. However, my patch does not include making sure to leave a few hundred of them free yet. Thus, it could in theory keep other parts of KDE from working correctly. I will, however, polish that patch and make it part of KDE SC 4.5 and propose it to distributions.

  7. i am already using nepomuk and strigi , and liking it very much.

    great news about usb removable files. its also great that they appear on the search and indicate the media when offline.

    heck :) if you delivered this earlier , you could have saved me hours and hours of saving multimedia information on tellico about my usb disks files :)

    this is a life saver for me :) thanks

