Nepomuk – What Comes Next – Revised


After reading your comments on my last blog post about the next steps in Nepomuk and discussing them with several people, I decided to revise my plan for the next half year. Instead of directly diving into cool new features I will start with four very simple but very important topics:

  1. Fix all the crashes (and as many of the other bugs as possible) that have been reported on bugs.kde.org.
  2. Improve the Nepomuk startup time and lower the IO usage during Nepomuk startup.
  3. Fix any left-over issues in libstreamanalyzer (aka the strigi lib, used for file indexing in Nepomuk), meaning proper indexing of all files. This includes PDF problems, source code files, fonts, and other file types that might still be problematic.
  4. Properly show the search excerpts in Dolphin.

Four rather simple points that make a lot of difference for a lot of people. You should get exactly what you want from the fundraiser, and nothing is more important for users and application developers alike than stability and performance. That is a fact I sometimes lose sight of due to my “researchy” way of working.

Only once this is finished will I get back to creating cool new ways to expose the semantic desktop (in a stable system).

Hoping that this new focus in my work matches the needs of most of you, let me spam you again with these two buttons:

Click here to lend your support to: Nepomuk - The semantic desktop on KDE and make a donation at www.pledgie.com!
Click here to donate to Nepomuk via Moneybookers

Thanks again for your support, for your donations, but also for spreading the word on your blogs, on twitter, providing translations, and for the many encouraging comments. It is great to be part of the KDE community and I hope that it can last for a very long time.

56 thoughts on “Nepomuk – What Comes Next – Revised”

  1. “Instead of directly diving into cool new features I will start with four very simple but very important topics”

    Wow, are you sure you are still a KDE developer? ;-) This is the way to go; I sent you some ammo.

  2. That sounds like a very good idea. :)

    You might add the memory footprint to this list. Just today I killed the “nepomukservicestub nepomukfilewatch” process because it was using nearly 2 GB of RAM.

    If I can help you figure out this issue (it happens irregularly, but roughly every other day), drop me a note. :)

  3. I hope you get your “researchy” way of working back; people should do what makes them happy. Either way, fixing all these bugs will make KDE users love you, and I hope this will make you happy as well.

  4. I’ll continue donating for the next 4 months – I just set a reminder in my calendar to do this on the 1st of every month.

    @Everyone else: please do this, too!

  5. Maybe your new four topics and your former plans are somehow connected. Like in the way “Implementing feature x kills bugs 1,2 and 3” or so.

    Ah, and don’t forget to backport some of those bugfixes to 4.7 at least.

    By the way: Is it possible that there is no LTS-like version of KDE SC?

  6. Doing a round of nothing but polishing and performance work to solidify what has been done to date sounds like a very good plan to me.

  7. I must admit that I never liked nepomuk very much — although I can certainly see the potential, the multiple issues it has caused since introduction have put me off using it a lot.

    With that said, I just added 20€ to help further your new plan — nepomuk is an important part of KDE, and having it rock-stable and lightweight is the basis of an excellent user experience.

  8. Hi.

    Sorry if I’m missing something, but I’m astonished to read that those points are the “new approach”. This explains perfectly why Nepomuk is so unliked:

    1. Nepomuk is somewhat unstable (at least for some use cases).
    2. Nepomuk is somewhat slow (at least for some use cases).
    3. Nepomuk relies on a library/framework that also has issues.
    4. Nepomuk stores data that is of no use to end users because it is never read back.

    I understand that you might like your research more than actually providing something useful to end users and doing all the user support. That’s perfectly normal. What I don’t understand is why other KDE developers decide that it is a good idea to rely on Nepomuk.

    I would love to eat my words some time in the future, but despite all the effort that I put into carefully reading Planet KDE, I still fail to see real-life features that involve Nepomuk.

    Each day I see more evidence that Nepomuk is the new aRts.

    • Nepomuk doesn’t power analog synthesizers :P

      See my posts: half of the Nepomuk issues are produced by:
      a) Miscompiled components and build issues with distros.
      b) A package, strigi-libs, essential to file indexing, that hasn’t been updated in more than a year by Debian.

      Clearly, that didn’t happen to aRts; aRts was unmaintained, plain and simple.

  9. I think that you’ve already begun to fix libstreamanalyzer, but that work receives no publicity. I’m tracking the master branch, and only experience issues with FLACs, *some* MP3s and TTFs. However, there are lots of people who have Strigi libraries 0.7.2. They are missing more than a year of bugfixing, and they complain to you about libstreamanalyzer… how pathetic…

    I’ve filed this bug against Fedora, quickly solved by Rex Dieter and Kevin Kofler:
    https://bugzilla.redhat.com/show_bug.cgi?id=726507

    I would ask all whiners to do the following:
    1. Check your strigi-libs package release.
    2. If it’s 0.7.2, FILE A BUG IN YOUR DISTRO.
    3. Sebastian, we’ll need an imminent 0.7.6 release with all the fixes currently in master. The new PDF indexer can wait until a later release, “release early, release often” :)

    Thanks for working on Nepomuk. We all support you.

  10. @Ernesto: It’s not appropriate and not compatible with the community spirit to go around and call people pathetic and whiners when realistically they’re not in a position to know about what’s fixed in what version or even whether their distro has been lazy in updating a component. Users aren’t developers. I understand that it’s frustrating for you as a developer if your bugfixing work doesn’t get through to users because distros are being slow, but that’s not the user’s fault.

    I realize that a lot of people aren’t exactly polite in voicing their criticisms, but resist the temptation to answer in kind if possible, or this sort of dialogue just gets away from the facts and constructive talk further and further.

    • What I called pathetic is not people, not any particular person, but an attitude: attributing someone else’s faults to the original author of a software program simply because one hasn’t checked the facts. That attitude is not constructive and must be rejected.

      I apologize for not being clear on that. I never intended to call *someone* pathetic.

  11. Ok, that’s news I like to hear. Until today, on every fresh install I deactivated Nepomuk after its first crash, which usually came within the first few minutes or, at the latest, after the next reboot.

    Going to donate a little to help :)

  12. Glad to see that nepomuk will improve in the future. I think the cause of its bad reputation is:

    – No applications that use it
    -> that’s why nobody uses it
    -> that’s why nobody files bug reports
    -> that’s why it’s buggy
    -> that’s why nobody uses it or writes applications for it
    -> start over

    Solution:
    We need a better search GUI, ideally something like the OS X Spotlight but directly integrated into Kickoff. The Dolphin UI is not prominent enough and way too ugly. I know Nepomuk is not (only) about searching, but I promise people will donate like crazy once there is a proper search GUI in KDE.

    • A good search method is badly needed. I am developing my own, with a query syntax similar to Google’s and with proper support for the nmm ontologies, because the implemented search methods are useless and often fail if you use anything other than ASCII characters.

      Until Nepomuk implements a good search (I liked the system in KDE 4.5 that was removed :?), people can’t see the advantages of using Nepomuk; they can only see problems.

  13. I thank you, sir! After so much listening to us whining users, there is really no excuse for me not to donate as well. Also, with a drive to stability and performance, it is our duty as the whining users to submit bug reports.
    I’ll be watching this graph (and I am pleasantly surprised by it right now)
    https://bugs.kde.org/reports.cgi?product=nepomuk&output=show_chart&datasets=NEW&datasets=ASSIGNED&datasets=REOPENED&datasets=UNCONFIRMED&datasets=RESOLVED&banner=1

  14. Sebastian, I am a big fan of Nepomuk and I like all the ideas you had proposed earlier, but I am very happy to see the new agenda for the coming months. I really believe that these are the right steps toward ensuring that Nepomuk will be taken up by more components and applications in KDE and receive the love from users that it deserves. I just donated some of my rather limited money and wish you all the best with your struggle to find a job that will make it possible for you to keep working on this awesome technology. There must be someone out there who sees its value and is able to sponsor you… until then, I hope we can all chip in to make you survive. :P

    Cheers,

    mutlu

  15. You mention fixing PDF indexing. Please have a look at my pdf analyzer that I wrote some time ago. The patch is lying around in the strigi-devel mailing list:

    http://sourceforge.net/mailarchive/message.php?msg_id=26790347

    Evgeny never had time to really look at it, it seems. I would be delighted to see it used, as it was a rather large effort. It handles font encodings and is capable of reading PDF metadata, although it doesn’t understand utf16 tags yet. I tested it with my library of scientific papers and it extracted the full text from pretty much all files more or less flawlessly. It should be valgrind clean.

    For the reasons mentioned in the above mail, fixing the present problems in the old analyzer is impossible without a rewrite.

  16. Dear Sebastian,

    Thank you so much for revising your plans. This means a lot to me as a KDE user. I am not a developer but recently I have switched off nepomuk and been investigating Recoll and other search engines. It simply was not indexing my stuff and was slow to find it. Thank you – I will keep working with it! I really want a working KDE solution.

    However, please make sure that the cool new features are number 5 on the list!

    Regards,

    Kevin

  17. This is very good news in my opinion. I’ve always felt that the Nepomuk project did things in the reverse order from what I would do – instead of focusing on stability, performance and basic features, in most cases it seems to be about the new cool “semantic stuff” that many users don’t get or don’t want. (Hey, that’s not to say that e.g. stability and performance haven’t improved – they have, a lot. And for that I’m very thankful.)

    For example, there is one feature that I think would boost Nepomuk’s usefulness a lot, and I’ve always wondered why there hasn’t been more work done on it. Do you know about a runner called fsrunner? It indexes your file names and lets you instantly open files and folders by typing their name in KRunner. It’s very simple and lightweight (fsrunner uses its own sqlite database), but it does what it’s supposed to do and has changed the way I use my desktop.

    Now, I know about the Nepomuk runner, but for me it can’t replace fsrunner for two main reasons:

    1. It doesn’t show the path to the document or any other information, so I can’t differentiate one README.txt from another.

    2. Since it also does content search, I often get a lot of results, but not the one I want at the top. There should be a smarter ordering system (for example, file name matches first, then content), and maybe there should be a configuration option to only search for file names.

    I just tried to enable it again and was happy to see some nice improvements (for example, it doesn’t crash KRunner anymore, and it shows the mimetype icon instead of the Nepomuk icon). However, I immediately found a new deal breaker:

    3. If I have a file called “nepomuk.txt”, typing “nepomuk” won’t show the file. I have to type “nepomuk.txt” in order to find it.

    I know that Nepomuk isn’t about searching, but it’s one of the things it can do, and I feel that this is something that most users can appreciate whether they like this semantic desktop stuff or not.

    To summarize: If you can make the Nepomuk runner work as well as fsrunner, I bet a lot more users would find it useful to have Nepomuk Semantic Desktop and Strigi Desktop File Indexer enabled.
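
    The ordering suggested in points 1 and 2 could be sketched roughly as follows. This is only an illustration in Python, not how the Nepomuk runner or fsrunner is implemented; the `Hit` type, the `rank` function and the `names_only` switch are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    path: str          # full path, so two README.txt files can be told apart
    content: str = ""  # indexed text of the file (hypothetical field)


def rank(hits, query, names_only=False):
    """Return paths ordered: exact name match, partial name match, content match."""
    q = query.lower()

    def tier(hit):
        name = hit.path.rsplit("/", 1)[-1].lower()
        stem = name.rsplit(".", 1)[0]  # "nepomuk.txt" -> "nepomuk"
        if q in (name, stem):
            return 0  # file name (with or without extension) equals the query
        if q in name:
            return 1  # query appears somewhere in the file name
        if not names_only and q in hit.content.lower():
            return 2  # query only appears in the file content
        return None   # no match at all

    scored = [(tier(h), h) for h in hits]
    kept = [(t, h) for t, h in scored if t is not None]
    kept.sort(key=lambda pair: (pair[0], pair[1].path))
    return [h.path for _, h in kept]
```

    Returning the full path instead of the bare file name covers point 1, and `names_only=True` corresponds to the suggested option to search file names only.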

    • Oh yes, the dot tokenizing problem in Virtuoso. It does not treat the dot as a word boundary. That is of course a problem with file names. I already discussed this with the Virtuoso developers but so far there was no solution.
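
      To illustrate the effect (this is a toy model in Python, not Virtuoso’s actual tokenizer; both tokenizer functions below are hypothetical), compare a tokenizer that keeps dots inside words with one that treats the dot as a word boundary:

```python
import re


def tokens_without_dot_boundary(text):
    # Dots stay inside words, so "nepomuk.txt" is indexed as one single token.
    return re.findall(r"[\w.]+", text.lower())


def tokens_with_dot_boundary(text):
    # Dots split words, so "nepomuk.txt" is indexed as "nepomuk" and "txt".
    return re.findall(r"\w+", text.lower())


def matches(query, file_name, tokenizer):
    # A whole-word search only succeeds if the query equals an indexed token.
    return query.lower() in tokenizer(file_name)
```

      With the first tokenizer only the full string “nepomuk.txt” matches the file; with the second, “nepomuk” alone is enough.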

      • Thanks for your reply, it’s good to know that the bug isn’t under your control. Hopefully there’ll be a fix soon in Virtuoso.

        I still hope that someone will look at the first two points (do you want me to file them to bugs.kde.org, by the way?) and that such “simple” but cool features will get more exposure in future versions of KDE Workspaces.

  18. To end this comment stream:

    1. I’m no developer. I’m just a user, with no programming knowledge, and I’ve experienced the same issues trolls are trolling about. What have I done instead of trolling? I’ve done what a user must do, in my opinion: report bugs, research bugs, and keep in touch with devs. Developers deserve encouragement and support, since a great part of them do this in their spare time.

    2. For the reason above, I will use strong words against people who come here to troll. Trolling against devs must not be accepted, as a general principle, if we want Free Software to advance. I sincerely apologize if I offended someone, but I really think the attitude of coming here to troll, while the Nepomuk main dev is asking the community for support since nobody is paying him (and I expect that situation to change), is not only pathetic, but a very low attack, and must be rejected by the community.

    3. To prevent trolling and to improve the perception around Nepomuk:
    – Let’s compile a list of distros shipping at least Strigi 0.7.5 and another with distros shipping Strigi 0.7.2.
    – Let people in list 2 know that their KDE install WILL have issues with Strigi (constant reindexings, segfaults, for instance) until their distro upgrades the packages.
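
    Compiling the two lists could be as simple as the following Python sketch. The distro names in the usage note are placeholders, not real data, and the helper names are made up for this example; only the 0.7.5 threshold comes from the proposal above.

```python
MIN_FIXED = (0, 7, 5)  # first Strigi release named above as acceptable


def parse_version(text):
    # "0.7.10" -> (0, 7, 10), so versions compare numerically, not as strings
    return tuple(int(part) for part in text.split("."))


def split_by_strigi_version(shipped):
    """Split a {distro: strigi_version} mapping into (up_to_date, outdated)."""
    up_to_date, outdated = [], []
    for distro, version in sorted(shipped.items()):
        if parse_version(version) >= MIN_FIXED:
            up_to_date.append(distro)
        else:
            outdated.append(distro)
    return up_to_date, outdated
```

    For example, `split_by_strigi_version({"distro-a": "0.7.5", "distro-b": "0.7.2"})` puts the placeholder "distro-a" on the first list and "distro-b" on the second.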

    • While I understand your position, I believe that if the open desktop is to thrive beyond its very limited share you can’t expect every user to research fixes, help out on bugs, maybe write a small one-liner patch — each day getting closer to being developers.

      Sometimes you just want to *use* software. Things like installing a distro and right away having to start fixing it, disabling stuff, and reporting bugs drive people away.

      It’s frustrating when an error elsewhere in the stack (upstream, downstream) causes developers to take flak that they don’t deserve, and I’m not sure how this can be fixed, although I would love to see it fixed. On the other hand, you also shouldn’t disrespect users for sticking to being users.

      (Disclaimer: Long time KDE user-and-smalltime developer/fixer here and there.)

    • I couldn’t agree more Ernesto. You are completely right. People tend to forget that KDE is mostly a hobby. Only because this hobby benefits other people doesn’t mean that you can suddenly exhibit a demanding attitude against the developer as if he sold you a product that didn’t fulfill what was advertised. I sometimes feel like people would _never_ complain as much about commercial software at the vendors, e.g. through writing Microsoft or Apple a mail, as they do about free open source software.

      Sorry ianjo, but your comment really shows that attitude as well. To quote: “if the open desktop is to thrive beyond its very limited share”. We’re not talking about a product with a business model here. Most of the work is done by people who work on it out of passion! You can’t dictate to them what to do just because you think it’s the next big step to world domination. People don’t care! They work on it because they like to do it! If the attitude of the lion’s share of users changed, that would certainly be the biggest step towards more usable open source software.

      I completely agree that it would be nice to have better performance in Strigi and Nepomuk, but that does in no way justify the whining and bitching around of some users, be it here or in other places.

      • I agree with you that this is work done for passion and for free, and as such you can’t (and don’t have a right to) demand anything.

        What I meant is that having every user contribute, report, beta test, help with fixes and so on does work for a small number of people (I am one of them: you will not find anything besides Linux anywhere near my or my family’s computers, and I have contributed my share of fixes). But at some point, if we want free software to be the domain of everyone, we will have to accept and work towards 99.99% of the users being just users.

        And I think we need those users as much as they need “us”: free (as in freedom) computing should be available to everyone, not just to those with the skills or the time to contribute. There is also the more egoistic point that their usage of non-free computing systems affects our ability to do things on a free OS.

        I don’t think that “we” “have to” cater to those users. I wish/hope “we” would.
        I hope I’m not getting the wrong message across.

        • If I buy a license for Microsoft Windows 7, and if it doesn’t work, I have the right to demand a fix, and I’m in a consumer position. The same goes for Apple, or for any proprietary software company.

          Software Libre represents a paradigm shift. Users are not consumers anymore, they are contributors, and they must behave as contributors.

          It’s true you can’t expect every user to contribute. It’s also true that there are ways to make users who don’t have the time, or who don’t have the skills, or both, contribute.

          1. Donations linked to bugfixing. It would be great to create a “priority” bug list, with slots available to those users who pay a subscription fee. So, if I want my pet bug fixed, I pay a monthly or annual fee and nominate my pet bug for my slot on the “priority list”. That’s for advanced users.
          2. Paid support. You’ve largely ignored the fact that the largest group of users is not the one that knows about Windows and Office and routinely buys software. The people who need Software Libre the most are people who know little more than how to turn on the thing, how to get into something that will give them access to Facebook and how to get into another thing that will enable them to write a letter. For those people, you can charge money for the install. You smooth out all the kinks, they use Software Libre, and, if the user has some issue, they complain back to you. They are consumers, but you remain a contributor to the ecosystem (and if the software has bugs you can’t solve, see 1).

          Software Libre has a great potential for monetization. RMS encourages selling Software Libre, but that potential is virtually untapped because English confuses “free as in freedom” with “free as in price”. That’s why I used the Spanish expression, “Software Libre”, since we have 2 separate and unrelated words for “free as in freedom” (libre) and “free as in price” (gratis).

          So, if the basic user doesn’t want to learn and is not interested, he’ll contribute through his payments and through the money you, the supporter, donate back to the project. On the other hand, if the basic user is interested, he will have a great mentor in you to help him progress. He also must be educated in a culture where he is not a consumer anymore, but a contributor, and learn to be polite and to work on those terms.

          It’s not a perfect solution, but it’s something.

  19. I just donated via Moneybookers.
    I am happy that you chose to revise your proposition, not really because I am having problems with nepomuk (works quite ok overall), but because when I read your first proposal, I knew you were going to feel the heat.
    Let’s hope that by sticking to the basics (I can truly understand that working on researchy stuff is more appealing), you will finally convince more and more app developers to get on board. That’s my personal gripe with nepomuk: it can already do quite a bit, but it’s under-utilized, for a variety of reasons. And you obviously can’t do it all on your own.
    If I were to learn how to code and to jump in (which I may consider soonish), nepomuk would be one of the contenders, for sure.

    Now please, keep updating us on the progress on a regular basis (as it is my second issue with nepomuk: it’s easy to know what may be possible in the future, hard to find out what you can already do!).

  20. Pingback: First Round of Bug Fixes | Trueg's Blog

  21. Thanks to fabo for the (K)ubuntu PPA, I’m on kubuntu/natty and I just updated to the 4.7 experimental stuff and the strigi stuff is really a problem (again).

    It looks as if the only packages that actually depend on libstreamanalyzer are KDE related – is there any reason that the KDE packages don’t depend on a more recent (= less flaky) strigi?

    I have great plans for Nepomuk – it’s one of my personal reasons for staying with KDE.
    Chasing the releases hoping to get a stable version to write my code against is frustrating and disappointing… I guess I learn a little every day (ie: strigi version issues).

    I am really glad that Sebastian is choosing to work out some of the stability issues because the architecture is impressive… I’m happy to donate some money for that work.

    • Sebastian: Maybe you could do a short blog post, to clear up the confusion about Strigi vs. libstreamanalyzer vs. Soprano, why Strigi isn’t part of the KDE infrastructure etc.

      Although I’m relatively close to all this technology, I’m no longer really sure how all these components interact, which ones are deprecated (e.g. the CLucene backend in Soprano), etc.

      This wouldn’t only help all these now enthusiastic testing users to see where possible problems actually happen, but also packagers to understand the relation between these components.
