And Yet Another Post About Virtuoso

October 14, 2009

Today nearly all problems are solved. OpenLink provided a patch that makes inserting very large literals (more than 1 metabyte in size) lightning fast, even with a very low buffer count. I also worked around the issue of URI encoding: in every URI used in a query, the Soprano Virtuoso backend now simply percent-encodes all characters that are not unreserved, plus all reserved characters that are not used in their special meaning. Man, that is a mouthful. Well, it seems to work fine, although I can always use more testing with weird file URLs (weird meaning that they contain characters like brackets and the like). I also fixed some error handling bugs.
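To make the encoding rule a bit more concrete, here is a minimal sketch in C. It is not the actual Soprano code: the function name `percent_encode` and the `keep` parameter are made up for illustration, and the sketch assumes the caller already knows which reserved characters still carry their special meaning (for a file URL that would typically be `:` and `/`). It also assumes the input is not already percent-encoded, since a literal `%` would be encoded again as `%25`.

```c
#include <stddef.h>
#include <string.h>

/* RFC 3986 unreserved set: ALPHA / DIGIT / "-" / "." / "_" / "~" */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

/*
 * Hypothetical helper, not part of Soprano or Virtuoso.
 * Copies `in` to `out`, percent-encoding every byte that is neither
 * unreserved nor listed in `keep` (the reserved characters the caller
 * wants to leave in their special meaning).
 * Stops early if `out` is too small; the result is always NUL-terminated.
 * Returns the number of bytes written, not counting the terminator.
 */
size_t percent_encode(const char *in, const char *keep,
                      char *out, size_t out_sz)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    if (out_sz == 0)
        return 0;

    for (; *in != '\0'; ++in) {
        unsigned char c = (unsigned char) *in;

        if (is_unreserved(c) || strchr(keep, c) != NULL) {
            if (n + 1 >= out_sz)      /* need room for c and the NUL */
                break;
            out[n++] = (char) c;
        } else {
            if (n + 3 >= out_sz)      /* need room for %XX and the NUL */
                break;
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        }
    }
    out[n] = '\0';
    return n;
}
```

Called with the input `file:///tmp/a [1].txt` and `keep` set to `":/"`, this yields `file:///tmp/a%20%5B1%5D.txt`: the scheme and path separators stay intact while the space and the brackets are encoded.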

So what is left? Well, there are a few hacks in the Virtuoso backend which are rather ugly. One example is the detection of query result types. To determine whether the result is a boolean, a set of bindings, or a graph, the backend actually checks the name and number of the result columns. Urgh! It would be nicer to check the type of the result instead. It seems that graph results are BLOBs.

Anyway, enough for tonight. I am tired. Here is the patch to make Virtuoso not hang when Strigi adds nie:PlainTextContent literals of big files:

Index: sqlrcomp.c
===================================================================
RCS file: virtuoso-opensource/libsrc/Wi/sqlrcomp.c,v
retrieving revision 1.9
diff -u -r1.9 sqlrcomp.c
--- sqlrcomp.c  20 Aug 2009 17:47:22 -0000      1.9
+++ sqlrcomp.c  13 Oct 2009 16:11:49 -0000
@@ -65,7 +65,7 @@
 {
 va_list list;
 char temp[2000];
-  int ret;
+  int ret, rest_sz, copybytes;
 va_start (list, string);
 ret = vsnprintf (temp, sizeof (temp), string, list);
 #ifndef NDEBUG
@@ -75,11 +75,16 @@
 va_end (list);
 #ifndef NDEBUG
 if (*fill + strlen (temp) > len - 1)
-    GPF_T1 ("overflow in strncpy");
+    GPF_T1 ("overflow in memcpy");
 #endif
-  strncpy (&text[*fill], temp, len - *fill - 1);
+  rest_sz = (len - fill[0]);
+  if (ret >= rest_sz)
+    copybytes = ((rest_sz > 0) ? rest_sz : 0);
+  else
+    copybytes = ret+1;
+  memcpy (text+fill[0], temp, copybytes);
 text[len - 1] = 0;
-  *fill += (int) strlen (temp);
+  fill[0] += ret;
 }
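
For readers who do not want to parse the diff, here is a standalone sketch of the same idea. It is an illustration, not the Virtuoso function: the name `append_fmt` and its signature are made up, and unlike the patch it clamps the fill position so that it never runs past the buffer. One plausible reason for the speedup is that `strncpy` zero-fills the whole rest of the destination on every call, which gets expensive when the buffer holds a very large literal, while `memcpy` copies only the bytes that were just formatted. Reusing the return value of `vsnprintf` also saves the extra `strlen` pass.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patched function.
 * Formats into a small scratch buffer and appends the result to `text`
 * (capacity `len`) at offset `*fill`. Returns the number of bytes
 * appended, or -1 on a formatting error or when no room is left.
 */
int append_fmt(char *text, int len, int *fill, const char *fmt, ...)
{
    char temp[2000];
    va_list ap;
    int ret, rest_sz, n;

    va_start(ap, fmt);
    ret = vsnprintf(temp, sizeof(temp), fmt, ap);
    va_end(ap);

    if (ret < 0)
        return -1;
    if (ret >= (int) sizeof(temp))      /* output was truncated in temp */
        ret = (int) sizeof(temp) - 1;

    rest_sz = len - *fill;              /* bytes left, including the NUL */
    if (rest_sz <= 0)
        return -1;

    /* Copy only what was formatted, leaving one byte for the terminator. */
    n = (ret < rest_sz - 1) ? ret : rest_sz - 1;
    memcpy(text + *fill, temp, (size_t) n);
    text[*fill + n] = '\0';
    *fill += n;
    return n;
}
```

Each call touches only the bytes it appends, so building a long query text out of many small pieces stays linear in the total size instead of depending on how large the destination buffer is.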
  1. October 15, 2009 at 10:32 | #1

    “more than 1 metabyte in size” wow, does it run on a one jiggawatt processor!

  2. October 15, 2009 at 10:34 | #2

    Great stuff, BTW. Looking forward to using nepomuk+virtuoso.

  3. drf
    October 15, 2009 at 12:05 | #3

    Do you know if this patch will be applied upstream, or will packagers have to apply it themselves?

    • October 15, 2009 at 12:19 | #4

      It will be applied upstream. It already is in the OpenLink svn AFAIK.

      • drf
        October 20, 2009 at 11:02 | #5

        Just checked and it’s already in the virtuoso 5.0.12 package

  4. October 28, 2009 at 14:31 | #6

    Documentation on how to extract ODBC meta information on the RDF result is now available from the online documentation:

    http://docs.openlinksw.com/virtuoso/virtodbcsparql.html
