triplestore – Parerga und Paralipomena
http://www.michelepasin.org/blog
At the core of all well-founded belief lies belief that is unfounded - Wittgenstein

Installing Stardog triplestore on mac os
http://www.michelepasin.org/blog/2014/11/06/installing-stardog-triplestore-on-mac-os/ – Thu, 06 Nov 2014 16:02:49 +0000

Stardog is an enterprise-level triplestore developed by clarkparsia.com. It combines tools to store and query RDF data with more advanced features for inference and data analytics – in particular via the built-in Pellet Java reasoner. All of this, combined with a user experience which is arguably the best you can currently find on the market.

1. Requirements

OSX: Mavericks 10.9.5 (that’s what I used, but it’ll work on older versions too).
JAVA: available from Apple.
Stardog: grab the free community edition at http://www.stardog.com/ (you can also get the ‘developer’ version for a 30-day trial, which is actually what I did).

2. Setting up

Good news, it can’t get any simpler than this. Just unpack the Stardog installer, and you’re pretty much done (see the online docs for more info).

Stardog needs to know where to store its databases; you can set that up by adding a couple of lines to your .bash_profile file:


export STARDOG_HOME="/Users/michele.pasin/Data/Stardog"  # databases will be stored here
export PATH="/Applications/stardog-2.2.2/bin:$PATH"  # add stardog commands to the path
alias cdstardog="cd /Applications/stardog-2.2.2"  # just a handy shortcut

Finally, copy the license key file (which should have come together with the installer) into the data folder:

$ cp stardog-license-key.bin $STARDOG_HOME

3. Running Stardog

The stardog-admin server start command starts the server (and stardog-admin server stop shuts it down). Then you can use the stardog-admin db create command to create a DB and load some data. For example:


[michele.pasin]@Tartaruga:~>cdstardog 

[michele.pasin]@l5611:/Applications/stardog-2.2.2>stardog-admin server start

************************************************************
This copy of Stardog is licensed to MIk (michele.pasin@gmail.com), michelepasin.org
This is a Community license
This license does not expire.
************************************************************

                                                             :;   
                                      ;;                   `;`:   
  `'+',    ::                        `++                    `;:`  
 +###++,  ,#+                        `++                    .     
 ##+.,',  '#+                         ++                     +    
,##      ####++  ####+:   ##,++` .###+++   .####+    ####++++#    
`##+     ####+'  ##+#++   ###++``###'+++  `###'+++  ###`,++,:     
 ####+    ##+        ++.  ##:   ###  `++  ###  `++` ##`  ++:      
  ###++,  ##+        ++,  ##`   ##;  `++  ##:   ++; ##,  ++:      
    ;+++  ##+    ####++,  ##`   ##:  `++  ##:   ++' ;##'#++       
     ;++  ##+   ###  ++,  ##`   ##'  `++  ##;   ++:  ####+        
,.   +++  ##+   ##:  ++,  ##`   ###  `++  ###  .++  '#;           
,####++'  +##++ ###+#+++` ##`   :####+++  `####++'  ;####++`      
`####+;    ##++  ###+,++` ##`    ;###:++   `###+;   `###++++      
                                                    ##   `++      
                                                   .##   ;++      
                                                    #####++`      
                                                     `;;;.        

************************************************************
Stardog server 2.2.2 started on Thu Nov 06 16:41:23 GMT 2014.

Stardog server is listening on all network interfaces.
SNARL server available at snarl://localhost:5820.
HTTP server available at http://localhost:5820.

STARDOG_HOME=/Users/michele.pasin/Data/Stardog 

LOG_FILE=/Users/michele.pasin/Data/Stardog/stardog.log


[michele.pasin]@l5611:/Applications/stardog-2.2.2>stardog-admin db create -n myDB examples/data/University0_0.owl
Bulk loading data to new database.
Parsing triples: 100% complete in 00:00:00 (8.6K triples - 13.2K triples/sec)
Parsing triples finished in 00:00:00.646
Creating index: 100% complete in 00:00:00 (93.0K triples/sec)
Creating index finished in 00:00:00.092
Computing statistics: 100% complete in 00:00:00 (60.9K triples/sec)
Computing statistics finished in 00:00:00.140
Loading complete.
Inserted 8,521 unique triples from 8,555 read triples in 00:00:01.050 at 8.1K triples/sec
Bulk load complete.  Loaded 8,521 triples from 1 file(s) in 00:00:01 @ 8.4K triples/sec.

Successfully created database 'myDB'.

[michele.pasin]@Tartaruga:/Applications/stardog-2.2.2>stardog query myDB "SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 10"
+--------------------------------------------------------+
|                           s                            |
+--------------------------------------------------------+
| tag:stardog:api:                                       |
| http://www.University0.edu                             |
| http://www.Department0.University0.edu                 |
| http://www.Department0.University0.edu/FullProfessor0  |
| http://www.Department0.University0.edu/Course0         |
| http://www.Department0.University0.edu/GraduateCourse0 |
| http://www.Department0.University0.edu/GraduateCourse1 |
| http://www.University84.edu                            |
| http://www.University875.edu                           |
| http://www.University241.edu                           |
+--------------------------------------------------------+

Query returned 10 results in 00:00:00.061

In the snippet above, I’ve just loaded the test dataset that comes with Stardog into the myDB database, then queried it using the stardog query command.
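The same query can also be sent over plain HTTP. Here's a minimal curl sketch – it assumes a default install, i.e. Stardog's standard /{db}/query SPARQL endpoint and the default admin/admin credentials:

# query the 'myDB' database created earlier; results come back as SPARQL JSON
curl -u admin:admin \
     -H "Accept: application/sparql-results+json" \
     --data-urlencode "query=SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 10" \
     http://localhost:5820/myDB/query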

There’s a fancy user interface too, which can be accessed by going to http://localhost:5820 (note: by default you can log in with username/password = admin/admin).

[Screenshots: the Stardog web console]

4. Loading a big dataset

As in my previous post, I’ve tried loading the NPG Articles dataset available at nature.com’s legacy linked data site data.nature.com. The dataset contains around 40M triples describing (at the metadata level) everything published by NPG and Scientific American from 1845 to the present day. The file size is ~6 gigs, so it’s not a huge dataset – still, big enough to pose a challenge to my macbook pro (8 gigs of RAM).

First off, I tried loading the dataset via the command line by passing an extra argument when creating a new database:

[michele.pasin]@Tartaruga:~/Downloads/NPGcitationsGraph/articles.2012-07-16>stardog-admin db create -n npgArticles articles.nq 
Bulk loading data to new database.
Parsing triples: 100% complete in 00:01:48 (10.1M triples - 93.3K triples/sec)
Parsing triples finished in 00:01:48.678
Creating index: 100% complete in 00:00:19 (525.1K triples/sec)
Creating index finished in 00:00:19.311
Computing statistics: 100% complete in 00:00:05 (1748.1K triples/sec)
Computing statistics finished in 00:00:05.782
Loading complete.
Inserted 10,107,653 unique triples from 10,140,000 read triples in 00:02:16.178 at 74.5K triples/sec
Bulk load complete.  Loaded 10,107,653 triples from 1 file(s) in 00:02:16 @ 74.3K triples/sec.

Errors were encountered during loading:
File: /Users/michele.pasin/Downloads/NPGcitationsGraph/articles.2012-07-16/articles.nq Message: '2000-13-01' is not a valid value for datatype http://www.w3.org/2001/XMLSchema#date [line 10144786]
Successfully created database 'npgArticles'.

As you can see, that didn’t work as expected: only 10M out of the 40M triples were loaded, because of a date-parsing error the bulk loader encountered.

After some googling and pinging the mailing list, I discovered that Stardog is actually right: the parsing error derives from the fact that valid values for XMLSchema#date must be ISO 8601 dates. My data contained the date 2000-13-01, which is wrong – it should have been 2000-01-13 instead.
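If you want to spot these bad literals before loading, a quick-and-dirty grep over the N-Quads file can help. This is only a sketch – the pattern below simply flags xsd:date values whose month part is 13 or higher:

# print line numbers of date literals with an impossible month (e.g. 2000-13-01)
grep -nE '"[0-9]{4}-(1[3-9]|[2-9][0-9])-[0-9]{2}"\^\^<http://www.w3.org/2001/XMLSchema#date>' articles.nq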

What’s interesting is that I’ve previously managed to load the same dataset with other triple stores without any problems. How was that possible?

The online documentation provides the answer:

RDF parsing in Stardog is strict: it requires typed RDF literals to match their explicit datatypes, URIs to be well-formed, etc. In some cases strict parsing isn’t ideal, so it may be disabled using --strict-parsing=FALSE.

Also, from the mailing list:

By default, if you say “1.5”^^xsd:int or “twelve point four”^^xsd:float, Stardog is going to complain.  While it’s perfectly legal to have that in RDF, you can run into trouble later on, particularly when doing query evaluation with filters that would handle those literal values where you will hit the dark corners of the SPARQL spec.

So, the way to load a (partially or potentially broken) dataset without having to worry about it too much is to use the strict.parsing=false flag:

>stardog-admin db create -o strict.parsing=false -n articlesNPG2 articles.nq
Bulk loading data to new database.
Parsing triples: 100% complete in 00:05:55 (39.4M triples - 110.7K triples/sec)
Parsing triples finished in 00:05:55.643
Creating index: 100% complete in 00:01:17 (510.7K triples/sec)
Creating index finished in 00:01:17.122
Computing statistics: 100% complete in 00:00:21 (1789.2K triples/sec)
Computing statistics finished in 00:00:21.944
Loading complete.
Inserted 39,262,620 unique triples from 39,384,548 read triples in 00:07:51.402 at 83.5K triples/sec
Bulk load complete.  Loaded 39,262,620 triples from 1 file(s) in 00:07:51 @ 83.3K triples/sec.

Successfully created database 'articlesNPG2'.

Job done in around 7 minutes!

 

Conclusion:

Extremely easy to install, efficient and packed with advanced features (inferencing and data-checking being among the most useful ones, imho). Also, as far as the UX and web interface go, I doubt you can get any better than this with a triplestore.

It’s a commercial product, of course, so you would expect nothing less. However, the community edition (which is free) allows for 10 databases & 25M triples per db – which may be just fine for many projects.

If we had more tools as accessible as this one, I do think RDF triplestores would have a much higher uptake by now!

 

5. Useful resources

> Documentation

  • http://docs.stardog.com/

> Mailing list

  • https://groups.google.com/a/clarkparsia.com/forum/#!forum/stardog

> Python API

  • If you’re a pythonista, this small library can be useful: https://github.com/knorex/pystardog/wiki

    Installing ClioPatria triplestore on mac os
    http://www.michelepasin.org/blog/2014/10/27/getting-started-with-a-triplestore-on-mac-os-cliopatria/ – Mon, 27 Oct 2014 11:53:48 +0000

    ClioPatria is a “SWI-Prolog application that integrates the SWI-Prolog libraries for RDF and HTTP services into a ready to use (semantic) web server”. It is actively developed by the folks at the VU University of Amsterdam and is freely available online.

    While at a conference last week I saw a pretty cool demo (DIVE) which, I later learned, is powered by the ClioPatria triplestore. So I thought I’d give it a try and by doing so write a follow up on my recent post on installing OWLIM on Mac OS.

    1. Requirements

    OSX: Mavericks 10.9.5
    XCode: latest version available from Apple
    HOMEBREW: ruby -e "$(curl -fsSkL raw.github.com/mxcl/homebrew/go)"
    Prolog: build it from source using brew: brew install swi-prolog
    ClioPatria: git clone https://github.com/ClioPatria/ClioPatria.git

    2. Setting up

    After you have downloaded and unpacked the archive, all you need to do is start a new project using the ClioPatria script. In short, this is done by creating a new directory and telling ClioPatria to configure it as a project:

    [michele.pasin]:~/Documents/ClioPatriaProjects/firstproject> ../path/to/ClioPatria/configure

    A bunch of files are created, including a script run.pl which you can use later to run the server.

    3. Running ClioPatria

    I tried running run.pl as per the documentation, but that didn’t work:

    [michele.pasin]@Tartaruga:~/Documents/ClioPatriaProjects/firstproject>./run.pl 
    ./run.pl: line 3: :-: command not found
    ./run.pl: line 5: /Applications: is a directory
    ./run.pl: line 6: This: command not found
    ./run.pl: line 8: syntax error near unexpected token `('
    ./run.pl: line 8: `    % ./configure			(Unix)'
    

    According to a thread on Stack Overflow, the Prolog shebang line isn’t interpreted correctly by OS X, meaning that Mac OS doesn’t recognise the script as a Prolog program.

    That can be easily solved by calling the Prolog interpreter (swipl) explicitly:

    [michele.pasin]@Tartaruga:~/Documents/ClioPatriaProjects/firstproject>swipl run.pl 
    ERROR: /Applications/-Other-Apps/8-Languages-IDEs/ClioPatria/rdfql/sparql_runtime.pl:1246:14: Syntax error: Operator expected
    % run.pl compiled 1.64 sec, 25,789 clauses
    % Started ClioPatria server at port 3020
    % You may access the server at http://tartaruga.local:3020/
    % Loaded 0 graphs (0 triples) in 0.00 sec. (0% CPU = 0.00 sec.)
    Welcome to SWI-Prolog (Multi-threaded, 64 bits, Version 6.6.6)
    Copyright (c) 1990-2013 University of Amsterdam, VU Amsterdam
    SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software,
    and you are welcome to redistribute it under certain conditions.
    Please visit http://www.swi-prolog.org for details. 

    You should be able to access the server with your browser on port 3020 (ps: the previous command caused a syntax error too, but luckily that isn’t a show stopper).
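    If you prefer the terminal, here's a trivial sketch to check that the server is actually listening (adjust the port if you changed it):

    # should come back with an HTTP status line (e.g. 200, or a redirect to the home page)
    curl -I http://localhost:3020/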

    [Screenshot: the ClioPatria web interface]

    First impression:

    Super-easy to install, clean and intuitive user interface. I subsequently added a couple of RDF datasets and it all went very very smoothly.

    One cool feature is that ClioPatria has a built-in package management system, which allows you to easily install extensions to the application. For example, what follows quickly extends the UI with a couple of ‘intelligent’ SPARQL query interfaces (YASQE and Flint):

    [michele.pasin]@Tartaruga:/Applications/ClioPatria>sudo git submodule update --init web/yasqe web/yasr
    Password:
    
    
    [michele.pasin]@Tartaruga:/Applications/ClioPatria>sudo git submodule update --init web/FlintSparqlEditor
    

     

    4. Loading a big dataset

    As in my previous post, I’ve tried loading the NPG Articles dataset available at nature.com’s legacy linked data site data.nature.com. The dataset contains around 40M triples describing (at the metadata level) everything published by NPG and Scientific American from 1845 to the present day. The file size is ~6 gigs, so it’s not a huge dataset – still, big enough to pose a challenge to my macbook pro (8 gigs of RAM).

    I used the web UI (‘load local file’) to load the dataset, but I quickly ran into a ‘not enough memory’ error. I tried fiddling with the settings accessible via the web interface (Stack limit, Time limit), but that didn’t seem to do much.
    So I increased the memory allocated to the Prolog process (more info here), however this wasn’t enough either: after around 20 mins the whole thing crashed again with an out-of-memory error.

    [michele.pasin]@Tartaruga:~/Documents/ClioPatriaProjects/firstproject>swipl -G6g run.pl

    In the end I got in touch with the ClioPatria creators via the mailing list: in their (incredibly fast) reply they suggested loading the dataset manually from the server’s Prolog console. You do that simply by using the rdf_load command after starting the ClioPatria server (as shown above):

    ?- rdf_load('/Users/michele.pasin/Downloads/NPGcitationsGraph/articles.2012-07-16/articles.nq')
    |    .
    % Parsed "articles.nq" in 1149.71 sec; 0 triples
    

    That worked: the dataset was loaded in around 20 mins. Job done!

    However, when I tried to run some queries the application became very slow and ultimately unresponsive (especially with queries like retrieving all named classes from the graph). I tried restarting the triplestore, and realised that once you do that, ClioPatria begins by re-loading all previously created repositories – which, in the case of my 40M-triples repo, would take around 10-15 minutes.

    After restarting the server, queries were a bit faster, but in many cases still pretty slow on my 8G-RAM laptop.

     

    Conclusion:

    I am sure there are many more things which could be optimised; however, I’m no Prolog expert, nor could I figure out where to start just based on the online documentation. So I have kind of given up on using it for large datasets on my macbook for now.

    On the other hand, I really liked ClioPatria’s intuitive and simple UI, its ease of installation, and the fact that you can perform operations transparently and interactively via the Prolog console (assuming you know how to do that).

    All in all, ClioPatria seems to me a really good option if you want to get up and running quickly, e.g. in order to prototype linked data applications or explore small to medium-sized RDF datasets (10M triples or so, I guess). For bigger datasets, you’d better equip your Mac with a few gigs of extra RAM!

    5. Useful resources

    > Whitepaper with technical analysis

  • http://cliopatria.swi-prolog.org/help/whitepaper.html

    > Mailing list

  • http://mailman.few.vu.nl/mailman/listinfo/cliopatria-list

    Installing GraphDB (aka OWLIM) triplestore on mac os
    http://www.michelepasin.org/blog/2014/10/16/getting-started-with-a-triplestore-on-mac-os-graphdb-aka-owlim/ – Thu, 16 Oct 2014 19:05:38 +0000

    GraphDB (formerly called OWLIM) is an RDF triplestore which is used – among others – by large organisations like the BBC and the British Museum. I’ve recently installed the LITE release of this graph database on my mac, so what follows is a simple write-up of the steps that worked for me.

    Haven’t played much with the database yet, but all in all, the installation was much simpler than expected (ps: this old recipe on google code was very helpful in steering me in the right direction with the whole Tomcat/Java setup).

    1. Requirements

    OSX: Mavericks 10.9.5
    XCode: latest version available from Apple
    HOMEBREW: ruby -e "$(curl -fsSkL raw.github.com/mxcl/homebrew/go)"
    Tomcat7: brew install tomcat
    JAVA: available from Apple

    Finally – we obviously want to get a copy of OWLIM-Lite too: http://www.ontotext.com/owlim/downloads

    2. Setting up

    After you have downloaded and unpacked the archive, you must simply copy these two files:

    owlim-lite/sesame_owlim/openrdf-sesame.war
    owlim-lite/sesame_owlim/openrdf-workbench.war

    ..to the Tomcat webapps folder:

    /usr/local/Cellar/tomcat/7.0.29/libexec/webapps/
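    In practice that's just two copies – something along these lines, adjusting the Tomcat version number to whatever brew installed for you:

    # run from the folder where the OWLIM-Lite archive was unpacked
    cp owlim-lite/sesame_owlim/openrdf-sesame.war /usr/local/Cellar/tomcat/7.0.29/libexec/webapps/
    cp owlim-lite/sesame_owlim/openrdf-workbench.war /usr/local/Cellar/tomcat/7.0.29/libexec/webapps/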

    Essentially that’s because OWLIM-Lite is packaged as a storage and inference layer for the Sesame RDF framework, which runs here as a component within the Tomcat server (note: there are other ways to run OWLIM, but this one seemed the quickest).

    3. Starting Tomcat

    First I created a symbolic link in my ~/Library folder, so as to better manage new versions (as suggested here).

    sudo ln -s /usr/local/Cellar/tomcat/7.0.39 ~/Library/Tomcat

    Then in order to start/stop Tomcat it’s enough to use the catalina command:

    [michele.pasin]@here:~/Library/Tomcat/bin>./catalina start
    Using CATALINA_BASE:   /usr/local/Cellar/tomcat/7.0.39/libexec
    Using CATALINA_HOME:   /usr/local/Cellar/tomcat/7.0.39/libexec
    Using CATALINA_TMPDIR: /usr/local/Cellar/tomcat/7.0.39/libexec/temp
    Using JRE_HOME:        /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
    Using CLASSPATH:       /usr/local/Cellar/tomcat/7.0.39/libexec/bin/bootstrap.jar:/usr/local/Cellar/tomcat/7.0.39/libexec/bin/tomcat-juli.jar
    
    [michele.pasin]@here:~/Library/Tomcat/bin>./catalina stop
    Using CATALINA_BASE:   /usr/local/Cellar/tomcat/7.0.39/libexec
    Using CATALINA_HOME:   /usr/local/Cellar/tomcat/7.0.39/libexec
    Using CATALINA_TMPDIR: /usr/local/Cellar/tomcat/7.0.39/libexec/temp
    Using JRE_HOME:        /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
    Using CLASSPATH:       /usr/local/Cellar/tomcat/7.0.39/libexec/bin/bootstrap.jar:/usr/local/Cellar/tomcat/7.0.39/libexec/bin/tomcat-juli.jar
    

    Tip: Tomcat runs by default on port 8080. That can be changed pretty easily by modifying a parameter in server.xml, located in {Tomcat installation folder}/libexec/conf/ (more details here).
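    For example, something like this one-liner would switch the connector to port 8081 (just a sketch – it assumes the stock server.xml and the ~/Library/Tomcat symlink created above; back the file up first):

    # macOS sed needs the empty '' argument after -i; this edits the HTTP connector port in place
    sed -i '' 's/port="8080"/port="8081"/' ~/Library/Tomcat/libexec/conf/server.xml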

     

    4. Testing the Graph database

    Start a browser and go to the Workbench Web application using a URL of this form: http://localhost:8080/openrdf-workbench/ (substituting localhost and the 8080 port number as appropriate). You should see something like this:

    [Screenshot: the OpenRDF Sesame Workbench]

    After selecting a server, click ‘New repository’.

    Select ‘OWLIM-Lite’ from the drop-down and enter the repository ID and description. Then click ‘next’.

    Fill in the fields as required and click ‘create’.

    That’s it! A message should be displayed that gives details of the new repository and this should also appear in the repository list (click ‘repositories’ to see this).

    5. Loading a big dataset

    I’ve set out to load the NPG Articles dataset available at nature.com’s legacy linked data site data.nature.com.

    The dataset contains around 40M triples describing (at the metadata level) everything published by NPG and Scientific American from 1845 to the present day. The file size is ~6 gigs, so it’s not a huge dataset – still, big enough to pose a challenge to my macbook pro (8 gigs of RAM).

    First, I increased the memory allocated to the Tomcat application to 5G. It was enough to create a setenv.sh file in the ${tomcat-folder}/bin/ folder, containing this line:

    CATALINA_OPTS="$CATALINA_OPTS -server -Xms5g -Xmx5g"

    More details on Tomcat’s and Java memory issues are available here.

    Then I used OWLIM’s web interface to create a new graph repository and upload the dataset file into it (I previously downloaded a copy of the dataset to my computer so to work with local files only).

    It took around 10 minutes for the application to upload the file into the triplestore, and 2-3 minutes for OWLIM to process it – much, much faster than I expected. The only minor issue was the lack of notifications (in the UI) about what was going on. Not a big deal in my case, but with larger dataset uploads it might be a potential downer.

    Note: I used the web form to upload the dataset, but there are also ways to do that from the command line (which will probably result in even faster uploads).
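    For instance, the Sesame HTTP protocol lets you POST a data file straight to a repository's statements endpoint, roughly like this (a sketch only – the 'myRepo' repository name is hypothetical and the N-Quads content type is an assumption, so check both against your setup):

    # push the local dump into the repository created via the Workbench; replace 'myRepo' with your repository id
    curl -X POST -H "Content-Type: text/x-nquads" \
         --data-binary @articles.nq \
         http://localhost:8080/openrdf-sesame/repositories/myRepo/statements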

    6. Useful information

    > SPARQL endpoints

    All of your repositories also come with a handy SPARQL endpoint, which is available at a URL of this form: http://localhost:8080/openrdf-sesame/repositories/test1 (just change the last bit so that it matches your repository name).
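    Querying such an endpoint from the command line is straightforward too – e.g. with curl (again just a sketch, reusing the test1 repository name from the URL above):

    # the Sesame protocol takes the SPARQL query as a 'query' parameter on the repository URL
    curl -G -H "Accept: application/sparql-results+json" \
         --data-urlencode "query=SELECT * WHERE { ?s ?p ?o } LIMIT 5" \
         http://localhost:8080/openrdf-sesame/repositories/test1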

    > Official documentation

  • https://confluence.ontotext.com/display/GraphDB6

    > Ontotext’s Q&A forum

  • http://answers.ontotext.com

    RDF programming with AllegroCL and AllegroGraph
    http://www.michelepasin.org/blog/2011/04/14/allegro-cl-graph/ – Thu, 14 Apr 2011 14:19:46 +0000

    Allegro Common Lisp (wikipedia) is a commercial implementation of the Common Lisp programming language developed by Franz Inc. Allegro CL provides the full ANSI Common Lisp standard – but, more interestingly for me, it also provides a very comprehensive suite of tools for semantic web programming. So I decided to give it a go; what follows are just some notes on how to get started quickly.

    Franz Inc. offers a whole bunch of semantic technologies, including:

  • AllegroCL: a Common Lisp implementation (homepage | install | faq | documentation) described as the “most powerful dynamic object-oriented development system available today“. It runs on all major operating systems, and it includes a cross-platform GUI too (which is not always the case for Lisp implementations!).
  • AllegroGraph: (home | docs | install) this is a high-performance, persistent RDF triplestore. It allegedly can “scale to billions of triples while maintaining superior performance” and supports SPARQL, RDFS++, and Prolog reasoning from numerous client applications (python too, as discussed in a previous post).
  • AllegroCache: (home | docs | install) this is a dynamic object caching database system. It allows programmers to work directly with objects as if they were in memory while in fact the object data is always stored persistently. It supports a full transaction model with long and short transactions, and meets the classic ACID requirements for a reliable and robust database.
  • Gruff: (home | docs | install) is an rdf graphical browser that attempts to make data retrieval more pleasant and powerful with a variety of tools for laying out cyclical graphs, displaying tables of properties, managing queries, and building queries as visual diagrams (btw thanks to Matteo for mentioning Gruff to me!)
    Installing AllegroCL

    The good news is that if you follow the installation instructions for AllegroCL, this will also include AllegroGraph and AllegroCache. Long story short, I was able to get started with this environment surprisingly quickly. The installation on OSX involves only two steps (after filling out a form): getting the GTK+ framework (a cross-platform graphical toolkit) and then downloading the Lisp image. Double-click, install, et voilà – you’re done.

    Here’s what the Lisp IDE looks like:

    [Screenshot: the AllegroCL IDE]

    The IDE also includes an integrated patch-loading tool, which you should run straight away to get the latest versions of several libraries needed by AllegroCL (I wonder whether it’s that easy to install all the standard Lisp packages too..):

    [Screenshot: the AllegroCL patch-loading tool]

    Finally, in order to run AllegroGraph too it’s necessary to invoke an update command manually, which is easily done:

    CG-USER(3): (SYSTEM.UPDATE:INSTALL-ALLEGROGRAPH)
    Checking available AllegroGraph versions...
    Making temporary directory (/tmp/tempa18825120705a/)
    Retrieving agraph-3.3-acl8.2-macosx86.tgz into temporary directory
    Extracting tar archive /tmp/tempa18825120705a/agraph-3.3-acl8.2-macosx86.tgz
    Installing /Applications/AllegroCL/code/agraph.fasl
    Installing /Applications/AllegroCL/agraph/
    
    *********************
    
    To use the newly downloaded AllegroGraph you can load it by
    evaluating the following forms:
    
      (REQUIRE :AGRAPH)
    
    Ancillary files are located in ``/Applications//AllegroCL/agraph/''.
    

    Installing Gruff

    I also downloaded Gruff and ran it. Again, on OS X the whole process was very very straightforward. I loaded up an ontology stored in an RDF/XML file and here’s the result:

    [Screenshot: an RDF graph rendered in Gruff]

    Very very impressive – that’s my conclusion. Gruff offers plenty of features for viewing and also editing the RDF graph (check out the video tutorials on the homepage); with a little practice, I think it wouldn’t be that difficult to use it for creating RDF models by hand (e.g. as an alternative to tools like Protege). The main advantage in my view is that it’s highly interactive: being able to switch very quickly from a graph view to a tabular one is extremely practical when inspecting or creating an RDF model.

    Time to write some code

    Haven’t been writing Lisp in a while, but since we’ve got this far it would be a shame not to get our hands dirty a little, right?

    Tip: a nice fellow named Mark Watson has written a couple of books that show how to use AllegroGraph, and he’s making them available for free on his website (thanks!). Check them out: Practical Semantic Web and Linked Data Applications, Common Lisp Edition, and Practical Semantic Web and Linked Data Applications, Java, Scala, Clojure, and JRuby Edition.

    I had a quick look at the first book mentioned (“Practical Semantic Web and Linked Data Applications, Common Lisp Edition“), and tried to follow the examples presented. Here’s the result..

    First, let’s load up AllegroGraph from AllegroCL and set up a Lisp reader macro (named ‘!’) that makes it easier to enter URIs and literals:

    CG-USER(2): (REQUIRE :AGRAPH)
    ; Fast loading /Applications/AllegroCL/code/AGRAPH.fasl
    ;   Fast loading
    ;      /Applications/AllegroCL/code/acache-2.1.12.fasl
    AllegroCache version 2.1.12
    ;     Fast loading /Applications/AllegroCL/code/SAX.001
    ;;; Installing sax patch, version 1.
    ;       Fast loading from bundle code/ef-e-anynl.fasl.
    ;         Fast loading from bundle code/ef-e-crlf.fasl.
    ;         Fast loading from bundle code/ef-e-cr.fasl.
    ;   Fast loading from bundle code/streamp.fasl.
    ;   Fast loading /Applications/AllegroCL/code/ACLRPC.fasl
    Loaded patch file /Applications/AllegroCL/update/pim001.001.
    ;   Fast loading /Applications/AllegroCL/code/PROLOG.001
    ;;; Installing prolog patch, version 1.
    ;   Fast loading /Applications/AllegroCL/code/DATETIME.001
    ;;; Installing datetime patch, version 1.
    ;   Fast loading /Applications/AllegroCL/code/streamc.002
    ;;; Installing streamc patch, version 2.
    ;     Fast loading from bundle code/efft-utf8-base.fasl.
    ;     Fast loading from bundle code/efft-void.fasl.
    ;     Fast loading from bundle code/efft-latin1-base.fasl.
    ;   Fast loading from bundle code/streamm.fasl.
    ;   Fast loading from bundle code/ef-e-crcrlf.fasl.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio001.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio002.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio003.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio004.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio005.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio006.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio007.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio008.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio009.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio010.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio011.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio012.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio013.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio014.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio015.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio016.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio017.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio018.001.
    Loaded patch file /Applications/AllegroCL/update/agraph/3.3/pio019.001.
    AllegroGraph Lisp Edition 3.3 [built on February 17, 2010 13:53:35 GMT-0800]
    Copyright (c) 2005-2010 Franz Inc.  All Rights Reserved.
    AllegroGraph contains patent-pending technology.
    With patches: io001, io002, io003, io004, io005, io996, io007, io008, io009, io010, io011, io012, io013, io014, io015, io016, io017, io018, io019.
    T
    CG-USER(3): (in-package :db.agraph.user)
    #<The DB.AGRAPH.USER package>
    TRIPLE-STORE-USER(4): (enable-!-reader)
    #<Function READ-TOKEN>
    T
    TRIPLE-STORE-USER(5): (enable-print-decoded t)
    T
    TRIPLE-STORE-USER(6): (triple-store:display-namespaces)
    rdfs => http://www.w3.org/2000/01/rdf-schema#
    err => http://www.w3.org/2005/xqt-errors#
    fn => http://www.w3.org/2005/xpath-functions#
    rdf => http://www.w3.org/1999/02/22-rdf-syntax-ns#
    xs => http://www.w3.org/2001/XMLSchema#
    xsd => http://www.w3.org/2001/XMLSchema#
    owl => http://www.w3.org/2002/07/owl#
    TRIPLE-STORE-USER(9): !rdfs:class
    !rdfs:class
    

    Next, we want to create a local triplestore, register a dummy namespace and add a couple of triples to it. Finally, we dump the whole triplestore to a file.

    TRIPLE-STORE-USER(10): (triple-store:create-triple-store "~/tmp/rdfstore_1")
    #<DB.AGRAPH::TRIPLE-DB /Users/mac/tmp/rdfstore_1, open @ #x21dfc80a>
    TRIPLE-STORE-USER(11): (register-namespace "kb" "http://michelepasin.org/rdfs#")
    "http://michelepasin.org/rdfs#"
    TRIPLE-STORE-USER(12): (triple-store:display-namespaces)
    rdfs => http://www.w3.org/2000/01/rdf-schema#
    err => http://www.w3.org/2005/xqt-errors#
    fn => http://www.w3.org/2005/xpath-functions#
    rdf => http://www.w3.org/1999/02/22-rdf-syntax-ns#
    xs => http://www.w3.org/2001/XMLSchema#
    xsd => http://www.w3.org/2001/XMLSchema#
    owl => http://www.w3.org/2002/07/owl#
    kb => http://michelepasin.org/rdfs#
    TRIPLE-STORE-USER(13): (defvar *doc1* (resource "http://www.michelepasin.org/research/"))
    *DOC1*
    TRIPLE-STORE-USER(14): *doc1*
    !<http://www.michelepasin.org/research/>
    TRIPLE-STORE-USER(15): (triple-store:add-triple *doc1* !rdf:type !kb:article)
    1
    TRIPLE-STORE-USER(16): (triple-store:add-triple *doc1* !rdf:comment !"what a wonderful book")
    2
    TRIPLE-STORE-USER(17): (triple-store:get-triples-list)
    (< type article> < comment what a wonderful book>)
    NIL
    TRIPLE-STORE-USER(18): (print-triples (triple-store:get-triples-list))
    <http://www.michelepasin.org/research/> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://michelepasin.org/rdfs#article> .
    <http://www.michelepasin.org/research/> <http://www.w3.org/1999/02/22-rdf-syntax-ns#comment> "what a wonderful book" .
    TRIPLE-STORE-USER(19): (with-open-file (output "~/tmp/testoutput" :direction :output :if-does-not-exist :create)
    (print-triples (triple-store:get-triples-list) :stream output :format :ntriple))
    

    If we open the newly created “~/tmp/testoutput” file, here’s what we find:

    <http://www.michelepasin.org/research/>
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
            <http://michelepasin.org/rdfs#article> .
    <http://www.michelepasin.org/research/>
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#comment> 
            "what a wonderful book" .
    

    Finally, let’s use SPARQL to launch a (quite dumb) query that retrieves all triples from the triplestore:

    TRIPLE-STORE-USER(53): (sparql:run-sparql 
     " PREFIX kb: <http://www.michelepasin.org/research#> 
       SELECT ?article_uri ?pred ?obj WHERE {
       ?article_uri ?pred ?obj . 
       }"
     )
    <?xml version="1.0"?>
    <!-- Generated by AllegroGraph 3.3 -->
    <sparql xmlns="http://www.w3.org/2005/sparql-results#">
      <head>
        <variable name="article_uri"/>
        <variable name="pred"/>
        <variable name="obj"/>
      </head>
      <results>
        <result>
          <binding name="article_uri">
            <uri>http://www.michelepasin.org/research/</uri>
          </binding>
          <binding name="pred">
            <uri>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</uri>
          </binding>
          <binding name="obj">
            <uri>http://michelepasin.org/rdfs#article</uri>
          </binding>
        </result>
        <result>
          <binding name="article_uri">
            <uri>http://www.michelepasin.org/research/</uri>
          </binding>
          <binding name="pred">
            <uri>http://www.w3.org/1999/02/22-rdf-syntax-ns#comment</uri>
          </binding>
          <binding name="obj">
            <literal>what a wonderful book</literal>
          </binding>
        </result>
      </results>
    </sparql>
    T
    :SELECT
    (|?article_uri| |?pred| |?obj|)
    TRIPLE-STORE-USER(54):
    

    That’s all for now; this is enough to get started with Allegro-CL and the semantic web programming libraries it contains.

    Mind that I’ve only scratched the surface here: there’s lots and lots more that can be done with this environment. If you want to explore these topics further, make sure you have a look at Watson’s book, ’cause that’s a great (and free) resource, really!

     
