Thursday, November 16, 2006

Building and Managing a Massive Triple Store: An Experience Report

The aim of the Ingenta MetaStore project is to build a flexible and scalable repository for the storage of bibliographic metadata spanning 17 million articles and 20,000 publications.
The repository replaces several existing data stores and will act as a focal point for integration of a number of existing applications and future projects. Scalability, replication and robustness were important considerations in the repository design.
After introducing the benefits of using RDF as the data model for this repository, the paper will focus on the practical challenges involved in creating and managing a very large triple store.
The repository currently contains over 200 million triples from a range of vocabularies including FOAF, Dublin Core and PRISM.
The challenges faced range from schema design and data loading to SPARQL query performance. Load testing of the repository provided some insights into the tuning of SPARQL queries.
The paper will introduce the solutions developed to meet these challenges with the goal of helping others seeking to deploy a large triple store in a production environment. The paper will also suggest some avenues for further research and development.
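
For readers new to the data model, here is a minimal sketch of how one bibliographic record might look as triples that mix the three vocabularies named above. It is not the MetaStore schema: the example.org URIs, the literal values, the helper functions `match` and `author_names`, and the choice of the PRISM 1.2 namespace are all assumptions made for illustration, and a real store would use an RDF library and SPARQL rather than a Python list.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Vocabulary namespaces (the PRISM version is an assumption).
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"
PRISM = "http://prismstandard.org/namespaces/1.2/basic/"

# Hypothetical resources, invented for this sketch.
ARTICLE = "http://example.org/article/1"
AUTHOR = "http://example.org/person/1"

TRIPLES: List[Triple] = [
    (ARTICLE, DC + "title", "An Example Article"),
    (ARTICLE, DC + "creator", AUTHOR),
    (ARTICLE, PRISM + "publicationName", "Example Journal"),
    (ARTICLE, PRISM + "volume", "12"),
    (AUTHOR, FOAF + "name", "A. N. Author"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def author_names(triples: List[Triple], article: str) -> List[str]:
    """Join two patterns: article -> dc:creator -> foaf:name."""
    names = []
    for _, _, person in match(triples, s=article, p=DC + "creator"):
        for _, _, name in match(triples, s=person, p=FOAF + "name"):
            names.append(name)
    return names


if __name__ == "__main__":
    print(author_names(TRIPLES, ARTICLE))
```

The two-step lookup in `author_names` is the same shape as a SPARQL query with two triple patterns sharing a variable. This naive version scans the whole list for every pattern, which is workable for five triples but not for a store of this size.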

http://xtech06.usefulinc.com/schedule/paper/18
