TY - GEN
T1 - SPIDER
T2 - ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
AU - Choi, Hyunsik
AU - Son, Jihoon
AU - Cho, Yonghyun
AU - Sung, Min Kyoung
AU - Chung, Yon Dohn
PY - 2009
Y1 - 2009
N2 - RDF is a data model for representing labeled directed graphs, and it is an important building block of the Semantic Web. Due to its flexibility and applicability, RDF has been used in applications such as the Semantic Web, bioinformatics, and social networks. In these applications, large-scale graph datasets are very common. However, existing techniques do not manage them effectively. In this paper, we present SPIDER, a scalable, efficient query processing system for RDF data, based on the well-known parallel/distributed computing framework Hadoop. SPIDER consists of two major modules: (1) the graph data loader and (2) the graph query processor. The loader analyzes and dissects the RDF data and places parts of the data over multiple servers. The query processor parses the user query and distributes subqueries to cluster nodes. The results of the subqueries from multiple servers are then gathered (and refined if necessary) and delivered to the user. Both modules utilize the MapReduce framework of Hadoop. In addition, our system supports some features of the SPARQL query language. This prototype will serve as a foundation for developing real applications with large-scale RDF graph data.
AB - RDF is a data model for representing labeled directed graphs, and it is an important building block of the Semantic Web. Due to its flexibility and applicability, RDF has been used in applications such as the Semantic Web, bioinformatics, and social networks. In these applications, large-scale graph datasets are very common. However, existing techniques do not manage them effectively. In this paper, we present SPIDER, a scalable, efficient query processing system for RDF data, based on the well-known parallel/distributed computing framework Hadoop. SPIDER consists of two major modules: (1) the graph data loader and (2) the graph query processor. The loader analyzes and dissects the RDF data and places parts of the data over multiple servers. The query processor parses the user query and distributes subqueries to cluster nodes. The results of the subqueries from multiple servers are then gathered (and refined if necessary) and delivered to the user. Both modules utilize the MapReduce framework of Hadoop. In addition, our system supports some features of the SPARQL query language. This prototype will serve as a foundation for developing real applications with large-scale RDF graph data.
KW - Distributed
KW - RDF
KW - Semantic web
KW - Triple store
UR - http://www.scopus.com/inward/record.url?scp=74549174073&partnerID=8YFLogxK
U2 - 10.1145/1645953.1646315
DO - 10.1145/1645953.1646315
M3 - Conference contribution
AN - SCOPUS:74549174073
SN - 9781605585123
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2087
EP - 2088
BT - ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
Y2 - 2 November 2009 through 6 November 2009
ER -