Reusing of information constructed in HTML documents: A conversion of HTML into OWL

Hoon Hwangbo, Hongchul Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Efforts to build a knowledge-based web are exemplified by the Semantic Web. HTML, however, is not well suited as a language for expressing ontologies or structured information. Because an enormous amount of information already exists in HTML documents, it is rational to reuse that data. Previous studies do not suffice for converting HTML into OWL in general, because they focus mainly on structured data (table tags) and offer only simple demonstrations. In addition, GRDDL, a W3C Recommendation, requires an additional script for each conversion, and its output format is RDF, which has some restrictions. This paper proposes a conversion in three steps: (1) extracting information, (2) acquiring triples, and (3) constructing an ontology. Information is of two types: text-formed and non-text-formed. Accordingly, there are two kinds of tags: those that contain only text-formed information, and those that contain both text-formed and non-text-formed information. We classify tags into categories according to their type and define rules for each category. Applying those rules yields triples, from which the ontology is finally constructed.
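The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule tables `TAG_RULES` and `ATTR_RULES`, the predicate names, the class `TripleExtractor`, the function `to_owl_turtle`, and the `http://example.org/doc#` namespace are all hypothetical, chosen only to show how tag-specific rules might turn text-formed content (character data) and non-text-formed content (attribute values such as `src`) into triples, and how those triples might then be written out with OWL vocabulary in Turtle syntax.

```python
from html.parser import HTMLParser

# Hypothetical rules (not from the paper): tags whose text content becomes a triple.
TAG_RULES = {"title": "hasTitle", "h1": "hasHeading", "p": "hasParagraph"}
# Hypothetical rules: tags carrying non-text information in an attribute.
ATTR_RULES = {"img": ("src", "hasImage"), "a": ("href", "linksTo")}


class TripleExtractor(HTMLParser):
    """Steps 1 and 2: extract information and emit (subject, predicate, object)."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject
        self.stack = []    # open tags that have a text rule
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag in ATTR_RULES:
            attr_name, predicate = ATTR_RULES[tag]
            value = dict(attrs).get(attr_name)
            if value:
                self.triples.append((self.subject, predicate, value))
        if tag in TAG_RULES:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            predicate = TAG_RULES[self.stack[-1]]
            self.triples.append((self.subject, predicate, text))


def to_owl_turtle(triples, base="http://example.org/doc#"):
    """Step 3: serialize triples as Turtle using OWL vocabulary.

    Simplification: every object is written as a plain string literal.
    """
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        f"@prefix : <{base}> .",
        "",
    ]
    for predicate in sorted({p for _, p, _ in triples}):
        lines.append(f":{predicate} a owl:DatatypeProperty .")
    for subject in sorted({s for s, _, _ in triples}):
        lines.append(f":{subject} a owl:NamedIndividual .")
    for s, p, o in triples:
        literal = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        lines.append(f':{s} :{p} "{literal}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        "<html><head><title>Demo</title></head><body>"
        '<h1>Hello</h1><p>See <a href="http://example.org/x">this</a></p>'
        '<img src="pic.png"></body></html>'
    )
    extractor = TripleExtractor("page1")
    extractor.feed(sample)
    print(to_owl_turtle(extractor.triples))
```

Keeping the rules in lookup tables separates the tag classification from the traversal, so a new tag category could be supported by adding an entry rather than changing the parser.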

Original language: English
Title of host publication: 2008 International Conference on Control, Automation and Systems, ICCAS 2008
Pages: 871-875
Number of pages: 5
Publication status: Published - 2008
Event: 2008 International Conference on Control, Automation and Systems, ICCAS 2008 - Seoul, Korea, Republic of
Duration: 2008 Oct 14 → 2008 Oct 17

Publication series

Name: 2008 International Conference on Control, Automation and Systems, ICCAS 2008

Other

Other: 2008 International Conference on Control, Automation and Systems, ICCAS 2008
Country/Territory: Korea, Republic of
City: Seoul
Period: 08/10/14 → 08/10/17

Keywords

  • Analyzing system of English grammar
  • Conversion
  • Data extraction
  • HTML
  • OWL
  • Reusing information

ASJC Scopus subject areas

  • Control and Systems Engineering
