TY - GEN
T1 - Reusing of information constructed in HTML documents
T2 - 2008 International Conference on Control, Automation and Systems, ICCAS 2008
AU - Hwangbo, Hoon
AU - Lee, Hongchul
PY - 2008
Y1 - 2008
N2 - There have been efforts to build a knowledge-based web, represented by the Semantic Web. In this context, however, HTML is not appropriate as a language for expressing ontologies and the structure of information. Because a vast amount of information already exists in HTML, it seems rational to reuse that data. Previous studies are insufficient for converting HTML into OWL broadly, because they focus mainly on structured data (table tags) and offer only simple implementations. In addition, GRDDL, a W3C recommendation, requires an additional script for a conversion, and its output format is RDF, which has some restrictions. This paper proposes a three-step conversion: (1) extracting information, (2) acquiring triples, and (3) constructing an ontology. There are two types of information: text-formed and non-text-formed. In addition, there are two kinds of tags: those that contain only text-formed information and those that contain both text-formed and non-text-formed information. Depending on the type of tag, we classify tags into categories and set rules for each category. Using those rules, we generate triples and finally construct an ontology.
KW - Analyzing system of English grammar
KW - Conversion
KW - Data extraction
KW - HTML
KW - OWL
KW - Reusing information
UR - http://www.scopus.com/inward/record.url?scp=58149101999&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58149101999&partnerID=8YFLogxK
U2 - 10.1109/ICCAS.2008.4694654
DO - 10.1109/ICCAS.2008.4694654
M3 - Conference contribution
AN - SCOPUS:58149101999
SN - 9788995003893
T3 - 2008 International Conference on Control, Automation and Systems, ICCAS 2008
SP - 871
EP - 875
BT - 2008 International Conference on Control, Automation and Systems, ICCAS 2008
Y2 - 14 October 2008 through 17 October 2008
ER -