ConceptVector: Text Visual Analytics via Interactive Lexicon Building Using Word Embedding

Deokgun Park, Seungyeon Kim, Jurim Lee, Jaegul Choo, Nicholas Diakopoulos, Niklas Elmqvist

    Research output: Contribution to journal › Article › peer-review

    72 Citations (Scopus)

    Abstract

    Central to many text analysis methods is the notion of a concept: a set of semantically related keywords characterizing a specific object, phenomenon, or theme. Advances in word embedding allow building a concept from a small set of seed terms. However, naive application of such techniques may result in false positive errors because of the polysemy of natural language. To mitigate this problem, we present a visual analytics system called ConceptVector that guides a user in building such concepts and then using them to analyze documents. Document-analysis case studies with real-world datasets demonstrate the fine-grained analysis provided by ConceptVector. To support the elaborate modeling of concepts, we introduce a bipolar concept model and support for specifying irrelevant words. We validate the interactive lexicon building interface by a user study and expert reviews. Quantitative evaluation shows that the bipolar lexicon generated with our methods is comparable to human-generated ones.
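The idea in the abstract, growing a concept from seed terms while guarding against polysemy with a bipolar model and irrelevant words, can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the ConceptVector implementation: the toy 3-dimensional embeddings, the function names (`bipolar_score`, `recommend`), and the scoring rule (mean cosine similarity to the positive pole minus the negative pole, zeroed when a word sits closer to the irrelevant set) are all hypothetical.

```python
import math

# Toy 3-d embeddings, invented for illustration only (not from the paper).
EMBEDDINGS = {
    "happy":  (0.9, 0.1, 0.0),
    "joyful": (0.8, 0.2, 0.1),
    "glad":   (0.85, 0.15, 0.05),
    "sad":    (-0.9, 0.1, 0.0),
    "gloomy": (-0.8, 0.2, 0.1),
    "blue":   (-0.3, 0.1, 0.9),  # polysemous: a mood or a colour
    "green":  (0.0, 0.0, 1.0),
    "table":  (0.0, 1.0, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_similarity(word, seeds):
    """Average cosine similarity of `word` to a list of seed words."""
    if not seeds:
        return 0.0
    return sum(cosine(EMBEDDINGS[word], EMBEDDINGS[s]) for s in seeds) / len(seeds)


def bipolar_score(word, positive, negative, irrelevant):
    """Hypothetical scoring rule: > 0 leans positive, < 0 leans negative.

    A word that is closer to the irrelevant set than to either pole
    is scored 0.0, i.e. treated as outside the concept.
    """
    pos = mean_similarity(word, positive)
    neg = mean_similarity(word, negative)
    irr = mean_similarity(word, irrelevant)
    if irrelevant and irr > max(pos, neg):
        return 0.0
    return pos - neg


def recommend(positive, negative, irrelevant, top_k=3):
    """Rank non-seed vocabulary words by the magnitude of their bipolar score."""
    seeds = set(positive) | set(negative) | set(irrelevant)
    scored = [
        (w, bipolar_score(w, positive, negative, irrelevant))
        for w in EMBEDDINGS
        if w not in seeds
    ]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)[:top_k]


if __name__ == "__main__":
    for word, score in recommend(["happy"], ["sad"], ["green"]):
        print(f"{word}: {score:+.3f}")
```

In this toy vocabulary, "blue" leans toward the negative pole when no irrelevant words are given, and marking "green" as irrelevant removes it from the concept. That is the kind of false positive the abstract attributes to polysemy.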

    Original language: English
    Article number: 8023823
    Pages (from-to): 361-370
    Number of pages: 10
    Journal: IEEE Transactions on Visualization and Computer Graphics
    Volume: 24
    Issue number: 1
    Publication status: Published - Jan 2018

    Bibliographical note

    Publisher Copyright:
    © 1995-2012 IEEE.

    Keywords

    • Text analytics
    • concepts
    • text classification
    • text summarization
    • visual analytics
    • word embedding

    ASJC Scopus subject areas

    • Software
    • Signal Processing
    • Computer Vision and Pattern Recognition
    • Computer Graphics and Computer-Aided Design
