BL-LDA: Bringing bigram to supervised topic model

Youngsun Park, Md Hijbul Alam, Woo Jong Ryu, Sang-Geun Lee

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    2 Citations (Scopus)

    Abstract

    With the increasing amount of data being published on the Web, it is difficult to analyze its content within a short time. Topic modeling techniques can summarize textual data that contains several topics. Both labels (such as categories or tags) and word co-occurrence play a significant role in understanding textual data. However, many conventional topic modeling techniques are limited by the bag-of-words assumption, which ignores word order. In this paper, we develop a probabilistic model called Bigram Labeled Latent Dirichlet Allocation (BL-LDA) to address this limitation. The proposed BL-LDA incorporates bigrams into the Labeled LDA (L-LDA) technique. Extensive experiments on Yelp data show that the proposed scheme is more accurate than L-LDA.

    Original language: English
    Title of host publication: Proceedings - 2015 International Conference on Computational Science and Computational Intelligence, CSCI 2015
    Editors: Quoc-Nam Tran, Leonidas Deligiannidis, Hamid R. Arabnia
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 83-88
    Number of pages: 6
    ISBN (Electronic): 9781467397957
    Publication status: Published - 2016 Mar 2
    Event: International Conference on Computational Science and Computational Intelligence, CSCI 2015 - Las Vegas, United States
    Duration: 2015 Dec 7 – 2015 Dec 9

    Publication series

    Name: Proceedings - 2015 International Conference on Computational Science and Computational Intelligence, CSCI 2015

    Other

    Other: International Conference on Computational Science and Computational Intelligence, CSCI 2015
    Country/Territory: United States
    City: Las Vegas
    Period: 15/12/7 – 15/12/9

    Bibliographical note

    Publisher Copyright:
    © 2015 IEEE.

    Keywords

    • Data Analysis
    • Data Mining
    • Text Classification
    • Topic Modeling

    ASJC Scopus subject areas

    • Computational Theory and Mathematics
    • Artificial Intelligence
    • Computer Networks and Communications
    • Hardware and Architecture
    • Signal Processing
