Parallel huge matrix multiplication on a cluster with GPGPU accelerators

Seungyo Ryu, Dong Seung Kim

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    4 Citations (Scopus)

    Abstract

    We design a parallel algorithm for multiplying huge matrices on a cluster of GPU nodes. Because the input matrices are too large to fit in memory, the algorithm repeatedly loads partial matrix data from disk into the GPU buffer, computes on it, and stores the results back to disk. The key to achieving the best speedup is not only to run the GPU at full performance but also to reduce the overhead of moving data between disk and the GPU buffer. We devise an efficient way to lower the latency of supplying matching pairs of partial matrices to the GPU buffer, and we optimize data partitioning, distribution, and disk access through pipelining. Experimental results show that our algorithm outperforms a generic algorithm, reducing computing time by 45%. The algorithm's scalability also improves with more GPU nodes.
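The load, compute, store cycle described in the abstract can be illustrated with a small single-node sketch. This is not the authors' implementation: the abstract does not give their partitioning or pipelining scheme, so the square tiling, the one-tile-pair lookahead, and all function names here (`read_tile`, `load_pair`, `multiply_add`, `out_of_core_matmul`) are assumptions made for illustration. A plain Python loop stands in for the GPU kernel, and a background thread stands in for the disk-to-buffer transfer stage.

```python
import os
import tempfile
from array import array
from concurrent.futures import ThreadPoolExecutor

ITEM = array("d").itemsize  # bytes per stored double


def write_matrix(path, rows):
    """Store a matrix on disk in row-major order, one row at a time."""
    with open(path, "wb") as f:
        for row in rows:
            array("d", row).tofile(f)


def read_tile(path, n, r0, c0, t):
    """Read the t x t tile with top-left corner (r0, c0) from an n x n file."""
    tile = []
    with open(path, "rb") as f:
        for r in range(r0, r0 + t):
            f.seek((r * n + c0) * ITEM)
            row = array("d")
            row.fromfile(f, t)
            tile.append(row)
    return tile


def write_tile(path, n, r0, c0, tile):
    """Write a finished tile back into its place in the n x n result file."""
    with open(path, "r+b") as f:
        for r, row in enumerate(tile):
            f.seek(((r0 + r) * n + c0) * ITEM)
            array("d", row).tofile(f)


def load_pair(a_path, b_path, n, i, j, k, t):
    """Fetch the matching pair A[i, k] and B[k, j] for one partial product."""
    return (read_tile(a_path, n, i * t, k * t, t),
            read_tile(b_path, n, k * t, j * t, t))


def multiply_add(acc, a_tile, b_tile, t):
    """CPU stand-in for the GPU kernel: acc += a_tile @ b_tile."""
    for r in range(t):
        for c in range(t):
            acc[r][c] += sum(a_tile[r][x] * b_tile[x][c] for x in range(t))


def out_of_core_matmul(a_path, b_path, c_path, n, t):
    """Compute C = A @ B on disk, keeping only a few tiles in memory at once."""
    assert n % t == 0, "this sketch assumes the tile size divides n"
    nt = n // t
    write_matrix(c_path, ([0.0] * n for _ in range(n)))  # pre-size the result
    with ThreadPoolExecutor(max_workers=1) as loader:
        for i in range(nt):
            for j in range(nt):
                acc = [[0.0] * t for _ in range(t)]
                pending = loader.submit(load_pair, a_path, b_path, n, i, j, 0, t)
                for k in range(nt):
                    a_tile, b_tile = pending.result()
                    if k + 1 < nt:
                        # Start loading the next pair while this one is multiplied.
                        pending = loader.submit(
                            load_pair, a_path, b_path, n, i, j, k + 1, t)
                    multiply_add(acc, a_tile, b_tile, t)
                write_tile(c_path, n, i * t, j * t, acc)


def demo(n=6, t=2):
    """Check the tiled result against a direct in-memory product."""
    a = [[float(r * n + c) for c in range(n)] for r in range(n)]
    b = [[float((r + 2 * c) % 5) for c in range(n)] for r in range(n)]
    expected = [[sum(a[r][x] * b[x][c] for x in range(n)) for c in range(n)]
                for r in range(n)]
    with tempfile.TemporaryDirectory() as d:
        paths = [os.path.join(d, name) for name in ("a.bin", "b.bin", "c.bin")]
        write_matrix(paths[0], a)
        write_matrix(paths[1], b)
        out_of_core_matmul(*paths, n, t)
        got = [list(row) for row in read_tile(paths[2], n, 0, 0, n)]
    return got == expected
```

Calling `demo()` compares the on-disk result with a direct product of the same small test matrices. In a real cluster setting, the tile loop would presumably be divided among MPI ranks and `multiply_add` replaced by a GPU GEMM call; those parts are omitted because the abstract does not specify them.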

    Original language: English
    Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 877-882
    Number of pages: 6
    ISBN (Print): 9781538655559
    Publication status: Published - 2018 Aug 3
    Event: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 - Vancouver, Canada
    Duration: 2018 May 21 to 2018 May 25

    Other

    Other: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
    Country/Territory: Canada
    City: Vancouver
    Period: 18/5/21 to 18/5/25

    Keywords

    • GPU computing
    • Matrix multiplication
    • MPI
    • Parallel computing

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Computer Networks and Communications
    • Hardware and Architecture
    • Information Systems and Management
