TY - GEN
T1 - Computer Code Representation through Natural Language Processing for fMRI Data Analysis
AU - Kim, Jaeyoon
AU - O'Reilly, Una-May
AU - Seok, Junhee
N1 - Funding Information:
This research was supported by the MOTIE (Ministry of Trade, Industry, and Energy) in Korea, under the Fostering Global Talents for Innovative Growth Program (P0008749) supervised by the Korea Institute for Advancement of Technology (KIAT), and by the National Research Foundation of Korea (NRF-2019R1A2C1084778).
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Many studies in cognitive neuroscience attempt to analyze the relationship between functional magnetic resonance imaging (fMRI) data and representations of text stimuli. Because program code is an exemplary text stimulus, appropriate code representations for neuroscience research have been actively studied. In this paper, we focus on representing Python code for fMRI research through natural language processing (NLP) techniques. We collect 7,893 Python programs covering 23 question types from a code competition website and build three different models based on sequence-to-sequence, bag-of-words, and bigram representations. The model is evaluated on its ability to classify the question types. Finally, the model is applied to classify 108 Python programs that were used in a cognitive neuroscience fMRI study. We look forward to analyzing fMRI data with the proposed code representation to understand how the human brain is activated.
AB - Many studies in cognitive neuroscience attempt to analyze the relationship between functional magnetic resonance imaging (fMRI) data and representations of text stimuli. Because program code is an exemplary text stimulus, appropriate code representations for neuroscience research have been actively studied. In this paper, we focus on representing Python code for fMRI research through natural language processing (NLP) techniques. We collect 7,893 Python programs covering 23 question types from a code competition website and build three different models based on sequence-to-sequence, bag-of-words, and bigram representations. The model is evaluated on its ability to classify the question types. Finally, the model is applied to classify 108 Python programs that were used in a cognitive neuroscience fMRI study. We look forward to analyzing fMRI data with the proposed code representation to understand how the human brain is activated.
KW - bag-of-words
KW - bigram
KW - cognitive neuroscience
KW - computer code representation
KW - functional magnetic resonance imaging
KW - natural language processing
KW - sequence-to-sequence
UR - http://www.scopus.com/inward/record.url?scp=85127622934&partnerID=8YFLogxK
U2 - 10.1109/ICAIIC54071.2022.9722644
DO - 10.1109/ICAIIC54071.2022.9722644
M3 - Conference contribution
AN - SCOPUS:85127622934
T3 - 4th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2022 - Proceedings
SP - 184
EP - 187
BT - 4th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2022
Y2 - 21 February 2022 through 24 February 2022
ER -