Imagined speech has attracted attention as a new paradigm for brain-machine interfaces because it can serve as an intuitive communication tool. However, previous studies have reported low classification performance, so its use in real-life settings is not yet feasible, and no suitable method for analyzing it has been established. Deep learning algorithms have recently been applied to this paradigm, but the small amount of available data limits the gain in classification performance. To tackle these issues, we propose an end-to-end framework built on a Siamese neural network encoder, which learns discriminant features by considering the distance between classes. Six imagined words (arriba (up), abajo (down), derecha (right), izquierda (left), adelante (forward), and atrás (backward)) were classified from raw electroencephalography (EEG) signals. We obtained a 6-class classification accuracy of 31.40 ± 2.73% for imagined speech, which significantly outperformed other methods. We attribute this result to the Siamese neural network, which increases the distance between dissimilar samples while decreasing the distance between similar samples, so that our method can learn discriminant features from a small dataset. The proposed framework could help improve imagined speech classification when only a small amount of data is available and support the implementation of an intuitive communication system.
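
The distance-based objective of a Siamese network can be illustrated with a minimal sketch. The abstract does not state which loss function the framework uses, so the code below assumes a standard contrastive loss on pairs of embeddings; the function names, the margin value, and the toy embeddings are hypothetical and stand in for the encoder outputs of two EEG trials.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed, not from the paper).

    Same-class pairs are penalized by their squared distance, which pulls
    them together. Different-class pairs are penalized only while they are
    closer than `margin`, which pushes them apart.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(margin - d, 0.0) ** 2


def batch_contrastive_loss(pairs, margin=1.0):
    """Mean contrastive loss over (emb_a, emb_b, same_class) triples."""
    losses = [contrastive_loss(a, b, same, margin) for a, b, same in pairs]
    return sum(losses) / len(losses)


# Toy pairs: hypothetical 2-D embeddings of imagined-word trials.
pairs = [
    ([0.0, 0.0], [3.0, 4.0], True),   # same word, far apart: penalized
    ([0.0, 0.0], [0.0, 0.0], False),  # different words, identical: penalized
    ([0.0, 0.0], [3.0, 4.0], False),  # different words, beyond margin: no loss
]
mean_loss = batch_contrastive_loss(pairs)
```

Because every trial can be paired with many others, a pairwise objective of this kind yields far more training examples than there are trials, which is one common argument for using Siamese networks on small datasets.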