Data-driven sequence labeling methods incorporating the long-range spatial variation of geological data for lithofacies sequence estimation

Gyeong Tae Park, Jina Jeong, Irina Emelyanova, Marina Pervukhina, Lionel Esteban, Seong Taek Yun

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)


The use of geophysical well-log data for interpreting the stratigraphic lithofacies sequence is cost-effective. In this study, several data-driven lithofacies sequence estimation models are developed, where long- and short-term memory (LSTM) and bidirectional LSTM (BLSTM) are applied to efficiently complement the long-range spatial variation of successive well-log and lithofacies measurements. During the development, the models using the autoregressive (AR) input variables of lithofacies are designed to incorporate the lithofacies sequence pattern into the estimation. The performances of the proposed models are comparatively validated with an artificial deep neural network (DNN)-based model that does not consider long-range variation. Accordingly, a total of six estimation models are examined: DNN, AR-DNN, LSTM, AR-LSTM, BLSTM, and AR-BLSTM. For model implementation, synthetic data and actual data acquired from the Satyr-5 well in Western Australia are used. For the synthetic data, the results indicate that the incorporation of nonstationary statistical information improves the performance of BLSTM-based models. In addition, AR-input information is effective with respect to the estimation of the vertical thickness of lithofacies. The advantage of using AR inputs can also be observed for actual data, where AR-based models perform significantly better than the other models. Quantitatively, the fitness of DNN-, LSTM-, and BLSTM-based models is 81.79 %, 84.53 %, and 85.08 %, respectively, whereas that of AR-DNN-, AR-LSTM-, and AR-BLSTM-based models is 85.43 %, 85.72 %, and 86.56 %, respectively. The proposed models are expected to be useful for interpreting the heterogeneity of lithofacies distribution in a cost-effective and computationally efficient way. In particular, BLSTM-based models are widely applicable because they perform well regardless of the spatial statistics of lithofacies sequences.
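As an illustration of the autoregressive (AR) input idea described in the abstract, the following is a minimal sketch of how lagged lithofacies labels might be appended to the well-log features at each depth sample. It is not the authors' implementation: the function name build_ar_inputs, the number of lags, and the padding value used where no shallower sample exists are all assumptions made for this example. How the lagged labels are obtained at prediction time (for instance, from earlier predictions) is not stated in the abstract and is left out here.

```python
import numpy as np


def build_ar_inputs(well_logs, facies, n_lags=2, pad_label=-1):
    """Append the previous n_lags facies labels to each sample's well-log features.

    Hypothetical helper for illustration only.
    well_logs: array of shape (n_samples, n_features), ordered by depth.
    facies:    integer facies codes of shape (n_samples,), same ordering.
    pad_label: assumed placeholder for samples with no earlier label.
    """
    well_logs = np.asarray(well_logs, dtype=float)
    facies = np.asarray(facies)
    n_samples = well_logs.shape[0]

    # One column per lag, initialised with the padding value.
    lagged = np.full((n_samples, n_lags), pad_label, dtype=float)
    for k in range(1, n_lags + 1):
        # Sample i receives the facies label observed k samples earlier.
        lagged[k:, k - 1] = facies[: max(n_samples - k, 0)]

    # Final input: original well-log features followed by the lagged labels.
    return np.hstack([well_logs, lagged])


if __name__ == "__main__":
    logs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    labels = [0, 1, 1]
    print(build_ar_inputs(logs, labels, n_lags=2))
```

The resulting matrix could be fed to any of the model families named in the abstract (DNN, LSTM, or BLSTM); for the recurrent models the rows would additionally need to be grouped into depth-ordered sequences.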

Original language: English
Article number: 109345
Journal: Journal of Petroleum Science and Engineering
Publication status: Published - 2022 Jan

Keywords

  • A long-range spatial information
  • Autoregressive lithofacies
  • Bidirectional long- and short-term memory
  • Geophysical well-log data
  • Lithofacies sequence estimation
  • Long- and short-term memory

ASJC Scopus subject areas

  • Fuel Technology
  • Geotechnical Engineering and Engineering Geology

