Abstract
Patents grant inventors exclusive rights to their inventions by protecting their intellectual property. However, analyzing patent documents generally requires knowledge of various fields, considerable human labor, and expertise. Recent studies that aim to alleviate this problem analyze only the claims and abstract, neglecting the description, which contains the essential technical core. Moreover, few studies use a deep learning approach to handle the entire patent analysis process, including preprocessing, summarization, and key-phrase generation. Therefore, we propose a novel multi-stage deep learning framework that aids patent analysis by using the description part of the patent rather than the abstract or claims. The framework comprises two stages: key-sentence extraction and key-phrase generation. Both stages are based on T5, a transformer-based architecture that uses a text-to-text approach. To further improve the framework's performance, we employed two key techniques: i) post-training the model on a patent-related raw corpus to improve its comprehension of the patent domain, and ii) using the TextRank algorithm for efficient training based on the priority score of each sentence. We verified that the framework's key-phrase generation method achieves higher performance in both superficial and semantic evaluation than other extraction methods. In addition, we demonstrated the validity and effectiveness of our methods through quantitative and qualitative analysis, showing their practical functionality. We also make a practical contribution to patent analysis by releasing the framework as a demo system.
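The abstract does not say how the TextRank priority scores are computed, so the following is only a minimal sketch of the generic TextRank sentence-ranking algorithm, not the authors' implementation. The function names (`tokenize`, `similarity`, `textrank_scores`, `top_sentences`), the damping factor of 0.85, and the fixed iteration count are assumptions chosen for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Word-overlap similarity, normalized by the log of both sentence lengths."""
    if len(a) < 2 or len(b) < 2:
        # Skip near-empty sentences; two one-word sentences would
        # otherwise give a zero denominator.
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank_scores(sentences, damping=0.85, iterations=50):
    """Return one priority score per sentence via weighted PageRank."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Symmetric weight matrix with no self-loops.
    w = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def top_sentences(sentences, k=2):
    """Pick the k highest-scoring sentences, kept in document order."""
    scores = textrank_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

In a pipeline like the one described, such scores could be used to decide which description sentences are passed on to the key-sentence extraction stage; how the paper actually uses them during training is not specified in this record.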
Original language | English |
---|---|
Pages (from-to) | 59205-59218 |
Number of pages | 14 |
Journal | IEEE Access |
Volume | 10 |
Publication status | Published - 2022 |
Bibliographical note
Funding Information: This work was supported in part by the Ministry of Science and ICT (MSIT), South Korea, under the Information Technology Research Center (ITRC) Support Program, supervised by the Institute for Information and Communications Technology Planning and Evaluation (IITP), under Grant IITP-2018-0-01405; in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) Grant through the Korea Government (MSIT), a Neural-Symbolic Model for Knowledge Acquisition and Inference Techniques, under Grant 2020-0-00368; and in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF), Ministry of Education, under Grant NRF-2021R1A6A1A03045425.
Publisher Copyright:
© 2013 IEEE.
Keywords
- Deep learning
- Key-sentence extraction
- Keyword extraction
- Patent
- Patent analysis
- Post training
ASJC Scopus subject areas
- Computer Science (all)
- Materials Science (all)
- Engineering (all)
- Electrical and Electronic Engineering