Integrated survival model for predicting patent litigation hazard

Youngho Kim, Sangsung Park, Junseok Lee, Dongsik Jang, Jiho Kang

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)


Patent litigation occurs when a company’s product or service violates the scope of another company’s patent rights. When it occurs, companies suffer a disruption to the sales of their products and services, which hinders the sustainability of their business activities. For this reason, companies have established and analyzed wide-ranging strategies to prevent patent litigation. Of those, statistical and machine learning-based quantitative methods using patent big data have several advantages, such as reduced cost and objective results. Existing quantitative methods analyze patent information and litigation based on the time of data collection. However, the values of patents and their litigation hazards change over time. In addition, the existing methods do not take into account censored data; that is, patents that may result in litigation after the data are collected. In this paper, to solve this problem, we propose an integrated survival model that considers censored data and predicts patent litigation hazards over time. The proposed model is a non-parametric survival analysis method based on a random survival forest. It uses pre-trained word2vec and clustering to effectively reflect the technology fields as well as the quantitative information of the patent. Word2vec is a natural language processing technique that enables the use of patent text information. To examine the practicality of the integrated survival model, an experiment is conducted with patent big data related to sensor semiconductors based on AI technology applicable to robotics. In the experiment, the litigation hazard was found to arise 150 months after the patent application and to increase rapidly from 200 months. Furthermore, the proposed model showed better predictive performance than other survival analysis models. The proposed model could be used by potential defendants to protect their patents.
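To make the role of censored data concrete, the sketch below computes a Nelson-Aalen cumulative hazard, the estimator that random survival forests commonly store in their terminal nodes. It is a minimal illustration, not the authors' implementation: the function name `nelson_aalen` and the toy durations are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def nelson_aalen(durations, events):
    """Return [(t, H(t))] at each distinct event time.

    durations: months from application to litigation, or to data
               collection for patents not litigated by then.
    events:    1 if litigation was observed, 0 if the patent is censored.
    """
    # Number of observed litigations at each time point.
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    # Every patent (litigated or censored) leaves the risk set at its own time.
    exit_counts = Counter(durations)

    at_risk = len(durations)
    cumulative = 0.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Hazard increment: events at t divided by patents still at risk.
            cumulative += d / at_risk
            curve.append((t, cumulative))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data (not from the paper): months since application.
durations = [150, 180, 200, 200, 210, 220, 240, 240]
events = [1, 0, 1, 1, 0, 1, 0, 0]

for month, hazard in nelson_aalen(durations, events):
    print(f"month {month}: cumulative hazard {hazard:.3f}")
```

Censored patents never add to the hazard themselves, but they stay in the denominator until their own observation time, which is how the estimate uses them instead of discarding them.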

Original language: English
Article number: 1763
Pages (from-to): 1-15
Number of pages: 15
Journal: Sustainability (Switzerland)
Issue number: 4
Publication status: Published - 2021 Feb 2

Bibliographical note

Funding Information:
This research was supported by the MOTIE (Ministry of Trade, Industry, and Energy) in Korea, under the Fostering Global Talents for Innovative Growth Program (P0008749) supervised by the Korea Institute for Advancement of Technology (KIAT).

Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.


Keywords

  • Patent big data
  • Patent litigation
  • Random survival forest
  • Survival analysis
  • Text mining

ASJC Scopus subject areas

  • Geography, Planning and Development
  • Renewable Energy, Sustainability and the Environment
  • Environmental Science (miscellaneous)
  • Energy Engineering and Power Technology
  • Management, Monitoring, Policy and Law

