TY - GEN
T1 - What are the optimum quasi-identifiers to re-identify medical records?
AU - Lee, Yong Ju
AU - Lee, Kyung Ho
N1 - Publisher Copyright: © 2018 Global IT Research Institute (GiRI).
PY - 2018/3/23
Y1 - 2018/3/23
AB - Medical records are increasingly shared online for medical research and expert opinion. Sharing them poses a problem: if someone can determine the subject of a record, the patient's privacy is invaded. Solving this requires carefully addressing the tradeoff between data sharing and privacy, and de-identification techniques are used for this purpose. However, de-identified data still carries a risk of re-identification, and de-identification techniques have two problems. First, they may damage data utility even as they decrease the risk of re-identification. Second, de-identified data can be re-identified through inference using background knowledge. The objective of this paper is to analyze the probability of re-identification according to inferable quasi-identifiers. We analyzed the factors, namely inferable quasi-identifiers, that can be inferred from background knowledge, and then estimated the probability of re-identification when those factors are exploited. As a result, we determined how the type and range of inferable quasi-identifiers affect re-identification. Through this comparative analysis of the probability of re-identification according to the type and range of inference, the paper contributes to decisions on the target and level of de-identification for protecting patients' privacy.
KW - De-identification
KW - Medical records
KW - Privacy
KW - Re-identification
UR - http://www.scopus.com/inward/record.url?scp=85046813819&partnerID=8YFLogxK
U2 - 10.23919/ICACT.2018.8323926
DO - 10.23919/ICACT.2018.8323926
M3 - Conference contribution
AN - SCOPUS:85046813819
T3 - International Conference on Advanced Communication Technology, ICACT
SP - 1025
EP - 1033
BT - IEEE 20th International Conference on Advanced Communication Technology
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 20th IEEE International Conference on Advanced Communication Technology, ICACT 2018
Y2 - 11 February 2018 through 14 February 2018
ER -