Private Record Matching for Unstructured Data Using Hybrid Approach

S. Rama Krishnan*
Department of Information Technology, Kongu Engineering College, Erode, Tamil Nadu, India.
Periodicity: April - June 2017
DOI: https://doi.org/10.26634/jse.11.4.13818

Abstract

Record matching is the task of identifying records that are present in more than one database. When unique identifiers are not available, approximate techniques are needed to match the records. Record matching is carried out mainly through two approaches: sanitization and cryptographic. Sanitization techniques rely mainly on methods such as k-anonymization and random noise addition. Cryptographic techniques use Secure Multiparty Computation (SMC), which provides accurate results but at a very high cost. The hybrid technique combines the sanitization and cryptographic approaches. Its major advantage is an effective decision rule for matching the input values, which gives a high level of privacy. However, its cost is still very high, which makes it ineffective in practice. The proposed method therefore uses a two-party protocol in which record matching takes place only between the data-holding parties. No trusted third party is involved, and secure matching is performed using a binning method.
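
As a rough illustration of how binning can reduce the number of costly secure comparisons, the following Python sketch groups a numeric attribute into fixed-width bins and keeps only the record pairs whose values fall into the same bin. The abstract does not describe the actual binning scheme or protocol, so the attribute (age), the bin width, and the function names bin_of and candidate_pairs are assumptions made for illustration only. The cryptographic comparison step itself is not shown.

```python
from collections import defaultdict


def bin_of(value, width):
    """Map a numeric value to the index of its fixed-width bin (assumed scheme)."""
    return value // width


def candidate_pairs(ages_a, ages_b, width=10):
    """Return index pairs (i, j) whose ages fall into the same bin.

    Only these pairs would be passed on to a secure comparison step;
    all other pairs are discarded without any cryptographic work.
    """
    # Index party B's records by bin so each lookup is a dictionary access.
    bins_b = defaultdict(list)
    for j, age in enumerate(ages_b):
        bins_b[bin_of(age, width)].append(j)

    pairs = []
    for i, age in enumerate(ages_a):
        for j in bins_b.get(bin_of(age, width), []):
            pairs.append((i, j))
    return pairs


# Hypothetical data held by the two parties.
ages_a = [23, 37, 58]
ages_b = [25, 41, 52, 59]
pairs = candidate_pairs(ages_a, ages_b)
print(pairs)  # [(0, 0), (2, 2), (2, 3)]
```

Here 3 of the 12 possible pairs remain as candidates. Note that exact same-bin blocking can miss close values that straddle a bin boundary (37 and 41 in this data), so a practical scheme would have to account for neighbouring bins or the chosen similarity threshold.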

Keywords

Privacy Technologies, Scalability, Similarity Measure, Binning, Two Party Protocol

How to Cite this Article?

Krishnan, R. S. (2017). Private Record Matching for Unstructured Data Using Hybrid Approach. i-manager’s Journal on Software Engineering, 11(4), 24-29. https://doi.org/10.26634/jse.11.4.13818
