A Hybrid Model for Missing Data Imputation
Abstract
Data mining is the process of extracting useful knowledge from large amounts of data. It is applied in many fields, including business strategy, market analysis, and the advancement of medical treatments. To carry out such analysis, however, data scientists must work with real-world datasets that contain noisy, inconsistent, and missing data. The presence of missing data can lead to invalid and inaccurate decisions during knowledge extraction. The aim of this paper is to propose a new methodology for dealing with missing values. The methodology is named the Exponential Clustering technique because it is a hybridization of clustering and exponential prediction of data. It is applied to the Pima Indians Type II Diabetes dataset, and its performance is compared with that of existing techniques. The results indicate that the proposed methodology performs better than the existing clustering technique.
Keywords: Pima dataset, missing data, imputation, exponential, clustering