Large amounts of operational data are routinely collected and archived across many domains. This data can be processed and analyzed with a variety of data mining approaches, and several techniques have recently been developed for the early prediction of disease. Diabetes is a chronic metabolic disorder that impairs the body's ability to regulate glucose. Type 2 diabetes mellitus (T2DM) is a form of diabetes caused by an insulin disorder. In this work, microarray experiment data is first collected from the NCBI repository. A modified sparse K-means clustering method and a hybrid PCAGA-RFR method are then applied for cluster grouping and feature set reduction, in order to identify the gene most relevant to T2DM among a large number of genes. The procedure follows these steps: 1. Collection of data from repositories; 2. Implementation of the modified sparse K-means method to select the T2DM-relevant gene; 3. Implementation of the hybrid PCAGA-RFR method to select the T2DM-relevant gene; 4. Confirmation of the TCF7L2 gene; 5. Identification of positional changes for rs7903146, a common variant of the TCF7L2 gene.
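To make step 2 more concrete, the following is a minimal sketch of the general idea behind sparse K-means: cluster the samples, score each gene by its between-cluster sum of squares (BCSS), and shrink the weights of low-scoring genes to zero. It is not the modified method used in this work, whose details are not given here. The synthetic expression matrix, the gene index `informative`, the fixed `shrink` fraction (standing in for a properly tuned sparsity constraint), and all function names are assumptions made for illustration only.

```python
import math
import random


def weighted_sqdist(a, b, w):
    """Squared Euclidean distance with one non-negative weight per gene."""
    return sum(wj * (aj - bj) ** 2 for aj, bj, wj in zip(a, b, w))


def kmeans(rows, k, w, iters=25):
    """Lloyd's algorithm on the gene-weighted distance."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(
            rows, key=lambda r: min(weighted_sqdist(r, c, w) for c in centers)))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: weighted_sqdist(r, centers[c], w))
                  for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def bcss_per_gene(rows, labels, k):
    """Between-cluster sum of squares, computed separately for each gene."""
    n, p = len(rows), len(rows[0])
    overall = [sum(col) / n for col in zip(*rows)]
    bcss = [0.0] * p
    for c in range(k):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        if not members:
            continue
        means = [sum(col) / len(members) for col in zip(*members)]
        for j in range(p):
            bcss[j] += len(members) * (means[j] - overall[j]) ** 2
    return bcss


def sparse_kmeans(rows, k, shrink=0.1, rounds=5):
    """Alternate clustering and gene re-weighting.

    Simplification (assumption): the soft threshold is a fixed fraction of
    the largest BCSS rather than being tuned to meet an L1 constraint.
    """
    p = len(rows[0])
    w = [1.0 / math.sqrt(p)] * p  # start with equal weights
    labels = [0] * len(rows)
    for _ in range(rounds):
        labels = kmeans(rows, k, w)
        bcss = bcss_per_gene(rows, labels, k)
        delta = shrink * max(bcss)
        soft = [max(b - delta, 0.0) for b in bcss]
        norm = math.sqrt(sum(s * s for s in soft)) or 1.0
        w = [s / norm for s in soft]  # rescale to unit L2 norm
    return labels, w


# Synthetic data (hypothetical): 40 samples x 10 genes, where only the gene
# at index `informative` differs between the two sample groups.
rng = random.Random(0)
n_per_group, n_genes, informative = 20, 10, 3
rows = []
for group in (0, 1):
    for _ in range(n_per_group):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[informative] += 10.0 * group
        rows.append(row)

labels, weights = sparse_kmeans(rows, k=2)
selected = [j for j, wj in enumerate(weights) if wj > 0]
print("genes with non-zero weight:", selected)
```

Genes whose weight stays above zero are the candidates carried forward. On real microarray data the threshold would need to be tuned, for example by a permutation-based procedure, rather than fixed as it is here.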