Abstract

Development of Prediction Tool for Drought Tolerant Protein in Rice Using Machine Learning Algorithm

Annapoorna Shetty, Hemalatha N, Mohammed Moideen Shihab, Brendon Victor Fernandes

Machine learning concerns the construction and study of systems that learn from data rather than following only explicitly programmed instructions. Classification is the problem of identifying which of a set of categories a new observation belongs to, on the basis of a training set of observations (or instances) whose category membership is known; here, the training set consists of positive and negative protein datasets. This paper focuses on the development of a prediction tool for the drought-tolerant NAC protein in rice using the Support Vector Machine (SVM) algorithm. We used seven feature extraction methods, including amino acid features, dipeptide features, hybrid methods, and exchange group features. Using dipeptide features, the NACPredictor tool achieved a precision of 86%. The tool is further compared with the sequence similarity search tool PSI-BLAST.
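To illustrate the amino acid and dipeptide features mentioned above, the following is a minimal sketch of how such feature vectors are commonly computed from a protein sequence. It is not the NACPredictor implementation: the function names, the handling of non-standard residues, and the example sequence are assumptions made for illustration only. The resulting 20- and 400-dimensional vectors are the kind of input that could be passed to an SVM classifier.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so every vector is aligned.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of standard amino acids ("AA", "AC", ..., "YY").
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return a 20-dimensional vector of residue fractions.

    Assumption: non-standard symbols (e.g. 'X') are ignored.
    """
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-dimensional vector of adjacent-pair fractions.

    Assumption: pairs containing a non-standard symbol are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    example = "MGLQELDPLAQLSLPPGFRF"  # hypothetical sequence fragment
    aac = amino_acid_composition(example)
    dpc = dipeptide_composition(example)
    print(len(aac), len(dpc))
```

Normalizing by the number of residues (or residue pairs) keeps the features comparable across proteins of different lengths, which matters because SVM decision functions are sensitive to feature scale.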