Secretory proteins of P. falciparum owing to the complex nature of the parasite. With the completion of the Plasmodium genome sequence, it is both challenging and urgent to develop an automated, high-throughput method for identifying secretory proteins of P. falciparum. Some efforts have already been made in this regard. In a pioneering study, Verma et al. [2] proposed a method for identifying proteins secreted by the malaria parasite. In their method, the prediction engine was the Support Vector Machine (SVM), while the protein samples were formulated with the amino acid composition, dipeptide composition, and position-specific scoring matrix (PSSM) [3]. Subsequently, Zuo and Li [4] introduced the K-minimum increment of diversity (K-MID) approach to predict secretory proteins of the malaria parasite based on the grouping of amino acids. Meanwhile, various other studies on this topic were also carried out [5,6,7,8,9]. In the past, various predictors for protein systems were developed by incorporating evolutionary information via PSSM [10,11,12,13,14,15,16,17,18,19,20]. In those papers, however, only the statistical information of PSSM [3] was utilized, whereas the interactions among the constituent amino acid residues in a protein sample, i.e., its sequence-order effects, were ignored.
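As a rough illustration of the two composition-based descriptors mentioned above, the following minimal sketch computes the amino acid composition (20 fractions) and the dipeptide composition (400 fractions) of a sequence. It is not the implementation of Verma et al. [2]; the function names, the alphabetical residue ordering, and the choice to skip non-standard residues are assumptions made only for this example.

```python
from collections import Counter
from itertools import product

# Assumed ordering of the 20 standard amino acids (alphabetical one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard amino acid (hypothetical helper)."""
    seq = seq.upper()
    # Non-standard residues (e.g. X, B, Z) are ignored: an assumption of this sketch.
    counts = Counter(r for r in seq if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent residues."""
    seq = seq.upper()
    # Count only adjacent pairs in which both residues are standard.
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]
```

Both vectors sum to one. The dipeptide composition captures only nearest-neighbour order, and the amino acid composition captures none, which is the limitation the text goes on to discuss.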
To avoid completely losing the sequence-order information associated with PSSM, the concept of pseudo amino acid composition (PseAAC) [21,22] was utilized to incorporate the evolutionary information into the formulation of a protein sample, as done in predicting protein subcellular localization [23,24,25], predicting protein fold pattern [26], identifying membrane proteins and their types [27], predicting enzyme functional classes and subclasses [28], identifying protein quaternary structural attributes [29], predicting antibacterial peptides [30], predicting allergenic proteins [31], and identifying proteases and their types [32]. The present study was initiated in an attempt to develop a new and more powerful predictor for identifying the secretory proteins of the malaria parasite by incorporating the sequence evolution information into PseAAC via a grey system model [33]. According to a recent review [34], to establish a really useful statistical predictor for a protein system, we need to consider the following procedures: (i) construct or select a valid benchmark dataset to train and test the predictor; (ii) formulate the protein samples with an effective mathematical expression that can truly reflect their intrinsic correlation with the target to be predicted; (iii) introduce or develop a powerful algorithm (or engine) to operate the prediction; (iv) properly perform cross-validation tests to objectively evaluate the anticipated accuracy of the predictor; (v) establish a user-friendly web-server for the predictor that is accessible to the public. Below, let us describe how to deal with these steps.
... correlation with the target to be predicted [34]. To realize this, the pseudo amino acid composition (PseAAC) was proposed [21] to replace the simple amino acid composition (AAC) for representing the sample of a protein.
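To make the difference between AAC and PseAAC concrete, the sketch below follows the general form of the classic PseAAC [21]: the 20 composition terms are supplemented by a few sequence-order correlation factors, and the whole vector is normalized. It is a simplified illustration under stated assumptions, not the grey-model PSSM formulation developed in this study. It assumes a single residue property (the Kyte-Doolittle hydropathy scale, standardized over the 20 amino acids) in place of the several properties used in the original formulation, and the function name `pseaac` and the default values of `lam` and `w` are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as an assumed stand-in property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


def pseaac(seq, lam=3, w=0.05, scale=KYTE_DOOLITTLE):
    """Return a (20 + lam)-dimensional PseAAC-style vector (illustrative only)."""
    seq = seq.upper()
    if any(r not in AMINO_ACIDS for r in seq):
        raise ValueError("sequence contains non-standard residues")
    n = len(seq)
    if lam < 0 or lam >= n:
        raise ValueError("lam must satisfy 0 <= lam < len(seq)")
    h = _standardize(scale)
    # Conventional amino acid composition (order-independent part).
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]
    # k-th tier correlation factor: mean squared property difference
    # between residues that are k positions apart along the chain.
    thetas = [
        sum((h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

Two sequences with identical composition but different residue order, such as "AKAK" and "AAKK", share the same AAC yet yield different vectors here, because the correlation factors depend on which residues are neighbours.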
Ever since the concept of PseAAC was introduced in 2001 [21], it has penetrated into almost all fields of protein attribute prediction, such as predicting protein submitochondrial localization [35], predicting protein structural class [36], predicting DNA-binding proteins [37], identifying bacter.