
Feature selection based on a normalized difference measure for text classification

 

Abstract

The goal of feature selection in text classification is to choose highly distinguishing features to improve the performance of a classifier. The well-known text classification feature selection metric named balanced accuracy measure (ACC2) (Forman, 2003) evaluates a term by taking the difference of its document frequency in the positive class (also known as true positives) and its document frequency in the negative class (also known as false positives). This, however, assigns equal ranks to terms with equal differences, ignoring their relative document frequencies in the classes. In this paper we propose a new feature ranking (FR) metric, called normalized difference measure (NDM), which takes into account the relative document frequencies. The performance of NDM is investigated against seven well-known feature ranking metrics, including odds ratio (OR), chi squared (CHI), information gain (IG), distinguishing feature selector (DFS), Gini index (GINI), balanced accuracy measure (ACC2), and Poisson ratio (POIS), on seven datasets, namely WebACE (WAP, K1a, K1b), Reuters (RE0, RE1), a spam email dataset, and 20 Newsgroups, using the multinomial naive Bayes (MNB) and support vector machine (SVM) classifiers. Our results show that the NDM metric outperforms the seven metrics in 66% of cases in terms of the macro-F1 measure and in 51% of cases in terms of the micro-F1 measure in our experimental trials on these datasets.
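The ranking problem described in the abstract can be illustrated with a small sketch. The abstract does not state the NDM formula, so the code below rests on two assumptions: that ACC2 is computed on class-normalized rates as |tpr - fpr|, and that the normalization divides this difference by the smaller of the two rates. The function names, the `floor` parameter, and the example counts are all hypothetical and chosen only for illustration.

```python
def rates(tp, fp, n_pos, n_neg):
    """Return (tpr, fpr): the fraction of positive and of negative
    documents that contain the term."""
    return tp / n_pos, fp / n_neg


def acc2(tp, fp, n_pos, n_neg):
    """Balanced accuracy measure, assumed here to be |tpr - fpr|."""
    tpr, fpr = rates(tp, fp, n_pos, n_neg)
    return abs(tpr - fpr)


def ndm_sketch(tp, fp, n_pos, n_neg, floor=1e-6):
    """Illustrative normalized difference (an assumption, not taken from
    the abstract): |tpr - fpr| divided by min(tpr, fpr).

    `floor` is a hypothetical guard against division by zero when the
    term is absent from one class.
    """
    tpr, fpr = rates(tp, fp, n_pos, n_neg)
    denom = max(min(tpr, fpr), floor)
    return abs(tpr - fpr) / denom


# Two made-up terms over 100 positive and 100 negative documents.
# Both have the same difference in rates (about 0.2), so ACC2 ties them.
term_a = (90, 70)  # frequent in both classes
term_b = (25, 5)   # rare in the negative class

for name, (tp, fp) in (("term_a", term_a), ("term_b", term_b)):
    print(name, acc2(tp, fp, 100, 100), ndm_sketch(tp, fp, 100, 100))
```

Under these assumptions, the two terms receive the same ACC2 score, while the normalized score ranks `term_b` higher because it occurs five times more often in the positive class than in the negative class.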

 



What we provide:

  • Complete Research Assistance

Technology Involved:-

  • MATLAB, Simulink, MATPOWER, GRIDLAB-D, OpenDSS, ETAP, GAMS

Deliverables:- 

  • Complete code for this paper
  • Complete code for the approach to be proposed
  • A document containing complete explanation of code and research approach
  • All materials used for this research
  • Solution to all your queries related to your work