Fast and Robust Wrapper Method for N-gram Feature Template Induction in Structured Prediction


ABSTRACT

N-gram feature templates that consider consecutive contextual information comprise a family of important feature templates used in structured prediction. Some previous studies considered the n-gram feature selection problem, but they focused on one or several types of features in certain tasks, e.g., consecutive words in a text categorization task. In this paper, we propose a fast and robust bottom-up wrapper method for automatically inducing n-gram feature templates, which can induce any type of n-gram feature for any structured prediction task. According to the significance distribution for n-gram feature templates based on the n-gram and bias (offset), the proposed method first determines the n-gram that achieves the best tradeoff between the severity of the sparse data problem with n-gram feature templates and the richness of the corresponding contextual information, before combining the best n-gram with lower-order gram templates in an extremely efficient manner. In addition, our method uses a template pair, i.e., the two symmetrical templates, rather than a template as the basic unit (i.e., including or excluding a template pair rather than a template). Thus, when the data in the training set change slightly, our method is robust to this fluctuation, thereby providing a more consistent induction result compared with the template-based method. The experimental results obtained for three tasks, i.e., Chinese word segmentation, named entity recognition, and text chunking, demonstrated the effectiveness, efficiency, and robustness of the proposed method.
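To make the terms "n-gram", "bias (offset)", and "template pair" concrete, the following is a minimal illustrative sketch and not the paper's implementation. It assumes a template can be written as a pair (n, offset) covering n consecutive tokens starting at the current position plus the offset, and that the symmetrical template is its mirror image around the current token. The names `Template`, `mirror`, `template_pair`, `extract`, and the `<PAD>` symbol are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

# Assumed representation: (n, offset) covers tokens i+offset .. i+offset+n-1,
# where i is the current position. This encoding is an assumption of the sketch.
Template = Tuple[int, int]

PAD = "<PAD>"  # hypothetical padding symbol for positions outside the sentence


def mirror(template: Template) -> Template:
    """Return the template reflected around the current token (assumed symmetry)."""
    n, offset = template
    return (n, -(offset + n - 1))


def template_pair(template: Template) -> Tuple[Template, ...]:
    """Group a template with its mirror; a self-symmetric template stands alone."""
    return tuple(sorted({template, mirror(template)}))


def extract(tokens: List[str], i: int, template: Template) -> str:
    """Instantiate one template at position i as a feature string."""
    n, offset = template
    start = i + offset
    gram = [tokens[j] if 0 <= j < len(tokens) else PAD
            for j in range(start, start + n)]
    return f"{n}g[{offset}]=" + "|".join(gram)


if __name__ == "__main__":
    sentence = ["a", "b", "c"]
    # Including or excluding a pair means toggling both templates together.
    for t in template_pair((2, 0)):
        print(extract(sentence, 1, t))
```

Under these assumptions, the bigram starting at the current token, (2, 0), is paired with the bigram ending at the current token, (2, -1), so a selection procedure that works on pairs adds or drops both at once.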

 



What we provide:

  • Complete Research Assistance

Technology Involved:-

  • MATLAB, Simulink, MATPOWER, GRIDLAB-D, OpenDSS, ETAP, GAMS

Deliverables:-  

  • Complete Code of this paper
  • Complete Code of the approach to be proposed
  • A document containing complete explanation of code and research approach
  • All materials used for this research
  • Solution to all your queries related to your work