Wednesday, March 31, 2010

Natural Language Generation and Information Extraction: Sisters by chance

Many natural language processing applications use natural language generation (NLG) and information extraction (IE) together.
  1. Natural language generation is used to generate natural language sentences from a given bag of words.
  2. Information extraction is used to extract information from a given text in order to fill up a template. Many people confuse data extraction with information extraction. A typical information extraction problem is the extraction of information from news. For example, one may want to fill up a scorecard automatically from cricket commentary text. In this problem, the scorecard is the information extraction template. A scorecard has a definite structure, with fields like batsman name, runs scored, number of sixes and number of fours.
  • Information extraction as a tagging problem:- Information extraction can be seen as a tagging problem where the tags are the fields of the information extraction template. The fields are mostly filled by named entities. This is why most information extraction systems (for example Stanford NER and GATE ANNIE) are extensions of named entity recognition systems.
  • Information extraction as relation extraction:- Information extraction is not simply named entity recognition; it also extracts the relations which exist between the entities. A relation can be implicit or explicit. Anaphora resolution comes under this category, as it captures the relation between a referring expression and the entity it refers to. The relations can be ontology specific or domain specific.
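To make the scorecard example concrete, here is a minimal sketch of template filling treated as a tagging problem. It assumes an idealized, made-up commentary format (batsman name first, outcome after the last comma); the player names, the `LINE` pattern and the `fill_scorecard` helper are all hypothetical, and real commentary would need a proper named entity recognizer rather than a regular expression.

```python
import re
from collections import defaultdict

# Hypothetical commentary lines; real commentary is far less regular.
COMMENTARY = [
    "Arun drives through the covers, FOUR",
    "Arun pulls it over midwicket, SIX",
    "Vikram nudges it to square leg, 1 run",
    "Vikram cuts hard past point, FOUR",
    "Arun defends, no run",
]

# Tags two template fields per line: the batsman (first capitalized word)
# and the outcome (the text after the final comma).
LINE = re.compile(
    r"^(?P<batsman>[A-Z][a-z]+)\b.*?,\s*"
    r"(?P<outcome>FOUR|SIX|no run|(?P<n>\d+) runs?)$"
)

def fill_scorecard(lines):
    """Fill the scorecard template: runs, fours and sixes per batsman."""
    card = defaultdict(lambda: {"runs": 0, "fours": 0, "sixes": 0})
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # line carries no scoring information we can tag
        row = card[m.group("batsman")]
        outcome = m.group("outcome")
        if outcome == "FOUR":
            row["runs"] += 4
            row["fours"] += 1
        elif outcome == "SIX":
            row["runs"] += 6
            row["sixes"] += 1
        elif m.group("n"):
            row["runs"] += int(m.group("n"))
    return dict(card)

if __name__ == "__main__":
    for batsman, row in fill_scorecard(COMMENTARY).items():
        print(batsman, row)
```

The template here is just a dictionary with fixed fields, which mirrors the definite structure of a scorecard. Relation extraction would go one step further, for instance by linking "he" in a later line back to the batsman it refers to.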
NLG and IE are used together in many natural language processing applications. Some of these applications are:
  1. Abstractive summarization:- Summarization is basically of two types: extractive summarization and abstractive summarization. Extractive summarization extracts the important sentences or paragraphs from the text, whereas abstractive summarization generates the summary. It uses information extraction to fill up a template and natural language generation to generate the summary from that template.
  2. Question answering system:- Many question answering systems use IE and NLG together. A question answering system can have the following modules:
  • Question understanding module.
  • Information retrieval system to collect text.
  • Information extraction system to extract the possible candidate answers.
  • Generation of the answer from the template.
For example, general question types are who, whom, when and others. Based on the question type, you can extract the relevant information and generate the answer from it.
  3. Spoken dialog manager:- You may have come across spoken dialog managers in many customer care services. They understand your question and answer interactively. A spoken dialog manager can be divided into three parts: 1) automatic speech recognition, 2) an interactive question answering system, 3) speech synthesis.
Since it has an interactive question answering system in its pipeline, it uses information extraction and natural language generation together.
  4. Machine translation:- Machine translation is one of the applications where a natural language generation system is used alone. Machine translation has two basic parts: structural translation and lexical translation. The structural translation module rearranges the source-language word order into the target-language word order. Most machine translation systems apply the lexical translation module after the structural translation module, and use bilingual parameters and rules for structural translation. However, if structural translation is done after lexical translation, natural language generation can be used for it, since the lexically translated words form a bag of words from which a target-language sentence has to be generated.
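The IE-plus-NLG pairing behind the summarization, question answering and dialog applications above can be sketched as a tiny pipeline: an extraction step fills templates from text, and a generation step turns a filled template into an answer sentence. This is only an illustrative sketch. The sample text, the `FACT` pattern, the two supported question types and the fixed answer wording are all assumptions, and a real system would use a trained extractor and a proper surface realizer.

```python
import re

# Hypothetical source text standing in for the retrieved documents.
TEXT = "Arun scored 72 runs in the final. Vikram scored 41 runs before lunch."

FACT = re.compile(r"(?P<player>[A-Z][a-z]+) scored (?P<runs>\d+) runs")

def extract_templates(text):
    """IE step: fill one {player, runs} template per matching sentence."""
    return [
        {"player": m.group("player"), "runs": int(m.group("runs"))}
        for m in FACT.finditer(text)
    ]

def classify_question(question):
    """Question understanding: map the question to a coarse type."""
    q = question.lower()
    if q.startswith("who"):
        return "who"
    if q.startswith("how many"):
        return "how_many"
    return "unknown"

def generate_answer(question, templates):
    """NLG step: realize an answer sentence from a filled template."""
    qtype = classify_question(question)
    words = set(re.findall(r"\w+", question.lower()))
    if qtype == "who" and templates:
        best = max(templates, key=lambda t: t["runs"])
        return f"{best['player']} scored the most runs ({best['runs']})."
    if qtype == "how_many":
        for t in templates:
            if t["player"].lower() in words:
                return f"{t['player']} scored {t['runs']} runs."
    return "Sorry, I could not find an answer in the text."

if __name__ == "__main__":
    templates = extract_templates(TEXT)
    print(generate_answer("Who scored the most runs?", templates))
    print(generate_answer("How many runs did Vikram score?", templates))
```

The same filled templates could feed an abstractive summarizer instead: only the generation step changes, producing a summary sentence rather than an answer.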
