Natural Language Processing
Introduction
The use of computers to
study language dates back to the 1940s, when computers first became available.
The appeal of a computer program that understands human speech
still propels research today. Many companies are vying to be the first
to develop technology that would let people communicate with
computers without memorizing complex commands and procedures,
and that would let scientists, business people and ordinary users interact easily with
people around the world. The promise of such systems drives
commercial research into Natural Language Processing (NLP). Yet despite many years of research and
millions of dollars spent, a program that captures the complexity of human cognition,
human language and its uses in a form capable of computational understanding
still eludes us.
Inhibitory Factors
The difficulties of NLP lie in the ambiguity of
natural languages (English, Spanish, French...). Ambiguity multiplies
the number of possible interpretations of an utterance, and the computer
has yet to duplicate the human ability to choose among them. When we as humans
process language, we continually interpret meaning using our robust knowledge of the
world and of the communicator's current culture in order to
decipher the message being communicated. Ambiguity may make up as much as 90% of
our communication, and the cognitive ability to interpret it, although
present in humans, is still beyond a computer’s comprehension
(Winograd and Flores 1995). Consider,
for example, how the ambiguities of individual words multiply.
Suppose each word in a 15-word sentence could have 2 interpretations.
The number of interpretations of the whole sentence is going to be:
2*2*2*2*2*2*2*2*2*2*2*2*2*2*2 = 32,768
Yet this number does not account for syntactic, semantic or pragmatic ambiguity, which would significantly increase the actual number of possible interpretations (Inman 1997).
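The multiplication above can be sketched in a few lines of Python. This is only an illustration, assuming that the interpretations of each word combine independently. The function name lexical_interpretations and the mixed sense counts in the second call are hypothetical and do not come from the cited sources.

```python
from math import prod


def lexical_interpretations(senses_per_word):
    """Multiply the number of senses of each word to count whole-sentence readings.

    Assumes word senses combine independently; syntactic, semantic and
    pragmatic ambiguity are deliberately ignored, as in the text.
    """
    return prod(senses_per_word)


# The example from the text: a 15-word sentence, 2 interpretations per word.
uniform = lexical_interpretations([2] * 15)
print(uniform)  # 32768

# Hypothetical sense counts for a 5-word sentence (illustrative values only).
mixed = lexical_interpretations([1, 3, 2, 1, 4])
print(mixed)  # 24
```

Even in this simplified count the total grows exponentially with sentence length, which is why lexical ambiguity alone is already costly to enumerate.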
NLP is the field of study that addresses these issues. It leads scientists to focus on the computational modeling, design and development of a wide variety of systems that enable communication between humans and computers. Since most human communication is in written or spoken form, development efforts in NLP must first give computers the ability to recognize and understand these utterances. Systems that can understand these forms of language make possible natural language interfaces to databases and computers, tools for linguistic research, machine translation, optical character recognition, speech-to-text and text-to-speech conversion, and more (Technology Development for Indian Languages).
Potential IT Products and Services
Multi-lingual Dictionaries, Thesauruses, Educational Software,
Encyclopedia, Creative Writing System, Translation Support Systems,
Optical Character Recognition (OCR), Text-to-Speech & Speech Recognition System,
Pocket Translator, Personal Digital Assistants,
Reading machine for the visually and hearing impaired, e-governance /
e-commerce / e-skills.
Benefits
The benefits associated with NLP are numerous. A few examples are
dramatically reduced cost, time and labor for data recognition in a business;
automation of a wide range of manual processes, which increases worker
productivity and improves customer service; instantaneous language translation
for those traveling in foreign countries; the ability of foreign heads of state
to communicate without the aid of human translators; a platform for the hearing
impaired to speak; and the ability of the visually impaired to command a
computer to complete office tasks such as filing and sorting.
Helpful Links
Natural Language. Contains definitions, a very good overview and a description of NLP.
NLP Tutorials. A useful website on the problems faced in NLP, including a simple Prolog parser to analyse the structure of language.
Natural Language Processing: She Needs Something Old & Something New (maybe something borrowed and something blue, too.) A look at where we started, where we are and where we are headed with NLP.
Ambiguous Words. A good article on ambiguous words: how to calculate them, what the problem is and how the computer deals with them.
Introduction to NLP. A good background on NLP.
Natural Language Processing. Lecture Notes from John Batali's course in Artificial Intelligence Modeling.
Glossary of Linguistic Terms. Contains several definitions pertaining to language.
Alice, the Chat Robot. Cool demo.
AI on the Web: Natural Language Processing. A page of links to reference material, people, research groups, journals, books, organizations, software, companies and much more.
Natural Language Processing. Microsoft's research on NLP.