RNNs (Recurrent Neural Networks) implement a statistical modeling technique for sequential data that is now improving both the performance and the perception of NLP applications. Certain flaws in RNNs, such as vanishing gradients, were mitigated, and this brought forth LSTMs (Long Short-Term Memory networks). As a variation of RNNs, LSTMs are pushing the boundaries of statistical modeling in the NLP domain.
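To get a feel for the vanishing gradient problem, here is a minimal sketch. It assumes a toy scalar RNN with no input, h_t = tanh(w * h_{t-1}), where the weight `w` and the starting state `h0` are hypothetical values chosen only for illustration. It multiplies the per-step derivatives together, which is what backpropagation through time does.

```python
import math

def gradient_through_time(w, steps, h0=0.5):
    """Return d h_T / d h_0 for the toy scalar RNN h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        grad *= w * (1.0 - h * h)
    return grad

short_range = gradient_through_time(w=0.9, steps=5)
long_range = gradient_through_time(w=0.9, steps=50)
```

With `w = 0.9` every factor in the product is below 0.9, so `long_range` ends up far smaller than `short_range`. The signal from early time steps fades, and this is the behavior that the gating in LSTMs is designed to counter.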
Another significant breakthrough in deep learning for NLP is word embeddings. This is a technique in which words are represented as real-valued vectors in a multi-dimensional space. Closeness between two word vectors in that space corresponds to closeness in the meaning or usage context of the words in natural language. Word vectors, or word embeddings as they are also called, are fed directly as input features to neural networks.
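As a rough illustration of "closeness" in a vector space, the sketch below compares words by cosine similarity. The 4-dimensional vectors are made up for this example. Real embeddings are learned from data and usually have a few hundred dimensions.

```python
import math

# Hypothetical 4-dimensional embeddings, invented for illustration only.
EMBEDDINGS = {
    "king":  [0.80, 0.65, 0.10, 0.05],
    "queen": [0.75, 0.70, 0.15, 0.10],
    "apple": [0.05, 0.10, 0.90, 0.70],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, embeddings=EMBEDDINGS):
    """Return the other word whose vector is closest to `word`."""
    query = embeddings[word]
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(query, embeddings[w]))
```

In this toy space, `most_similar("king")` returns `"queen"` because their vectors point in nearly the same direction, while `"apple"` sits far away.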
The remainder of this post discusses deep learning applications in NLP that have made significant strides, some of their core challenges, and where they stand today.
A chatbot is a computer program that simulates a human-like conversation with the user of the program. Three famous examples of these programs are Apple’s Siri, Google Assistant, and Amazon Alexa. While consumer applications of chatbots are more talked about, enterprises see specific value in finely honed chatbots. Industries with heavy customer service demand, such as Telecom and Retail, have a particular interest in successfully automating this critical process, which improves profitability and reduces costs.
Put simply, a chatbot program processes natural language conversational dialogue from the user and returns an appropriate response, all the while keeping the grammatical and lexical rules of the natural language intact.
Chatbots have plenty of applications in fields ranging from Oil & Gas to Medical Research. However, chatbots must be trained on conversational data and, more importantly, on domain expertise. The contextualized form of this domain expertise is referred to as an ontology, in which words and phrases specific to the industry are put together.
Next, trained word vectors are built. Words that share common contexts get vectors that sit close to each other in the vector space. Several pre-trained word-vector resources exist, such as Word2Vec and GloVe.
However, depending on the application and domain that the chatbot caters to, developers can always train their own word vectors. Finally, the dataset and the generated or pre-trained word vectors are used to train a Sequence-to-Sequence model, which becomes your chatbot.
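Before a Sequence-to-Sequence model can be trained, the conversational data has to be turned into integer sequences. The sketch below shows one common way to do that. The question/answer pairs, the special tokens, and the function names are all assumptions made for illustration, and the model itself is left out.

```python
PAD, START, END, UNK = "<pad>", "<start>", "<end>", "<unk>"

def build_vocab(pairs):
    """Assign an integer id to every token seen in the (question, answer) pairs."""
    vocab = {PAD: 0, START: 1, END: 2, UNK: 3}
    for question, answer in pairs:
        for token in (question + " " + answer).lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(sentence, vocab, add_markers=False):
    """Map a sentence to token ids; unknown words fall back to the <unk> id."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in sentence.lower().split()]
    if add_markers:
        # Decoder sequences are wrapped so the model can learn where to start and stop.
        ids = [vocab[START]] + ids + [vocab[END]]
    return ids

# Hypothetical customer-service dialogue pairs.
pairs = [
    ("where is my order", "your order ships tomorrow"),
    ("cancel my plan", "your plan is cancelled"),
]
vocab = build_vocab(pairs)
encoder_input = encode("where is my order", vocab)                           # [4, 5, 6, 7]
decoder_target = encode("your order ships tomorrow", vocab, add_markers=True)  # [1, 8, 7, 9, 10, 2]
```

Each id would then be looked up in the embedding table (pre-trained or domain-specific) before being passed to the encoder and decoder.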
Sentiment Analysis is one of the oldest and as yet unresolved problems in the NLP universe. It refers to the process of understanding the emotional context of any content – text, video, and audio.
Human emotion is complex. We use several forms of emotions in conjunction with tonality to convey a specific feeling. Sarcasm, slang, the inherent ambiguity of positive and negative words, negation, and multipolarity make sentiment analysis challenging.
A few of these challenges can be resolved using Word Embedding solutions. These are a set of natural language modeling and feature learning techniques under the overall NLP umbrella. Some of these solutions, built on deep learning architectures, are reasonably accurate, especially on social media slang, neutral sentiment, and compound sentiments.
Word embeddings do, however, suffer from the meaning conflation deficiency, in which all possible meanings of a word are collapsed into a single vector. This undermines the very purpose embeddings are meant to serve.
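To see why something as simple as negation trips up a naive approach, consider the sketch below. It is a deliberately simplistic lexicon-based scorer, not a deep learning method, and the word lists are tiny hypothetical examples.

```python
# Tiny hypothetical sentiment lexicons, for illustration only.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "terrible", "hate"}
NEGATORS = {"not", "never", "no"}

def naive_score(text):
    """Count positive words minus negative words, ignoring context entirely."""
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in text.lower().split())

def negation_aware_score(text):
    """Same count, but a negator flips the polarity of the next sentiment word."""
    score, flip = 0, False
    for token in text.lower().split():
        if token in NEGATORS:
            flip = True
            continue
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity:
            score += -polarity if flip else polarity
            flip = False
    return score
```

For "the food was not good", `naive_score` returns 1 while `negation_aware_score` returns -1. A sarcastic line such as "great another delay" still scores 1 with both functions, which hints at how much context a real system has to model.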
A progression towards Word Senses is now being discussed as a possible solution to this challenge. Each different meaning of the same word is a ‘sense’, and a lexical listing of these senses can further refine NLP models that capture semantics.
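A minimal sketch of the idea follows. It assumes a hand-written sense inventory for the word "bank" and picks a sense by counting context words that overlap with each sense's cue words, a simplified Lesk-style heuristic. The sense labels and cue words are hypothetical.

```python
# Hypothetical sense inventory: each sense has a set of cue words.
SENSES = {
    "bank": {
        "bank_finance": {"money", "loan", "deposit", "account"},
        "bank_river": {"river", "water", "shore", "fishing"},
    }
}

def disambiguate(word, sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence context."""
    context = set(sentence.lower().split()) - {word}
    candidates = senses[word]
    # On a tie, max() keeps the first sense listed in the inventory.
    return max(candidates, key=lambda sense: len(candidates[sense] & context))
```

With one vector per sense label instead of one per word, "bank" in "she opened an account at the bank" and "bank" in "we went fishing by the river bank" would no longer share a single conflated representation.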
Question and Answering
A Question and Answering system delivers precise and short question-specific answers. In practice, a Q&A system is an information retrieval system that automatically answers questions asked by users in natural language, with grammatical and lexical rules intact. It draws on a pre-analyzed database or a set of natural language documents that have been digested and stored in a structured way.
Q&A is a hard problem to solve, and its main stages need to be understood:
- Knowledgebase – Structure the data source using NLP machine learning techniques in such a way that it allows retrieval of answers.
- Question processing – Apply NLP techniques to discern the topic and entities of the question and generate a query that can be fired on the pre-stored structured database.
- Information retrieval – Retrieve and rank answer snippets based on the queries.
- Answer generation – Apply NLP techniques to extract answers from the retrieved snippets.
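The four stages listed above can be sketched end to end in a few lines. This is only a toy under stated assumptions: the documents are made up, "structuring" the knowledgebase is reduced to storing a set of content words per document, ranking is plain word overlap, and answer generation simply returns the best snippet.

```python
import re

STOPWORDS = {"the", "a", "an", "is", "of", "in", "on", "what", "who", "when", "where", "was", "did"}

# Hypothetical document collection.
DOCUMENTS = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
    "The Nile flows through Egypt.",
]

def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def build_knowledge_base(documents):
    """Knowledgebase: store each document alongside its set of content words."""
    return [(doc, set(tokenize(doc))) for doc in documents]

def process_question(question):
    """Question processing: reduce the question to a set of query terms."""
    return set(tokenize(question))

def retrieve(query, knowledge_base):
    """Information retrieval: rank snippets by the number of terms shared with the query."""
    ranked = sorted(knowledge_base, key=lambda entry: len(entry[1] & query), reverse=True)
    return [doc for doc, terms in ranked if terms & query]

def answer(question, knowledge_base):
    """Answer generation: here, simply return the top-ranked snippet (or None)."""
    snippets = retrieve(process_question(question), knowledge_base)
    return snippets[0] if snippets else None

KNOWLEDGE_BASE = build_knowledge_base(DOCUMENTS)
```

Calling `answer("Where is the Eiffel Tower located?", KNOWLEDGE_BASE)` returns the Paris sentence. A production system would replace each function with something far stronger, such as entity recognition for question processing and span extraction for answer generation.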
Document summarization, or more aptly text summarization, refers to the task of extracting the most critical information from a large body of text to generate a short, meaningful synopsis.
As you can imagine, a text summarizer is useful in many ways, both for direct consumption by humans and for computer programs and algorithms that need to process a large amount of data in a short time.
This problem can be approached in several ways, each one attuned to the manner of the output. Most deep learning methods work on sentences and the vectors created from those sentences. These vectors are then used to train models that produce a smaller set of sentences from a larger one.
As is evident, the main challenge is compressing the text effectively without losing secondary yet important knowledge, and it is not possible to develop a one-size-fits-all solution. This essentially means that, depending on the application domain, we must infuse some amount of domain knowledge into the model.
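As a simplified stand-in for the sentence-vector idea, the sketch below builds a bag-of-words count vector for each sentence and keeps the sentences most similar to the vector of the whole document. Nothing is trained here. It is an extractive heuristic chosen only to show how sentence vectors can drive the selection, and the function names are assumptions.

```python
import math
import re
from collections import Counter

def to_vector(text):
    """Bag-of-words count vector for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def summarize(text, max_sentences=1):
    """Keep the sentences whose vectors are closest to the document vector."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    doc_vector = to_vector(text)
    ranked = sorted(sentences, key=lambda s: cosine(to_vector(s), doc_vector), reverse=True)
    top = ranked[:max_sentences]
    # Return the selected sentences in their original order.
    return [s for s in sentences if s in top]

text = "Cats sleep a lot. Cats and dogs are popular pets and cats are independent. Birds sing."
summary = summarize(text, max_sentences=1)
```

Here `summary` contains only the middle sentence, since it shares the most vocabulary with the document as a whole. A trained model would replace the count vectors with learned sentence representations.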
Speech recognition is fundamentally about understanding human speech. More technically, we try to decompose the acoustic signal produced by a speaker and convert it accurately into natural language utterances. These are meaningful and grammatically correct sequences of words that form the sentences spoken by the user.
Speech recognition has several applications in the real world such as virtual assistants, home automation, video games, etc. Convolutional Neural Networks often power speech recognition applications and are of significant importance. Sequence-to-sequence models are useful for generating a meaningful sequence of words.
The difficulties involving any speech recognition module are most of the time acoustic in nature. By that, we mean factors such as environmental noise, accents, speaking rate, sociolinguistics, etc.
Most people think of speech as a series of words, each with its own distinct sound. In reality, speech is not bounded by words or any other kind of unit. There is a sequence of stable states in which words and their correlation with phonetics are well defined.
However, other states change dynamically (for example, the same word captured on different devices such as mobile phones) and end up with different phonetics attached to them. This makes them non-deterministic, and this is where the problem lies.
Today’s speech recognition models use an acoustic model that relates the audio signal to speech sounds, a phonetic dictionary that records the phones for each word, and a language model that predicts with reasonable accuracy the words that come before and after a specific word.
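The sketch below shows how the phonetic dictionary and the language model can work together. It assumes the acoustic model has already produced phones and grouped them per word, which is a big simplification. The dictionary entries use ARPAbet-style symbols, and the bigram counts are hypothetical numbers, not real corpus statistics.

```python
from itertools import product

# Phonetic dictionary: word -> phones. "to", "two" and "too" are homophones.
PHONETIC_DICT = {
    "i": ("AY",),
    "want": ("W", "AA", "N", "T"),
    "to": ("T", "UW"),
    "two": ("T", "UW"),
    "too": ("T", "UW"),
    "go": ("G", "OW"),
}

# Hypothetical bigram counts standing in for a trained language model.
BIGRAM_COUNTS = {("i", "want"): 5, ("want", "to"): 8, ("want", "two"): 1, ("to", "go"): 6}

def sequence_score(words):
    """Unnormalized score: product of (bigram count + 1) over adjacent word pairs."""
    score = 1
    for prev, nxt in zip(words, words[1:]):
        score *= BIGRAM_COUNTS.get((prev, nxt), 0) + 1
    return score

def decode(phone_groups):
    """Look up candidate words for each phone group, then keep the best-scoring sequence."""
    slots = [
        [word for word, phones in PHONETIC_DICT.items() if phones == tuple(group)]
        for group in phone_groups
    ]
    # Note: max() raises ValueError if some phone group matches no dictionary word.
    return list(max(product(*slots), key=sequence_score))

heard = [("AY",), ("W", "AA", "N", "T"), ("T", "UW"), ("G", "OW")]
transcript = decode(heard)
```

The phones alone cannot tell "to", "two", and "too" apart, so the bigram scores decide, and `transcript` comes out as `["i", "want", "to", "go"]`. Real decoders search over probabilities rather than enumerating every combination.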
Machine translation, at a high level, means translating text from one language to another. In computational linguistics, machine translation essentially refers to a computer program that takes as input a sequence of words in a source language (not necessarily a well-formed, meaningful sentence every time) and converts it into a semantically similar sequence of words in the target language.
Deep learning approaches to machine translation commonly combine word embeddings in the source and target languages with Recurrent Neural Networks and sequence-to-sequence models.
However, machine translation algorithms still face several challenges. Consistently accurate output from these algorithms remains out of reach. Quality issues mainly stem from the limited contextual awareness of the models and from the different grammatical and lexical rules that languages contain.
Regional dialects and usage of vocabulary differ even though the base language remains the same. While machine translation has made significant strides since the days of GNMT, one of its considerable challenges lies in AI itself. AI is just that, artificial, and it cannot reliably recognize multiple contexts of the same word. Dialects, emotions, and tonality all vary the contextual meaning of a specific word, and algorithms are not yet at a level where they can understand this.
These challenges have rendered machine translation algorithms support software, at best, for human translators. In many ways, language is a creative art form. Machine translation algorithms that understand this creative thinking are still some time away.