Hidden Markov Model & its applications

5 min read · Apr 4, 2022

Details about how the HMM works and its applications in real life.

A Hidden Markov Model (HMM) is a statistical model that is widely used in machine learning. It describes the evolution of observable events that depend on internal factors which are not directly observable. HMMs are a class of probabilistic graphical models that allow us to predict a sequence of unknown (hidden) variables from a set of observed variables. In this article, we discuss Hidden Markov Models in detail, the contexts where they can be used, and their different applications.

Before delving into what the Hidden Markov Model is, let’s understand the Markov Chain.

Markov Chain

A Markov Chain is a model, or a type of random process, that describes the probabilities of sequences of random variables, commonly known as states. Each state takes its value from some set. In a Markov Chain, the probability of being in a state depends only on the previous state. We use a Markov Chain when we need to calculate the probability of a sequence of observable events. However, in many cases the chain itself is hidden, and at each step the current state randomly generates one of k possible observations, which is all that is visible to us. This leads to the Hidden Markov Model.
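As a minimal sketch of the idea above, the probability of a state sequence under a Markov Chain is the initial probability of the first state multiplied by one transition probability per step. The two-state weather chain, its numbers, and the function name sequence_probability are illustrative assumptions for this sketch.

```python
# Illustrative two-state chain; the numbers are assumptions for this sketch.
initial = {"Rainy": 0.6, "Sunny": 0.4}
transition = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}


def sequence_probability(states):
    """Probability of seeing exactly this sequence of states."""
    prob = initial[states[0]]
    # Each step depends only on the previous state (the Markov property).
    for prev, curr in zip(states, states[1:]):
        prob *= transition[prev][curr]
    return prob


p = sequence_probability(["Rainy", "Rainy", "Sunny"])
print(round(p, 3))  # 0.6 * 0.7 * 0.3
```

Note that every factor in the product looks back only one step, which is exactly what makes the process Markovian.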

DEFINITION OF HIDDEN MARKOV MODEL (HMM)

The Hidden Markov Model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with hidden (unobserved) states. It is used in machine learning and pattern recognition applications such as speech, handwriting, and gesture recognition.

An HMM lets us reason about both observed (visible) events and hidden events in a single probabilistic model.

Hidden Markov Model With an Example

To explain further, take the example of two friends, Rahul and Ashok. Rahul plans his daily activities according to the weather. His three main activities are going jogging, going to the office, and cleaning his residence. What Rahul does on a given day depends on the weather, and he tells Ashok whatever he does. Ashok has no direct information about the weather, but he can infer it from what Rahul did.

Ashok believes that the weather operates as a discrete Markov chain with only two states: Rainy and Sunny. Ashok cannot observe the weather directly, so the weather states are hidden from him. On each day, there is a certain chance that Rahul will perform one of the activities {“jog”, “work”, “clean”}, depending on the weather. Since Rahul tells Ashok what he has done, these activities are the observations. The entire system is a hidden Markov model (HMM).

We assume that the parameters of the HMM are known to Ashok, because he has general information about the weather and also knows what Rahul likes to do on average.

Consider a day when Rahul calls Ashok and tells him that he cleaned his residence. In that case, Ashok will believe that the day was more likely rainy. The belief Ashok holds about the weather before he hears anything from Rahul is the start probability of the HMM, which we set as follows.

The states and observations are:

states = ('Rainy', 'Sunny')

observations = ('jog', 'work', 'clean')

And the start probability is:

start_probability = {'Rainy': 0.6, 'Sunny': 0.4}

The start distribution puts more weight on the Rainy state, so the first day is more likely to be rainy. How the weather changes from one day to the next is given by the following probabilities:

transition_probability = {
   'Rainy' : {'Rainy': 0.7, 'Sunny': 0.3},
   'Sunny' : {'Rainy': 0.4, 'Sunny': 0.6},
}
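As a small sketch of how these numbers are used, the weather distribution for the next day can be obtained by weighting each row of the transition table by today's probability of that state. The function name next_day_distribution is an assumption made for this illustration.

```python
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}


def next_day_distribution(today, transition):
    """Push a distribution over today's weather one day forward."""
    tomorrow = {state: 0.0 for state in transition}
    for prev_state, p_prev in today.items():
        for next_state, p_move in transition[prev_state].items():
            tomorrow[next_state] += p_prev * p_move
    return tomorrow


tomorrow = next_day_distribution(start_probability, transition_probability)
# Rainy: 0.6 * 0.7 + 0.4 * 0.4, Sunny: 0.6 * 0.3 + 0.4 * 0.6
print({state: round(p, 2) for state, p in tomorrow.items()})
```

Applying the function repeatedly gives the forecast for later days, still without using any of Rahul's reports.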

These day-to-day changes in the weather are the transition probabilities. Given the weather on a particular day, the probabilities of each activity that Rahul will perform are:

emission_probability = {
   'Rainy' : {'jog': 0.1, 'work': 0.4, 'clean': 0.5},
   'Sunny' : {'jog': 0.6, 'work': 0.3, 'clean': 0.1},
}

These are the emission probabilities. By combining the start, transition, and emission probabilities, Ashok can infer the likely weather from the activities Rahul reports, and he can also predict which activity Rahul is likely to perform the next day.
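To make Ashok's reasoning on a single day concrete, here is a minimal sketch that applies Bayes' rule to the start and emission probabilities given in this article. The function name posterior_weather is an assumption for this illustration.

```python
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
emission_probability = {
    'Rainy': {'jog': 0.1, 'work': 0.4, 'clean': 0.5},
    'Sunny': {'jog': 0.6, 'work': 0.3, 'clean': 0.1},
}


def posterior_weather(activity, prior, emission):
    """P(weather | activity) for one day, by Bayes' rule."""
    # Joint probability of each weather state together with the reported activity.
    joint = {state: prior[state] * emission[state][activity] for state in prior}
    total = sum(joint.values())  # P(activity)
    return {state: p / total for state, p in joint.items()}


posterior = posterior_weather('clean', start_probability, emission_probability)
print({state: round(p, 3) for state, p in posterior.items()})
```

With these numbers, hearing "clean" raises the probability of Rainy from 0.6 to about 0.88, which matches the intuition that Rahul mostly cleans on rainy days.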

The image below shows the HMM process with its probabilities:

From the above intuition and example, we can see how this probabilistic model can be used to make predictions.
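One common way to turn the example into a prediction is the Viterbi algorithm, which finds the single most likely sequence of hidden weather states for a sequence of reported activities. The sketch below uses the probabilities from this article; the three-day report ('clean', 'work', 'jog') is a made-up input for illustration, and the code is a plain-Python sketch rather than a reference implementation.

```python
states = ('Rainy', 'Sunny')
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}
emission_probability = {
    'Rainy': {'jog': 0.1, 'work': 0.4, 'clean': 0.5},
    'Sunny': {'jog': 0.6, 'work': 0.3, 'clean': 0.1},
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state path and its probability."""
    # best[t][s]: probability of the best path that ends in state s on day t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: previous state on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: best[t - 1][r] * trans_p[r][s])
            best[t][s] = best[t - 1][prev] * trans_p[prev][s] * emit_p[s][observations[t]]
            back[t][s] = prev
    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical three-day report from Rahul.
path, prob = viterbi(('clean', 'work', 'jog'), states, start_probability,
                     transition_probability, emission_probability)
print(path, prob)
```

For long sequences the products become very small, so practical implementations usually work with log probabilities instead.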

HIDDEN MARKOV MODEL ADVANTAGES AND DISADVANTAGES

Advantages

The HMM is a well-studied probabilistic graphical model, and algorithms for both exact and approximate learning and inference are known for it.
HMMs capture the dependency between consecutive measurements, as defined in the switch continuity principle.
In appliance load modeling, HMMs represent the variance of appliances’ power demands via probability distributions.

Disadvantages

A basic HMM cannot represent dependencies between appliances. The conditional HMM can capture these dependencies, though.
Because of its Markovian nature, an HMM does not take into account the sequence of states leading up to any given state.
For the same reason, HMMs do not explicitly capture the time spent in a given state. The hidden semi-Markov model, however, can capture that kind of behavior.

Application of Hidden Markov Model

HMMs are used in applications where the goal is to recover a data sequence that cannot be observed directly, while other data that depend on that sequence can be observed. With this intuition in mind, HMMs can be used in the following applications:

Computational finance
Speed analysis
Speech recognition
Speech synthesis
Part-of-speech tagging
Document separation in scanning solutions
Machine translation
Handwriting recognition
Time series analysis
Activity recognition
Sequence classification
Transportation forecasting

Hidden Markov Models in NLP

From the applications listed above, we can see that HMMs suit sequential data such as time series, audio, video, and text. In this article, our main focus is on NLP applications of the HMM, and one application from the list above is part-of-speech (POS) tagging. Next, we look at what POS tagging is and how it maps onto an HMM.

What is POS-tagging?

Part-of-speech (POS) tagging is a popular Natural Language Processing task. In corpus linguistics, part-of-speech tagging, also called grammatical tagging, is the process of marking up a word in a text as corresponding to a particular part of speech, based on both its definition and its context. When an HMM is used for POS tagging, the tags are the hidden states and the words are the observations.
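As a rough sketch of how an HMM tagger gets its parameters, the transition and emission probabilities can be estimated by counting over a tagged corpus. The three-sentence corpus, the tag names, and the "<s>" start marker below are all made up for illustration; a real tagger would use a large annotated corpus and some form of smoothing.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged corpus: (word, tag) pairs per sentence.
tagged_sentences = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

transition_counts = defaultdict(Counter)  # previous tag -> next tag counts
emission_counts = defaultdict(Counter)    # tag -> word counts

for sentence in tagged_sentences:
    prev_tag = "<s>"  # assumed marker for the start of a sentence
    for word, tag in sentence:
        transition_counts[prev_tag][tag] += 1
        emission_counts[tag][word] += 1
        prev_tag = tag


def normalize(counts):
    """Turn each row of counts into a probability distribution."""
    return {
        key: {item: c / sum(row.values()) for item, c in row.items()}
        for key, row in counts.items()
    }


transition_probability = normalize(transition_counts)
emission_probability = normalize(emission_counts)

print(transition_probability["DET"])  # which tags follow a determiner
print(emission_probability["NOUN"])   # which words a noun emits
```

Here the tags play the role of the weather and the words play the role of Rahul's activities, so the same decoding ideas from the weather example apply to tagging a new sentence.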

We can summarize the following points about the HMM from the sections above, which covered what a Hidden Markov Model (HMM) is and where it is used.

The data visible to us are the observations, not the hidden states themselves.
Using the Forward Algorithm, we can find the conditional distribution over the hidden states.
Using the Viterbi Algorithm, we can find the sequence of hidden states in the form of a Viterbi path.
The forward and backward passes of the Baum-Welch algorithm compute the expected hidden states given the observed data, and these expectations are then used to re-estimate the model parameters.
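The forward algorithm mentioned in the summary can be sketched in a few lines. The version below reuses the weather probabilities from this article and returns both the likelihood of the observed activities and the distribution over the hidden state on the last day. The observation sequence is a made-up input, and the code is a minimal illustration without the scaling that long sequences would need.

```python
states = ('Rainy', 'Sunny')
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}
emission_probability = {
    'Rainy': {'jog': 0.1, 'work': 0.4, 'clean': 0.5},
    'Sunny': {'jog': 0.6, 'work': 0.3, 'clean': 0.1},
}


def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) and P(last hidden state | observations)."""
    # alpha[s]: probability of the observations so far, ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        # Sum over every possible previous state, then emit today's observation.
        alpha = {
            s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][obs]
            for s in states
        }
    likelihood = sum(alpha.values())
    filtered = {s: a / likelihood for s, a in alpha.items()}
    return likelihood, filtered


# Hypothetical three-day report from Rahul.
likelihood, filtered = forward(('clean', 'work', 'jog'), states, start_probability,
                               transition_probability, emission_probability)
print(round(likelihood, 5), {s: round(p, 3) for s, p in filtered.items()})
```

The only difference from the Viterbi recursion is that the forward algorithm sums over previous states where Viterbi takes the maximum.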

Thank you!!