The Kentucky Open Source Society



September KyOSS: HMM-Pandas

Date: Wednesday, Sep 12, 2018
Time: 6:30 pm - 8:30 pm
Location: TEK Systems, first-floor conference room, 700 N. Hurstbourne Pkwy. (Steel Technologies Bldg.)

Shane Kimble will be telling us about his statistical project in Python: HMM-Pandas - A sane, tabular representation of the Hidden Markov Model and its algorithms.


Much of the research, material, and code covering the Hidden Markov Model (HMM) and its algorithms is mathematically convoluted. This project presents the HMM and its algorithms using Pandas dataframes. The objectives of this project are to demonstrate the HMM using a simple, sane Python implementation; to make the HMM understandable to anyone; to create visually appealing representations; and to improve on the performance of existing implementations. In addition, detailed comments are provided throughout the code.
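
To make the tabular idea concrete, here is a minimal sketch of an HMM held in Pandas objects. It is not taken from the project: the state names, observation names, and probabilities are all made up for illustration.

```python
import pandas as pd

# Hypothetical states and observations, chosen only for illustration.
states = ["happy", "grumpy"]
observations = ["coffee", "hoodie", "pizza"]

# Initial distribution: one probability per hidden state.
start = pd.Series([0.6, 0.4], index=states, name="start")

# Transition table: row = current state, column = next state.
transition = pd.DataFrame(
    [[0.7, 0.3],
     [0.4, 0.6]],
    index=states, columns=states,
)

# Emission table: row = hidden state, column = observation.
emission = pd.DataFrame(
    [[0.5, 0.1, 0.4],
     [0.3, 0.6, 0.1]],
    index=states, columns=observations,
)

# Sanity check: every row of a probability table sums to 1.
assert abs(start.sum() - 1.0) < 1e-12
assert ((transition.sum(axis=1) - 1.0).abs() < 1e-12).all()
assert ((emission.sum(axis=1) - 1.0).abs() < 1e-12).all()

print(emission)
```

With labeled rows and columns, a lookup such as emission.loc["grumpy", "hoodie"] reads like the probability statement it stands for.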



For the Viterbi and Forward-Backward algorithms, a set of observations and a set of hidden states are defined. The observations describe your average programmer and his outward appearance. What is the programmer wearing, eating, or drinking? The hidden states describe how the programmer is feeling. Does the programmer have a high and mighty attitude? When provided a sequence of observations, the '' script will predict the most likely sequence of hidden states for the programmer, while the '' script will calculate the posterior marginals for all hidden states. Need more verbosity? Use the 'adv' versions!
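
As a rough illustration of what "posterior marginals" means here, the following sketch runs the Forward-Backward algorithm on a toy model. It is an assumption-laden example, not the project's script: the states, observations, probabilities, and the function name posterior_marginals are hypothetical.

```python
import pandas as pd

# Hypothetical toy model (not the project's actual tables).
states = ["happy", "grumpy"]
start = pd.Series([0.6, 0.4], index=states)
transition = pd.DataFrame([[0.7, 0.3], [0.4, 0.6]], index=states, columns=states)
emission = pd.DataFrame(
    [[0.5, 0.1, 0.4], [0.3, 0.6, 0.1]],
    index=states, columns=["coffee", "hoodie", "pizza"],
)

def posterior_marginals(obs_seq):
    """Return P(state at step t | all observations) as one row per step."""
    # Forward pass: alpha[t][j] = P(o_1..o_t, state_t = j).
    alpha = [start * emission[obs_seq[0]]]
    for obs in obs_seq[1:]:
        alpha.append(transition.T.dot(alpha[-1]) * emission[obs])
    # Backward pass: beta[t][i] = P(o_{t+1}..o_T | state_t = i).
    beta = [pd.Series(1.0, index=states)]
    for obs in reversed(obs_seq[1:]):
        beta.insert(0, transition.dot(emission[obs] * beta[0]))
    # Combine both passes, then normalise each step so its row sums to 1.
    gamma = pd.DataFrame(
        [(a * b).to_numpy() for a, b in zip(alpha, beta)], columns=states
    )
    return gamma.div(gamma.sum(axis=1), axis=0)

gamma = posterior_marginals(["coffee", "hoodie", "pizza"])
print(gamma)
```

Each row of the result is a probability distribution over the hidden states for one observation, which is different from the single best path that Viterbi returns.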

Unlike previous implementations, this project takes a vectorized approach to dynamic programming: at each step, the values for all hidden states are computed at once with Pandas operations rather than with nested loops. As a result, the scripts run faster than implementations that loop over every pair of states. The only remaining source of iteration is the sequence of observations.
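
The sketch below shows one way such a vectorized Viterbi step could look. It is not the project's code: the toy tables and the function name viterbi are assumptions made for illustration. The single Python loop runs over the observations, and everything inside it operates on whole tables.

```python
import pandas as pd

# Hypothetical toy model (not the project's actual tables).
states = ["happy", "grumpy"]
start = pd.Series([0.6, 0.4], index=states)
transition = pd.DataFrame([[0.7, 0.3], [0.4, 0.6]], index=states, columns=states)
emission = pd.DataFrame(
    [[0.5, 0.1, 0.4], [0.3, 0.6, 0.1]],
    index=states, columns=["coffee", "hoodie", "pizza"],
)

def viterbi(obs_seq):
    """Return (most likely hidden-state path, probability of that path)."""
    # Best path probability ending in each state after the first observation.
    delta = start * emission[obs_seq[0]]
    backpointers = []
    for obs in obs_seq[1:]:  # the only Python-level loop
        # scores.loc[i, j] = delta[i] * P(j | i), for all state pairs at once.
        scores = transition.mul(delta, axis=0)
        # Best predecessor for each destination state (one per column).
        backpointers.append(scores.idxmax(axis=0))
        delta = scores.max(axis=0) * emission[obs]
    # Follow the backpointers from the best final state to recover the path.
    path = [delta.idxmax()]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    return path[::-1], delta.max()

path, prob = viterbi(["coffee", "hoodie", "pizza"])
print(path)  # ['happy', 'grumpy', 'happy']
```

For long sequences, multiplying probabilities this way underflows; working with log-probabilities and sums instead of products is the usual remedy.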

Getting Started

This project requires few dependencies and should be trivial to set up. However, an in-depth understanding of the HMM and its associated algorithms requires some knowledge of probability theory, data science, and dynamic programming.


Occasionally, ties occur during the dynamic programming process. Pandas selects the uppermost value in the column to determine the tie-breaking hidden state.
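
The tie-breaking behavior can be seen in a small example. The table below is hypothetical; the only library behavior relied on is that idxmax returns the first row label holding the maximum.

```python
import pandas as pd

# Hypothetical score table in which the "happy" column holds a tie.
scores = pd.DataFrame(
    {"happy": [0.25, 0.25], "grumpy": [0.1, 0.3]},
    index=["happy", "grumpy"],
)

# For each column, pick the row label with the largest value.
winners = scores.idxmax(axis=0)
print(winners["happy"])  # happy (the top row wins the tie)

# Reordering the rows changes which state wins the same tie.
flipped = scores.loc[["grumpy", "happy"]].idxmax(axis=0)
print(flipped["happy"])  # grumpy
```

In other words, the row order of the state table decides the winner of a tie, so reordering the states can change the reported path without changing its probability.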



I'd recommend using these scripts interactively with Jupyter Notebook via VS Code, Atom's Hydrogen, Sublime's Hermes, PyCharm, or your web browser. Spyder and/or IPython will also work. Do not use the standard Python interpreter.

Event posted by frappyjohn, 8:04 pm Sep 11, 2018
Last edited by frappyjohn, 8:24 pm Sep 11, 2018