• Divakar V

CTC Loss (part-2) | Forward pass using alpha matrix

Updated: Jan 3, 2021

In this post, we will see how the CTC loss is calculated efficiently using dynamic programming. We do that by building a 2D matrix, known as the alpha matrix.


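Each cell of the alpha matrix is indexed by two things:

  1. Time step t

  2. State s (the prefix 1:s of the target sequence, with blanks inserted between the labels and at both ends)

The cell at (t, s) holds the total probability of all alignments that have reached state s at time step t.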
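To make the two indices concrete, here is a minimal sketch of the forward pass in plain Python. It is my own illustration, not code from the slides or the paper: the function name `ctc_forward`, the argument names, and the choice of index 0 for the blank symbol are all assumptions. It expects `probs[t][k]` to be the (already softmaxed) probability of symbol `k` at time step `t`, and it works with raw probabilities rather than log-probabilities, so it is only suitable for short toy inputs.

```python
def ctc_forward(probs, target, blank=0):
    """Build the alpha matrix for one sequence (illustrative sketch).

    probs  : list of T rows; probs[t][k] is the probability of symbol k at time t
    target : list of label indices without blanks
    blank  : index of the blank symbol (assumed to be 0 here)
    Returns (alpha, likelihood), where the CTC loss would be -log(likelihood).
    """
    # Extended target: a blank before, between, and after the labels.
    ext = [blank]
    for label in target:
        ext += [label, blank]

    T, S = len(probs), len(ext)
    alpha = [[0.0] * S for _ in range(T)]

    # At t = 0 an alignment can only start at the first blank or the first label.
    alpha[0][0] = probs[0][ext[0]]
    if S > 1:
        alpha[0][1] = probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1][s]              # stay in the same state
            if s >= 1:
                total += alpha[t - 1][s - 1]     # advance by one state
            # Skipping over a blank is allowed only when the current state is
            # a label that differs from the label two states back.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1][s - 2]
            alpha[t][s] = total * probs[t][ext[s]]

    # A valid alignment ends on the last label or on the trailing blank.
    likelihood = alpha[T - 1][S - 1]
    if S > 1:
        likelihood += alpha[T - 1][S - 2]
    return alpha, likelihood


# Toy example: two time steps, symbols {0: blank, 1: "a"}, target "a".
toy_probs = [[0.4, 0.6], [0.3, 0.7]]
alpha, likelihood = ctc_forward(toy_probs, [1])
print(likelihood)  # approximately 0.88
```

In the toy example the three valid alignments are "aa", "a-" and "-a", with probabilities 0.42, 0.18 and 0.28, which is where the total of about 0.88 comes from. A practical implementation would typically do the same recursion in log space to avoid underflow on long sequences.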


I felt it would be easier to illustrate the algorithm through a video than by writing a detailed post for each step. So, with high hopes, my original plan was to create a presentation video illustrating the algorithm step by step. However, I got only four slides in before I was bogged down trying to get a perfect recording of each step. So, change of plans... no video this time.


I enjoyed the process of structuring my understanding of the concept into a slide deck, and I tried my best to go through each step slowly, with a corresponding visual for it. The best way to use it is to first go through the original research paper and develop a basic understanding. Since I am not covering the concept in detail, I suggest also going through the references listed at the bottom of this post.


You can find the presentation below, which illustrates how the alpha matrix is built and how I understood it. You can jump directly to slide 9 for a walkthrough of an example string.




References

  1. Research Paper

  2. https://distill.pub/2017/ctc/#d-footnote-2

  3. https://sid2697.github.io/Blog_Sid/algorithm/2019/10/19/CTC-Loss.html

  4. https://towardsdatascience.com/intuitively-understanding-connectionist-temporal-classification-3797e43a86c