© 2020 Strange Loop
Did you recently face a combinatorial optimization problem? These problems come in many guises, from toy problems like the money-changing problem and solving the word game "boggle" to real world problems in text search, pattern matching, route finding, and biological sequence analysis.
Simply combining the multiple cases to recursively solve each subproblem leads to the dreaded "combinatorial explosion" and a gigantic search space. The space grows exponentially with the input length, and finding an optimal solution is considered intractable, even on a powerful compute cluster.
Dynamic programming is a powerful technique to tame these problems. It reduces complexity by reusing intermediate results that are stored in a table, instead of computing them again. The only requirement is that the problem fulfills "Bellman's principle of optimality", which states that it is okay to push the optimization function down into the recursive computation. We can tabulate optimal intermediate results without losing solutions we need later. With this trick, the complexity drops to merely filling the dynamic programming tables, and exponentially hard problems become polynomial.
Dynamic programming algorithms are usually expressed in the form of recursive formulas, recurrences. They contain subscripts to express details of table access via indices, index fiddling to navigate to the correct previous solutions, and a mix of tabulated and non-tabulated solutions.
According to the cases of the recurrences, we have to decide what to tabulate, the tables need to be filled in the correct order, and we need to know where our final result ends up in the table. To print out the construction of a candidate that yields the optimal solution, we have to backtrack in the table and in each step take care of all recurrence cases again.
Developing and implementing such an algorithm can be tedious, error-prone and the resulting programs can be hard to debug.
We will explore a formal framework for specifying dynamic programming algorithms on sequences in an abstract and modular way. Using tree grammars and evaluation algebras, we don't need subscripts to access intermediate results. By separating the problems of search space description, candidate description, candidate evaluation and tabulation, it helps us in thinking about and formalizing dynamic programming algorithms.
Bellman's GAP is a programming language derived from that formalism, which allows us to describe the problem in abstract building blocks, and yet get efficient programs. The compiler of the language handles tabulation, access of intermediate results and backtracking. We can focus on the fun part, the algorithm itself, and equip it with different evaluation strategies and scoring schemes.
I will introduce concepts and language based on the real-world application to molecular structure prediction, and show how they can help craft a solution for your favorite dynamic programming problem.