Approximating the smallest grammar: Kolmogorov complexity in natural models
We consider the problem of finding the smallest context-free grammar that generates exactly one given string of length n. The size of this grammar is of theoretical interest as an efficiently computable variant of Kolmogorov complexity. The problem is of practical importance in areas such as data compression and pattern extraction.

The smallest grammar is known to be hard to approximate to within a constant factor, and an o(log n / log log n) approximation would require progress on a long-standing algebraic problem. Previously, the best proved approximation ratio was O(n^{1/2}) for the Bisection algorithm. Our main result is an exponential improvement of this ratio; we give an O(log(n/g*)) approximation algorithm, where g* is the size of the smallest grammar.

We then consider other computable variants of Kolmogorov complexity. In particular, we give an O(log^2 n) approximation for the smallest non-deterministic finite automaton with advice that produces a given string. We also apply our techniques to "advice-grammars" and "edit-grammars", two other natural models of string complexity.
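
To make the objects in the abstract concrete, the following is a minimal Python sketch of a bisection-style grammar construction. It is an illustration under stated assumptions, not the O(log(n/g*)) algorithm announced above: the split rule (left part of length equal to the largest power of two strictly less than the substring length), the size measure (total number of right-hand-side symbols), and the names `bisection_grammar`, `grammar_size`, and `expand` are all choices made here for the example.

```python
def bisection_grammar(s):
    """Build a grammar that derives exactly the string s.

    Terminals are one-character strings; nonterminals are integers.
    Returns (rules, start), where rules maps each nonterminal to a
    tuple of right-hand-side symbols.
    """
    if not s:
        raise ValueError("input string must be non-empty")

    rules = {}   # nonterminal id -> tuple of symbols
    seen = {}    # substring -> nonterminal already created for it

    def build(t):
        if len(t) == 1:
            return t              # a single character stands for itself
        if t in seen:
            return seen[t]        # reuse the rule for a repeated substring
        # Assumed split rule: largest power of two strictly below len(t).
        k = 1
        while k * 2 < len(t):
            k *= 2
        left, right = build(t[:k]), build(t[k:])
        name = len(rules)         # fresh id, assigned after the recursive calls
        rules[name] = (left, right)
        seen[t] = name
        return name

    start = build(s)
    if isinstance(start, str):    # one-character input: add a trivial rule
        rules[0] = (start,)
        start = 0
    return rules, start


def grammar_size(rules):
    """Grammar size, taken here as the total number of right-hand-side symbols."""
    return sum(len(rhs) for rhs in rules.values())


def expand(rules, symbol):
    """Expand a symbol back into the string it derives."""
    if isinstance(symbol, str):
        return symbol
    return "".join(expand(rules, x) for x in rules[symbol])
```

For a highly repetitive input such as `"abababab"`, this sketch reuses the rule for each repeated half, so the grammar has three rules of two symbols each, while the string itself has eight characters. On less regular inputs the fixed split points can miss repetitions, which is consistent with the abstract's point that Bisection has only an O(n^{1/2}) proved ratio.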