Formal language theory (FLT), part of the broader mathematical theory of computation, provides a systematic terminology and set of conventions for describing rules and the structures they generate, along with a rich body of discoveries and theorems concerning generative rule systems. Despite its name, FLT is not limited to human language, but is equally applicable to computer programs, music, visual patterns, animal vocalizations, RNA structure and even dance. In the last decade, this theory has been profitably used to frame hypotheses and to design brain imaging and animal-learning experiments, mostly using the ‘artificial grammar-learning’ paradigm. We offer a brief, non-technical introduction to FLT, followed by a more detailed analysis of empirical research based on this theory. We suggest that progress has been hampered by a pervasive conflation of distinct issues, including hierarchy, dependency, complexity and recursion. We clarify several relevant hypotheses and the experimental designs necessary to test them. Finally, we review the recent brain imaging literature that uses formal languages, identifying areas of convergence and outstanding debates. We conclude that FLT has much to offer scientists who are interested in rigorous empirical investigations of human cognition from a neuroscientific and comparative perspective.