Sign Up Today
☀️ 🌙

Arch-Engineer

Types Make You Not Type

Research Corner: Type-Driven Synthesis

It's been said that "types make you type." That is, languages that require type specifiers tend to have longer lines than ones that don't. Steve Yegge once lampooned this tendency by showing how a common variable declaration in Java is long enough to be a haiku.

It's also been said of languages with more expressive type systems "if it typechecks, it probably works." This happens particularly with pure functions with polymorphic types, where there are so few things that can be done with each input that sometimes the only reasonable-looking function that can be written is the desired one. So, can you avoid typing together and just generate the function from the type signature?

There have been a few attempts at this sort of thing before. First, jungloid mining came along, generating long chains of function calls to help Enterprise Java programmers writing FizzBuzz go from an IntegerStringReturnerFactory to an ApplicationContextHolder. Then SyPet arrived, which could output snippets accepting not one but multiple inputs in order to achieve the desired output. Finally, Hoogle+ extended the SyPet algorithm to polymorphic types, automatically generating Haskell programs from their type signatures. But it was too slow.

9 years ago, I set out to create a new way of building tools that modify code in an entire program with as few unnecessary changes as possible. I discovered an ingenious approach that didn't work. In 2017, I failed again. In 2021, I came up with an even more sophisticated approach and failed again. But I did succeed at finding a way to make Hoogle+ much faster.

Equality-constrained tree automata, or ECTAs, are a new technique I developed which allow you to create a small data structure representing 10^100 or more programs, and find the one you want out of all of those. It's not the first technique that can do this. But it is the first that can do it while imposing constraints on the programs. In this case, we can set up an ECTA representing all well-typed Haskell programs with given inputs and a target output, click "run," and we have a working synthesizer, 8x faster than Hoogle+ despite having 1/10th as much code.

ECTA visualization
ECTA examples

Thanks to my collaborator Matthías Gissurarson, Hectare is now a downloadable tool integrated with the Haskell compiler, and will be available as an optional extension in the next version of the Haskell Language Server.

Hectare demonstration

(Perhaps it's too fast; its filtering could still be improved.)

Recently, we went on the Haskell Interlude to discuss Hectare, as well as ECTAs in general, and Mattías's work on automated bug-fixing in Haskell.

Our Haskell Interlude Appearance

I'm pretty excited by all of this. ECTAs let you take many program synthesis problems and instantly solve them. Mattías and I have also written a paper (in submission) on using them to automatically generate property-based tests. And I still think they'll work for my original goal of building tools that can rewrite entire programs without unnecessary changes, but I'd need to go back to full-time academia to find out.

If you're not a Haskell programmer, what lessons can we take from all this?

It's been said that types reduce the number of wrong programs you can write without ruling out the correct one. Now we can count exactly how many programs there are with and without the types. Maybe we'll build something that can quantify the benefit of other kinds of constraints as well. Or maybe you will?

The ECTA and Hectare paper
My talk on ECTAs at ICFP '22

Happy coding,
Jimmy Koppel