Sooner or later, everything ends up in Microsoft Excel.
The 37-year-old spreadsheet application has been used to run Doom and Pac-Man, produce stop-motion animation, and host a turn-based role-playing game, chess, and even a neural network, among other things.
Excel’s latest trick comes courtesy of Microsoft’s own software developers: “FLAME: A small language model for spreadsheet formulas.”
It’s detailed in a preprint paper from Microsoft researchers Harshit Joshi, Abishai Ebenezer, José Cambronero, Sumit Gulwani, Aditya Kanade, Vu Le, Ivan Radiček, and Gust Verbruggen. The paper describes an assistive AI system called FLAME. It’s a small language model that can improve the creation and maintenance of Excel formulas.
Large language models like OpenAI’s ChatGPT are all the rage at the moment. These are statistical models, trained on vast amounts of text, that predict likely output from a text prompt.
The problem with large language models is that they’re, well, large – training requires lots of input data and money, and using the resulting model for inference also demands a lot of hardware. For example, the researchers cite InCoder-6.7B, a 6.7-billion-parameter model trained for code infilling on 159GB of source code over 24 days using 248 Nvidia V100 GPUs.
Lambda Labs has estimated that training GPT-3, a 175-billion-parameter model, would cost about $4.6 million using Tesla V100 cloud instances.
Weighing in at a mere 60M parameters, FLAME is “the first language model designed exclusively for Excel formulas.” The paper doesn’t spell the acronym out, but we suspect FLAME stands for “First LAnguage Model for Excel.”
Despite its modest size, FLAME manages to outperform much larger models tuned for completing lines of code (code infilling), including CodeT5 (220M), Codex-Cushman (12B), and Codex-Davinci (175B).
FLAME is designed to autocomplete Excel formulas or repair malformed ones, and to handle syntax reconstruction, a task in which delimiters (eg, curly braces) have been stripped out of a formula and the model has to recognize what’s missing and rebuild the full, correctly punctuated formula.
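To make the syntax-reconstruction task concrete, here’s a minimal Python sketch of what the degraded input looks like. It’s purely illustrative: the helper name and the exact set of characters stripped are our own assumptions, not taken from the paper.

# Illustrative sketch of the syntax-reconstruction task; the exact
# delimiters FLAME strips, and how it tokenizes formulas, are described
# in the paper rather than reproduced here.
DELIMITERS = set("(),\"'{}")  # parentheses, commas, quotes, braces

def strip_delimiters(formula):
    # Remove delimiter characters, producing the degraded model input.
    return "".join(ch for ch in formula if ch not in DELIMITERS)

original = '=IF(AND(A2="", B2=""), "yes", "no")'
degraded = strip_delimiters(original)
print(degraded)  # => '=IFANDA2= B2= yes no'
# The model's job is to map the degraded string back to the original,
# fully delimited formula.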
So in some future version of Excel, once FLAME has been wired into the software, entering a buggy formula like this…
=IF('Jan 13'!B2="", 'Feb 13'!B2="", 'Mar 13'!B2="", 'Apr 13'!B2="", yes, no)
…could end up looking like this with the help of FLAME’s corrective ability, with the four emptiness checks wrapped in a single AND() and the yes/no literals quoted.
=IF(AND('Jan 13'!B2="", 'Feb 13'!B2="", 'Mar 13'!B2="", 'Apr 13'!B2=""), "yes", "no")
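The repaired formula reads: if all four month cells are blank, return “yes”, otherwise “no”. Here’s a rough Python equivalent of that logic – our own illustration rather than anything FLAME produces – just to spell out what the fix does.

# Rough Python equivalent of the repaired formula (illustrative only):
# return "yes" when every listed cell is empty, otherwise "no".
def repaired_formula(cells):
    months = ["'Jan 13'!B2", "'Feb 13'!B2", "'Mar 13'!B2", "'Apr 13'!B2"]
    return "yes" if all(cells[ref] == "" for ref in months) else "no"

# Only one of the four cells is blank here, so the result is "no".
print(repaired_formula({
    "'Jan 13'!B2": "",
    "'Feb 13'!B2": "42",
    "'Mar 13'!B2": "17",
    "'Apr 13'!B2": "99",
}))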
And because FLAME gets there with two orders of magnitude less training data than Codex and other large language models, Microsoft should find it much more affordable to deploy when it’s ready.
For those who have to maintain large spreadsheets with lots of formulas, your humble vulture has to say, FLAME looks pretty cool. ®