The design of industrial processes

When organic chemists identify a useful chemical compound — a new drug, for instance — it’s up to chemical engineers to determine how to mass-produce it.
There could be 100 different sequences of reactions that yield the same end product. But some of them use cheaper reagents and lower temperatures than others, and perhaps most importantly, some are much easier to run continuously, with technicians occasionally topping up reagents in different reaction chambers.
Historically, determining the most efficient and cost-effective way to produce a given molecule has been as much art as science. But MIT researchers are trying to put this process on a more secure empirical footing, with a computer system that’s trained on thousands of examples of experimental reactions and that learns to predict what a reaction’s major products will be.
The researchers’ work appears in the American Chemical Society’s journal ACS Central Science. Like all machine-learning systems, theirs presents its results in terms of probabilities. In tests, the system was able to predict a reaction’s major product 72 percent of the time; 87 percent of the time, it ranked the major product among its three most likely results.
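The two figures above are what machine-learning practitioners usually call top-1 and top-3 accuracy: the fraction of test reactions whose recorded major product is the system's first choice, or appears anywhere among its three highest-ranked candidates. The sketch below illustrates that calculation only. It is not the researchers' code; the function name `top_k_accuracy` and the placeholder product labels are hypothetical, and the toy numbers are not the paper's results.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_products: Sequence[str],
    k: int,
) -> float:
    """Fraction of reactions whose recorded major product is among the
    top k candidates, where each candidate list is sorted most probable first."""
    if len(ranked_predictions) != len(true_products):
        raise ValueError("need one ranked candidate list per reaction")
    if not true_products:
        raise ValueError("need at least one reaction")
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_products)
    )
    return hits / len(true_products)


# Hypothetical toy data: four reactions, three ranked candidates each.
# The letters are placeholders, not real compounds.
ranked = [
    ["A", "B", "C"],
    ["D", "E", "F"],
    ["G", "H", "I"],
    ["J", "K", "L"],
]
truth = ["A", "E", "I", "Z"]

top1 = top_k_accuracy(ranked, truth, k=1)  # only the first reaction is a hit
top3 = top_k_accuracy(ranked, truth, k=3)  # the first three reactions are hits
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

By construction, top-3 accuracy can never be lower than top-1 accuracy on the same test set, which is consistent with the reported 87 percent sitting above the 72 percent.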
“There’s clearly a lot understood about reactions today,” says Klavs Jensen, the Warren K. Lewis Professor of Chemical Engineering at MIT and one of four senior authors on the paper, “but it’s a highly evolved, acquired skill to look at a molecule and decide how you’re going to synthesize it from starting materials.”
With the new work, Jensen says, “the vision is that you’ll be able to walk up to a system and say, ‘I want to make this molecule.’ The software will tell you the route you should make it from, and the machine will make it.”
With a 72 percent chance of identifying a reaction’s chief product, the system is not yet ready to anchor the type of completely automated chemical synthesis that Jensen envisions. But it could help chemical engineers more quickly converge on the best sequence of reactions — and possibly suggest sequences that they might not otherwise have investigated.