← back to legends
DM
machine learning pioneer

Donald Michie

1923–2007

British researcher who worked alongside Turing at Bletchley and went on to pioneer reinforcement learning with MENACE — a matchbox-and-bead system that learned to play tic-tac-toe.

key contributions
  • MENACE (1961) — a physical reinforcement-learning machine made of matchboxes.
  • Founded the Department of Machine Intelligence at Edinburgh.

FAQ

What is MENACE?

Michie's 1961 reinforcement-learning machine made of 304 matchboxes filled with coloured beads. Each box represented a tic-tac-toe position, and beads encoded moves; reward and punishment were applied by adding or removing beads after each game.

What did Donald Michie do at Bletchley Park?

He joined the Government Code and Cypher School at 18 and worked alongside Turing and Max Newman on cryptanalysis, including the Tunny project that broke the German Lorenz cipher used for high-command traffic.

What is reinforcement learning?

A machine-learning paradigm in which an agent learns by taking actions and receiving rewards. MENACE was a physical instance of it in 1961; the modern resurgence powers AlphaGo, ChatGPT's RLHF post-training, and most robotics policies.

What is Michie's relationship to Turing?

They were colleagues at Bletchley during the war. Michie later said that long conversations with Turing about whether machines could learn shaped his entire career, and his founding of the Edinburgh Department of Machine Intelligence was a direct continuation of that programme.