This project examines the potential of machine learning and artificial intelligence (AI) to support nuclear safeguards and arms-control verification. We propose a network of autonomous mobile robots (“inspector bots”) equipped with directional neutron detectors to carry out selected inspection tasks in different types of facilities. Inspector bots could be more effective and, if designed appropriately, less intrusive than human inspectors.
In designing the system, we use the multi-armed bandit framework for decision-making under uncertainty. Named after a gambler’s choice among slot-machine levers, the multi-armed bandit has become the model problem in machine learning and AI for a widespread task: choosing among reward-bearing options when the rewards are uncertain. Optimal strategies address the key tradeoff between selecting (“exploiting”) well-sampled options that are expected to yield high reward and selecting (“exploring”) poorly sampled options that might yield even higher reward. When the multi-armed bandit problem carries real-world constraints, for example a maximum time played or a maximum distance travelled, it becomes a challenging optimization problem and an important use case for machine learning and AI.
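To make the explore/exploit tradeoff concrete, here is a minimal sketch of a budget-limited bandit. It uses UCB1, a standard textbook strategy chosen here only for illustration; it is not claimed to be the method used in this project. The function name `ucb1`, the simulated Bernoulli arms, their success probabilities, and the pull budget (standing in for "maximum time played") are all hypothetical.

```python
import math
import random


def ucb1(arm_means, budget, seed=0):
    """Run UCB1 on simulated Bernoulli arms for at most `budget` pulls.

    arm_means: hypothetical success probability of each arm (unknown to the player).
    budget:    total number of pulls allowed, a stand-in for "maximum time played".
    Returns (counts, total_reward), where counts[a] is how often arm a was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # accumulated reward per arm

    for t in range(1, budget + 1):
        if t <= n_arms:
            # Initialization: pull every arm once so each has an estimate.
            arm = t - 1
        else:
            # Score = empirical mean (exploitation term)
            #       + confidence bonus that shrinks as an arm is sampled (exploration term).
            scores = [
                sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=scores.__getitem__)

        # Simulated Bernoulli reward (assumption: rewards are 0 or 1).
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts, sum(sums)


if __name__ == "__main__":
    counts, total = ucb1([0.1, 0.5, 0.9], budget=2000)
    print("pulls per arm:", counts)
    print("total reward:", total)
```

Poorly sampled arms receive a large bonus and are tried again, while well-sampled arms are chosen mainly on their observed average, so the pulls concentrate on the best arm as the budget is spent. A distance constraint would need a cost per pull that depends on the previous arm, which this sketch does not model.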