Using Forecasting Methodologies to Look to the Future of the North Korean Nuclear Program

How can we improve our understanding of the future of the North Korean nuclear program? By using a methodology known as forecasting, we may be able to identify the most likely trajectories that nuclear programs may follow. Forecasting is an analytical skill that can be learned and improved upon; it requires forecasters to provide precise statistical boundaries on the chance of clearly defined future events.[1]

As a predictive technique, forecasting has been embraced by sectors ranging from financial institutions to the US intelligence community, and it may be a useful addition to the analytical toolbox of North Korea watchers. It can help to address uncertainties regarding future developments by breaking them apart and interrogating underlying assumptions. In fact, this methodology has already been applied to better understand the Democratic People’s Republic of Korea’s (DPRK or North Korea) nuclear program, a particularly difficult challenge for conventional analysis due to the opacity of information. In February 2024, the Open Nuclear Network (ONN), the Verification Research, Training and Information Centre (VERTIC), and the James Martin Center for Nonproliferation Studies (CNS) held a workshop to forecast North Korean nuclear futures with the help of the Swift Centre, a team of forecasting facilitators founded by a former director at the Good Judgement Project.

VERTIC Researcher Hailey Wingo interviewed the organizers, Dr. Grant Christopher (Co-Director of the Verification and Monitoring Program, VERTIC), and Sarah Laderman (Senior Analyst, ONN) to extract their insights from running this workshop on whether forecasting is a useful tool for the North Korea watcher community.

Hailey Wingo (HW): What is forecasting, and how could it be relevant for the DPRK watcher community?

Grant Christopher (GC): Forecasting is usually explained through the “wisdom of the crowd”: an analytical and open-minded group can outperform subject-matter experts when predicting things. A classic example of the wisdom of crowds is the fairground game of guessing the number of jellybeans in a jar, where averaging over the guesses of many participants can get close to the real answer (potentially with a complete absence of volumetric experts). For complex political and technical questions, forecasters, and in particular a well-structured forecasting team, have been shown to outperform subject-matter experts.
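To make the jellybean intuition concrete, here is a minimal simulation (all numbers invented for illustration, not drawn from the workshop) showing that the average of many noisy individual guesses lands much closer to the true count than a typical individual guess does:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

TRUE_COUNT = 850  # hypothetical number of jellybeans in the jar

# Each participant's guess is the true count plus independent noise --
# some guess far too high, others far too low.
guesses = [random.gauss(TRUE_COUNT, 250) for _ in range(200)]

crowd_estimate = sum(guesses) / len(guesses)
crowd_error = abs(crowd_estimate - TRUE_COUNT)
avg_individual_error = sum(abs(g - TRUE_COUNT) for g in guesses) / len(guesses)

print(f"crowd estimate:           {crowd_estimate:.0f}")
print(f"crowd error:              {crowd_error:.0f}")
print(f"average individual error: {avg_individual_error:.0f}")
```

With zero-mean noise the crowd's error shrinks roughly with the square root of the number of guessers. The caveat, which applies equally to political forecasting, is that averaging removes independent noise but not a bias the whole crowd shares.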

We should be skeptical that forecasting can improve on expert analysis in every instance. At the same time, we should understand why and under which circumstances it can add value. Experts have a more detailed understanding of the dynamics, history and underlying factors of a subject than forecasters, and expert analyses can be crucial inputs for successful forecasting. However, experts may also hold a predetermined belief, based on their expertise and a theoretical framework, about how the future will unfold, and even a thesis they wish to defend against alternatives. Successful forecasters should not be wedded to a school of thought and should be prepared to update their forecasts when new data becomes available. In addition, a team of forecasters can improve on the performance of an individual by interrogating the underlying reasoning. Accurate forecasting requires good data produced by expert analysis, and it is not entirely auditable: forecasters can provide explanations and reasoning, but wisdom-of-the-crowd predictions do not allow averaging of reasoning.

Sarah Laderman (SL): Forecasting, when combined with other analytical methods, is a powerful tool for thinking about a number of challenging issue areas. As there is comparatively little reliable information in open sources about how North Korea structures its nuclear program, forecasting is especially useful for generating new, insightful analysis and for approaching the dearth of information from a new viewpoint. By thinking about problems from different angles and questioning assumptions and baseline knowledge, we can better understand different potential futures and their consequences.

While forecasting is not a methodological panacea, it is an extremely effective tool for forcing discussions that experts do not typically have about the assumptions and logic behind their analyses. Further, it can provide a new way of looking at the same data. This field suffers from a lack of new information, so analysts need to be creative in their thinking to avoid falling into bias traps.

GC: Forecasting is not a silver bullet for predicting how the situation on the Korean Peninsula will unfold in the next few decades or the role the nuclear program will play. However, when participants have a clear sense of both the purpose of forecasting and its limitations, it is a helpful tool that can provide valuable insights.

Forecasting must be structured around answerable questions tied to a fixed date; it is good for questions like “Will North Korea conduct another nuclear test before March 2025?” but cannot directly address questions such as “What will be the role of the nuclear weapons program in 2035?” In the latter case, forecasting could instead develop scenarios to understand the context for decisions about the nuclear program.

HW: What does pursuing a forecasting process look like, and what are possible starting points for analysis?

GC: As we said, forecasting cannot directly address complex questions, such as how the North Korean nuclear program will evolve. A forecasting question must specify a date by which it will be resolved and must be constructed so that the answer can be objectively checked; questions should be framed with concrete, measurable answers. For example, one could ask, “How many nuclear weapons will North Korea have by February 2029?”[2] Others who explore the use of forecasting methodologies may wish to consider other core questions that have a quantitative or binary answer but are good indicators of broader changes.

Responses to an initial question are used for calibration, and forecasters are then asked how their answers would change if certain factors changed. These factors may include particular events: internal and external factors related to the nuclear program, political leadership, broader political dynamics and “acts of God” such as pandemics, nuclear accidents, and succession crises stemming from unexpected changes in leadership. They may also include the emergence of new information from defector accounts, declassification of intelligence reports, advances in open-source technology, or disclosures from the North Korean regime or its allies.

A workshop is a tried-and-tested format when you want subject-matter experts to interact with each other directly in developing and evaluating these factors. Forecasting adds an additional layer. With the use of a voting system, including displaying real-time results, it leverages the expertise of everybody in the room, not just whoever has the floor. On any particular question, it facilitates interaction between experts in diverse subject matters and generalists. Without forecasting, experts could brainstorm factors and then discuss which should be considered most important to the trajectory of North Korea’s nuclear program. But I have no doubt that asking participants to forecast how a factor would actually change the baseline, and then to discuss and revise those forecasts, produces a more rigorous outcome than a traditional roundtable format.

It is not necessary to recruit a team of “superforecasters”; you can use a more typical workshop audience, composed of both generalists and North Korean-focused technical and political experts with diverse genders, ages and backgrounds. However, it is essential to prime such a group with training before the forecasting sessions.[3]

HW: We ran a workshop in February 2024 using the forecasting methodology. How was it used to drive the workshop?

SL: When structuring the workshop, we really wanted a free-flowing discussion on day one with more than just presentations from organizers to attendees. We put together a group of participants with vastly different areas of expertise (technical vs. political, generalists vs. North Korea experts) and varied genders, ages and backgrounds. In the spirit of discussion and collaboration, we asked each participant to prepare remarks related to their areas of expertise, which kicked off more in-depth project presentations and subsequent sharing of ideas.

From these general conversations, we undertook some brainstorming exercises that funneled down into more specific areas—multiple independently targetable reentry vehicle (MIRV) capabilities, gas boosting, and experimental light-water reactor (ELWR) operation—that were ripe for exploration during the forecasting work on the second day. We were able to get a lot of good discussion out of day one simply because we asked people to get out of their seats, put away their laptops, physically write on paper, and present their own ideas, all of which rarely happen at workshops.

While ONN and the project consortium could have made some educated assumptions about what might happen in the future based on our own expertise, we wanted to have insight into how a broader set of experts viewed the future of the program. This was the ideal opportunity to introduce forecasting to those who might be unfamiliar with the method and utilize its framework to gain actionable insight into an uncertain issue.

GC: On the first day, we discussed the forces driving North Korea’s nuclear program, and brainstormed factors that could make a significant impact on its trajectory. During the second day, we used these factors to forecast the impact of selected events on North Korea’s nuclear program.

Our aim was not to use forecasting to quantify the future size of the North Korean nuclear program but to apply the group’s judgment in identifying key factors that would impact the trajectory of the nuclear program in the medium term. These results could then be used to inform our fuel cycle simulation for the next decade.

As a proxy for the status of the North Korean nuclear program, we estimated the number of warheads it would have in its arsenal by February 2029 (five years after the workshop). We are not sharing this number because the goal was not to use the wisdom of the crowd to produce an arsenal estimate. Instead, we took this number as a baseline and were interested in how any particular factor would change it.

Participants had to quantify each factor’s likelihood (percent chance of occurring), impact (the revised number of warheads, above or below the baseline) and a confidence level, effectively providing a probability distribution function. This occurred through a voting system that displayed live results, showing the sum of all inputs from participants. After a first round of voting, participants, particularly those at the extremes of the predictions, would explain their choices to the workshop, adapting a Delphi methodology, and could then revise their choices after group discussion.[4]
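As a rough sketch of how such votes can be pooled (every number here is invented for illustration and is not the workshop's data), each forecaster's likelihood, impact and confidence can be combined into a group-level expected shift from the baseline:

```python
from statistics import NormalDist

BASELINE = 50  # illustrative baseline warhead count, not the workshop's figure

# Hypothetical inputs from three forecasters for one factor: the chance the
# factor occurs, the revised warhead count if it does, and a confidence
# expressed as a standard deviation around that revised count.
forecasts = [
    {"likelihood": 0.30, "impact": 60, "stdev": 10.0},
    {"likelihood": 0.45, "impact": 75, "stdev": 20.0},
    {"likelihood": 0.25, "impact": 55, "stdev": 5.0},
]

# Likelihood-weighted average shift from the baseline across the group.
expected_shift = sum(
    f["likelihood"] * (f["impact"] - BASELINE) for f in forecasts
) / len(forecasts)

# Each forecaster's probability that the arsenal exceeds the baseline,
# conditional on the factor occurring (from their implied normal distribution).
p_above_baseline = [
    1 - NormalDist(f["impact"], f["stdev"]).cdf(BASELINE) for f in forecasts
]

print(f"expected shift vs baseline: {expected_shift:+.2f} warheads")
print("P(arsenal > baseline | factor occurs):",
      [round(p, 2) for p in p_above_baseline])
```

In a Delphi-style second round, the group would see these pooled results, hear the reasoning of the outliers, resubmit their numbers, and the aggregation would simply be rerun.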

Asking people to discuss their answers and vote again introduces a risk of conformity bias. To mitigate this, our discussion included both a roundtable and an anonymous chat function, both moderated by the facilitator. An alternative would be a single round of voting, which would not be influenced by discussion but also would not draw out the knowledge of the assembled experts, and would therefore provide poorer results. Without discussion, there is also the risk of retaining systematic biases (i.e., an incorrect or unsupportable assumption) that could be dispelled or countered by knowledgeable input.

HW: What were some of the challenges in forecasting and interpreting results during the workshop?

GC: One of the issues we encountered was a healthy debate at the beginning of the second day about how to define the overarching question: How do you encapsulate the trajectory of the North Korean nuclear program in a single question? It will, of course, require a simplification, but in the end, we settled upon the number of warheads in the arsenal as a general indicator of the program’s trajectory.

Ideally, we could also repeat this workshop with another group of experts and see if or how the results differ. At 18 participants, the group in this workshop was the perfect size for the facilitators—any bigger, and we would have had to divide into smaller groups—but it did mean that there was often only one person per area of expertise. With a bigger group or repeated trials, subject matter experts may be able to delve even deeper into debates during the rounds of discussion while refining predictions.

Another issue was ensuring buy-in on forecasting. We informed participants well ahead of time what the aims and structure of the workshop would be and provided preparatory material. We had carefully invited participants whom we expected to engage with the format, but I think most had some uncertainty about it until we were in the room. Using the wisdom of crowds, where you cannot average reasoning for a complex technical and political problem, also unnerved some participants. However, there was a great deal of reassurance once it was clear that forecasting was a tool for identifying the key drivers that would impact the nuclear program.

SL: In addition to those issues, one of the largest challenges was simply time. Covering the entirety of North Korea’s nuclear fuel cycle and weapons program in just two days was ambitious, so we were only able to cover a select number of topics. In the future, I would love to see a series of workshops on specific issue areas related to the program, with each area getting its own two days of discussion and forecasting.

HW: How was forecasting able to provide you with insight on the direction of the North Korean nuclear program?

GC: At the point when we began developing this workshop, VERTIC had already modeled North Korea’s nuclear fuel cycle from 1986 until 2023, and had produced estimates of its plutonium and highly enriched uranium (HEU) stockpiles as part of a project with CNS and the Royal United Services Institute (RUSI), funded by Global Affairs Canada. To model the North Korean program, we used open sources, including satellite imagery, International Atomic Energy Agency (IAEA) reports, US Government assessments and other nongovernmental analyses to estimate the size of its program. Although we can learn a great deal from those sources, we cannot completely constrain what North Korea is doing, particularly for HEU production.

SL: To complement VERTIC’s work in forecasting North Korea’s fuel cycle, ONN sought to continue its nuclear arsenal modeling out for the next decade; however, this was also stymied by the lack of open-source information relevant for future projections.

I believe that by incorporating forecasting into what could have been a more conventional workshop, the participants and organizers were forced to lay out their assumptions and innate biases. The examination of our forecasts allowed for spirited debate about how everyone arrived at their predictions. I was pleasantly surprised to find not only my own ideas reshaped, but also other experts walking away with newly shifted perspectives, especially on some poorly understood technical nuances that affect political analyses and vice versa.

A public version of the workshop report will soon be available on ONN’s website. The workshop outcomes are also included in other ONN and VERTIC analyses.


  1. [1]

    The forerunner of the current practice of forecasting is an initiative called the Good Judgement Project, which grew out of psychological research into prediction and a research program run by the Intelligence Advanced Research Projects Activity (IARPA), a branch of the United States intelligence community. The Good Judgement Project went on to pioneer the concept of “superforecasters,” people who are able to consistently make accurate predictions on a variety of topics. Compared to the general public, superforecasters tend to score highly on open-mindedness and analytical thinking. While becoming a superforecaster takes practice, some practitioners are leveraging the principles of accurate forecasting to facilitate forecasting sessions for those who are unfamiliar with the practice.

  2. [2]

    We used this question as a proxy for the status of the nuclear program after much internal debate about developing more complicated questions that accounted for important factors such as delivery systems, differing yields and numbers of strategic, regional and tactical weapons.

  3. [3]

    We would advise recruiting a forecasting specialist, such as the Swift Centre or the Good Judgement Project, to both provide this training and facilitate the forecasting sessions.

  4. [4]

    Participants in a traditional workshop would be able to express opinions and explanations. Through this voting system, participants had to quantify their answers, but they also had a chance to revise their choices based on the discussion, which drew out their underlying reasoning.


38 North: News and Analysis on North Korea