A Decathlon of Difficulty: Working Group Benchmarks the Limits of Quantum Optimization

Insider Brief:
- The QOBLIB benchmark introduces ten combinatorial optimization problem classes that are empirically difficult and structurally relevant, enabling fair comparisons across quantum and classical solvers.
- It emphasizes model-independent benchmarking, offering both MIP and QUBO formulations with standardized metrics for reproducibility and performance tracking.
- The benchmark supports heuristic approaches like QAOA and includes illustrative quantum baselines to facilitate community engagement and progress monitoring.
- Designed as an open-source initiative, QOBLIB invites researchers to test, submit, and refine solutions, with the goal of identifying real paths toward quantum advantage.
The push toward quantum advantage in combinatorial optimization has been held back not only by hardware limitations, but by a notable absence of agreed-upon benchmarks that reflect real-world difficulty. In a new arXiv preprint, a team of researchers from IBM Quantum, Zuse Institute Berlin, Kipu Quantum, and several universities and companies introduces a standardized benchmark suite for quantum optimization. Dubbed the Quantum Optimization Benchmark Library, or QOBLIB for short, the suite includes ten problem classes selected for their empirical hardness and practical relevance, ranging from multi-period portfolio optimization to minimum Birkhoff decompositions.
As Jay Gambetta noted in a recent post, optimization problems continue to generate both excitement and debate within the quantum community—not because quantum computers are expected to solve them efficiently in general, but because quantum heuristic approaches like QAOA might prove competitive for certain instances. “It is definitively worth the research effort to look for candidates that quantum can do better than classical,” he wrote, adding that this makes them “a viable path to quantum advantage.” The QOBLIB paper aligns with this view, providing a comprehensive overview of exactly those kinds of instances and offering a platform for tracking progress across both quantum and classical solvers.
From Theoretical Promises to Testable Problems
Quantum algorithms have been actively explored for intractable optimization problems, particularly those classified as NP-hard. However, most proposed algorithms are heuristics with no guarantees of optimality. As a result, empirical performance becomes the primary evaluation method. According to the authors, benchmarking should be model-independent, so that quantum and classical approaches can be compared fairly, and should track progress in hardware and algorithm design over time.
The paper presents a refreshingly clear benchmarking philosophy: benchmarks should be difficult yet not impossible, rooted in real-world or structurally relevant problems, and flexible in terms of modeling and solver type. The ten chosen problem classes all become challenging for classical solvers at variable counts ranging from under 100 to approximately 100,000, which the authors argue puts them within reach of current and near-term quantum hardware.
The ten problem classes fall into three categories. First, “classical” binary optimization problems such as Market Split, Maximum Independent Set, and Network Design are included due to their long history and well-understood hardness, making them suitable for comparing new approaches. Second, problems like Low Autocorrelation Binary Sequences, Minimum Birkhoff Decomposition, and Sports Tournament Scheduling are less commonly benchmarked but offer hard instances even at small sizes. Finally, practically motivated problems such as Portfolio Optimization and Capacitated Vehicle Routing are included despite their modeling variability, reflecting their importance for industrial applications.
Each problem class is presented with clear instance generation methods and baseline classical results, typically using solvers like Gurobi and CPLEX for ILP models and ABS2 or other specialized libraries for QUBO variants. Several problems, such as Market Split and LABS, become intractable for classical solvers at surprisingly small scales.
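To give a flavor of the instance generation, Market Split instances are traditionally built in the spirit of the Cornuéjols-Dawande construction: random small integer coefficients with right-hand sides set to half of each row sum, which is what makes feasibility so hard to decide. Here is a minimal sketch in that spirit (our paraphrase; the QOBLIB generators may differ in detail):

```python
# Minimal sketch of a Cornuejols-Dawande style Market Split generator:
# find binary x with A @ x == b, where coefficients are uniform in {0..99}
# and each b_i is half the corresponding row sum (rounded down).
import numpy as np

def market_split_instance(m, seed=None):
    """m equality constraints over n = 10*(m-1) binary variables."""
    rng = np.random.default_rng(seed)
    n = 10 * (m - 1)
    A = rng.integers(0, 100, size=(m, n))   # entries in {0, ..., 99}
    b = A.sum(axis=1) // 2                  # half of each row sum
    return A, b

A, b = market_split_instance(m=4, seed=0)
print(A.shape, b)  # 4 constraints, 30 binary variables
```

Even at this handful of constraints, deciding whether any feasible x exists already strains branch-and-bound solvers, which is what makes the class a useful stress test.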
Modeling Considerations and Reproducibility
Most quantum optimization algorithms, including QAOA and quantum annealing, require problems to be framed as quadratic unconstrained binary optimization (QUBO) problems or equivalent Ising models. Translating problems into QUBO form, however, is more complex than it sounds. The authors highlight that such transformations often result in densely connected variables and large coefficient ranges, which can overwhelm both classical and quantum solvers.
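To see why, consider the standard penalty trick for handling constraints in a QUBO. The following is a minimal sketch (our illustration, not code from the QOBLIB repository) that folds a single equality constraint into the objective:

```python
# Minimal sketch: fold the constraint sum(x) == 1 into a QUBO objective via a
# quadratic penalty P * (sum(x) - 1)^2. Expanding with x_i^2 = x_i gives
# linear terms -P*x_i, pairwise couplings 2P*x_i*x_j, and a constant P
# (dropped below, since it does not affect the argmin).
import itertools
import numpy as np

c = np.array([1.0, 2.0, 3.0])     # toy linear costs: minimize c @ x
n = len(c)
P = 10.0 * np.abs(c).sum()        # heuristic penalty weight (an assumption)

Q = np.full((n, n), 2.0 * P)      # off-diagonal couplings from the penalty
np.fill_diagonal(Q, c - P)        # diagonal: original costs minus P each
Qu = np.triu(Q)                   # upper-triangular QUBO matrix

# Brute-force check over all 2^n binary vectors (fine at toy scale).
best = min(
    (float(np.array(x) @ Qu @ np.array(x)), x)
    for x in itertools.product((0, 1), repeat=n)
)
print(best)  # picks exactly one variable, the cheapest: (-59.0, (1, 0, 0))
```

Even in this toy case, the penalty couples every pair of constrained variables and inflates coefficients well beyond the original costs; on structured problems with many constraints, that density and dynamic range is exactly what the authors warn about.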
Rather than prescribing a single modeling pathway, QOBLIB supports model-independent benchmarking. Instances are provided in both MIP and QUBO form when feasible, with accompanying information on variable count, sparsity, and coefficient range. For example, while a LABS problem might involve fewer than 100 binary variables in its MIP formulation, its QUBO equivalent can require over 800 variables, since the objective's higher-order terms must be reduced to quadratic form with auxiliary variables.
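To make that concrete, here is a minimal sketch of the LABS objective, the sidelobe energy (ours, not code from the repository):

```python
# Minimal sketch of the LABS objective: sidelobe energy
# E(s) = sum_{k=1}^{n-1} C_k(s)^2, where C_k(s) = sum_i s_i * s_{i+k}.
# Squaring each autocorrelation C_k multiplies pairs of products s_i*s_{i+k},
# so the objective is quartic in s; quadratizing it for QUBO form is what
# introduces the auxiliary variables.
import itertools

def labs_energy(s):
    """Sidelobe energy of a +/-1 sequence s."""
    n = len(s)
    return sum(
        sum(s[i] * s[i + k] for i in range(n - k)) ** 2
        for k in range(1, n)
    )

# Exhaustive search over all 2^n sequences is feasible only at toy sizes,
# which is exactly the classical hardness the benchmark leans on.
n = 10
best = min(itertools.product((-1, 1), repeat=n), key=labs_energy)
print(best, labs_energy(best))
```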
To ensure reproducibility, the authors define standardized metrics for both solution quality and runtime. For stochastic algorithms—including many quantum solvers—the benchmark encourages reporting across multiple runs, using success rate and time-to-solution as core metrics. Importantly, quantum runtime is carefully defined to exclude queuing time and include only the stages of circuit preparation, execution, and measurement, aligning with session-based operation on platforms like IBM Quantum.
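As a concrete example of how repeated runs fold into a single number, the widely used time-to-solution estimate combines the mean single-run time with the empirical success rate. The following is a sketch of the standard formula, assuming independent runs; the paper's exact conventions may differ:

```python
# Minimal sketch of the standard time-to-solution (TTS) estimate for a
# stochastic solver: the expected time to reach the target at least once
# with 99% confidence is t * ln(1 - 0.99) / ln(1 - p), where p is the
# empirical per-run success rate and t the mean single-run time.
import math

def time_to_solution(run_times_s, successes, confidence=0.99):
    p = sum(successes) / len(successes)       # empirical success rate
    t = sum(run_times_s) / len(run_times_s)   # mean single-run time, seconds
    if p >= 1.0:
        return t                              # every run already succeeded
    if p == 0.0:
        return math.inf                       # target never reached
    return t * math.log(1 - confidence) / math.log(1 - p)

# Example: 100 runs of 0.2 s each, 23 of which reached the target value.
print(time_to_solution([0.2] * 100, [1] * 23 + [0] * 77))  # ~3.5 seconds
```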
For optimization problems, the key outcome is the best objective value found, not necessarily the global optimum. This reflects the heuristic nature of most quantum approaches and focuses the benchmark on practical performance rather than theoretical completeness.
Initial Results and Community Call
While the primary goal of the paper is to establish the benchmarking framework, not to declare superiority of any one algorithm, the authors include illustrative quantum baselines on select problem classes. In particular, examples are given for LABS, Birkhoff Decomposition, and Maximum Independent Set. These baselines are not intended to represent state-of-the-art quantum performance but to show how solutions can be structured and submitted.
Ultimately, the authors position QOBLIB as a community resource. The benchmark repository is open-source, and contributions from researchers testing classical, hybrid, or quantum approaches are encouraged. By standardizing problem definitions, metrics, and evaluation practices, the team aims to enable systematic tracking of progress in quantum optimization—and perhaps, over time, evidence of quantum advantage that extends beyond toy problems and synthetic examples.
Toward a Meaningful Benchmarking Standard
Unlike many earlier benchmarks focused on artificially structured or trivially solvable instances, QOBLIB places difficulty front and center. But it does so without drifting into pathological territory, choosing problems that are hard not because they are contrived, but because they mirror the structure and complexity of real-world optimization challenges. The result is a comprehensive, practical, and adaptable benchmark—a kind of optimization decathlon—that may serve as a proving ground for the next wave of quantum solvers.
Contributing authors on the paper include Thorsten Koch, David E. Bernal Neira, Ying Chen, Giorgio Cortiana, Daniel J. Egger, Raoul Heese, Narendra N. Hegade, Alejandro Gomez Cadavid, Rhea Huang, Toshinari Itoko, Thomas Kleinert, Pedro Maciel Xavier, Naeimeh Mohseni, Jhon A. Montanez-Barrera, Koji Nakano, Giacomo Nannicini, Corey O’Meara, Justin Pauckert, Manuel Proissl, Anurag Ramesh, Maximilian Schicker, Noriaki Shimada, Mitsuharu Takeori, Victor Valls, David Van Bulck, Stefan Woerner, and Christa Zoufal.