Results of the Competition on Witness Validation

This web page presents the results of SV-COMP 2025 - 14th International Competition on Software Verification

Competition Report of SV-COMP 2025
The benchmarks and rules are also available from the competition website.
Results of the Verification Track are also available.

The background color is gold for the winner, silver for the second, bronze for third and light green for demo categories.

Here some brief directions for reading the score-based quantile plots:

The right end of the graph displays the achieved score. The score is SV-COMP's definition of quality. You can read the ranking on the top from right to left.
The left end of the graph displays the amount of wrong results that a verifier produced. Left-most start means worst.
The length of the graph displays the amount of correct verification work. Long is good.
More about what you can learn from a score-based quantile plot and how to interpret it, is described in the SV-COMP 2015 report, on pages 12 and 13.

1. Validation of Correctness Witnesses (Version 2.0)

2. Validation of Violation Witnesses (Version 2.0)

3. Validation of Correctness Witnesses (Version 1.0)

4. Validation of Violation Witnesses (Version 1.0)

1. Validation of Correctness Witnesses (Version 2.0)

Ranking by Category (with Score-Based Quantile Plots)

ReachSafety 1. UAutomizer 2. CPAchecker 3. LIV

NoOverflows 1. UAutomizer 2. CPAchecker 3. Mopsa

SoftwareSystems 1. Mopsa 2. CPAchecker 3. MetaVal

ValidationCrafted 1. UAutomizer 2. CPAchecker 3. LIV

Overall 1. UAutomizer 2. CPAchecker 3. Mopsa

Page 1 / 2

Table of All Results

In every table cell for competition results, we list the points in the first row and the CPU time (rounded to two significant digits) for successful runs in the second row.

The entry '–' means that the competition candidate opted-out in the category.

The definition of the scoring schema and the categories is given on the respective SV-COMP web pages.

Note on meta-categories: The score is not the sum of scores of the sub-categories (normalization). The run time is the sum of run times of the sub-categories, rounded to two significant digits.

Hide base categories

Filter tools by language

Participants	Plots	CPAchecker	Goblint	LIV	MetaVal	MetaVal++	Mopsa	UAutomizer	UReferee
Representing Jury Member		Marian Lingsch-Rosenfeld	Simmo Saan	Marian Lingsch-Rosenfeld	Marian Lingsch-Rosenfeld	Marian Lingsch-Rosenfeld	Raphaël Monat	Marcel Ebbinghaus	Frank Schüssele
Affiliation		LMU Munich, Germany	University of Tartu, Estonia	LMU Munich, Germany	LMU Munich, Germany	LMU Munich, Germany	Inria and University of Lille, France	University of Freiburg, Germany	University of Freiburg, Germany

2. Validation of Violation Witnesses (Version 2.0)

Ranking by Category (with Score-Based Quantile Plots)

ReachSafety 1. CPAchecker 2. Witch 3. MetaVal

MemSafety 1. UAutomizer 2. Witch 3. CPAchecker

NoOverflows 1. UAutomizer 2. CPAchecker 3. Witch

SoftwareSystems 1. CPAchecker 2. MetaVal 3. Witch

ValidationCrafted 1. Witch 2. MetaVal 3. UAutomizer

Overall 1. Witch 2. MetaVal 3. CPAchecker

Page 1 / 2

Table of All Results

In every table cell for competition results, we list the points in the first row and the CPU time (rounded to two significant digits) for successful runs in the second row.

The entry '–' means that the competition candidate opted-out in the category.

The definition of the scoring schema and the categories is given on the respective SV-COMP web pages.

Note on meta-categories: The score is not the sum of scores of the sub-categories (normalization). The run time is the sum of run times of the sub-categories, rounded to two significant digits.

Hide base categories

Filter tools by language

Participants	Plots	CPAchecker	MetaVal	UAutomizer	Witch
Representing Jury Member		Marian Lingsch-Rosenfeld	Marian Lingsch-Rosenfeld	Marcel Ebbinghaus	Paulína Ayaziová
Affiliation		LMU Munich, Germany	LMU Munich, Germany	University of Freiburg, Germany	Masaryk University, Brno, Czechia

3. Validation of Correctness Witnesses (Version 1.0)

Ranking by Category (with Score-Based Quantile Plots)

ReachSafety 1. UAutomizer 2. MetaVal 3. CPAchecker

NoOverflows 1. UAutomizer 2. CPAchecker 3. LIV

SoftwareSystems 1. UAutomizer 2. MetaVal 3. UReferee

Overall 1. UAutomizer 2. CPAchecker 3. MetaVal

Page 1 / 2

Table of All Results

In every table cell for competition results, we list the points in the first row and the CPU time (rounded to two significant digits) for successful runs in the second row.

The entry '–' means that the competition candidate opted-out in the category.

The definition of the scoring schema and the categories is given on the respective SV-COMP web pages.

Note on meta-categories: The score is not the sum of scores of the sub-categories (normalization). The run time is the sum of run times of the sub-categories, rounded to two significant digits.

Hide base categories

Filter tools by language

Participants	Plots	CPAchecker	JCWIT	LIV	MetaVal	UAutomizer	UReferee
Representing Jury Member		Marian Lingsch-Rosenfeld	inactive	Marian Lingsch-Rosenfeld	Marian Lingsch-Rosenfeld	Marcel Ebbinghaus	Frank Schüssele
Affiliation		LMU Munich, Germany	--,--	LMU Munich, Germany	LMU Munich, Germany	University of Freiburg, Germany	University of Freiburg, Germany

4. Validation of Violation Witnesses (Version 1.0)

Ranking by Category (with Score-Based Quantile Plots)

ReachSafety 1. ConcurrentWitness2Test 2. Symbiotic-Witch 3. CPAchecker

MemSafety 1. CPAchecker 2. MetaVal 3. Symbiotic-Witch

ConcurrencySafety 1. Dartagnan 2. CPAchecker 3. UAutomizer

NoOverflows 1. Symbiotic-Witch 2. MetaVal 3. UAutomizer

Termination 1. CPAchecker 2. UAutomizer 3. MetaVal

SoftwareSystems 1. CPAchecker 2. Symbiotic-Witch 3. MetaVal

Overall 1. CPAchecker 2. UAutomizer 3. -

Page 1 / 3

Table of All Results

In every table cell for competition results, we list the points in the first row and the CPU time (rounded to two significant digits) for successful runs in the second row.

The entry '–' means that the competition candidate opted-out in the category.

The definition of the scoring schema and the categories is given on the respective SV-COMP web pages.

Note on meta-categories: The score is not the sum of scores of the sub-categories (normalization). The run time is the sum of run times of the sub-categories, rounded to two significant digits.

Hide base categories

Filter tools by language

Participants	Plots	ConcurrentWitness2Test	CPA-witness2test	CPAchecker	Dartagnan	CProver-witness2test	GWIT	MetaVal	NITWIT	Symbiotic-Witch	UAutomizer	Wit4Java
Representing Jury Member		Zsófia Ádám	inactive	Marian Lingsch-Rosenfeld	Hernán Ponce de León	inactive	inactive	Marian Lingsch-Rosenfeld	inactive	Paulína Ayaziová	Marcel Ebbinghaus	Tong Wu
Affiliation		Budapest University of Technology and Economics, Hungary	--,--	LMU Munich, Germany	Huawei Dresden Research Center, Germany	--,--	--,--	LMU Munich, Germany	--,--	Masaryk University, Brno, Czechia	University of Freiburg, Germany	University of Manchester, UK

Results of the Competition on Witness Validation

Contents

1. Validation of Correctness Witnesses (Version 2.0)

Ranking by Category (with Score-Based Quantile Plots)

Table of All Results

2. Validation of Violation Witnesses (Version 2.0)

Ranking by Category (with Score-Based Quantile Plots)

Table of All Results

3. Validation of Correctness Witnesses (Version 1.0)

Ranking by Category (with Score-Based Quantile Plots)

Table of All Results

4. Validation of Violation Witnesses (Version 1.0)

Ranking by Category (with Score-Based Quantile Plots)

Table of All Results