How will the SAT Competition 2003 proceed?
SAT Competition 2003
February - May, 2003
This is subject to change. Last modification: December 4, 2002.
A new SAT competition will start in February 2003 and continue
until May 2003. The results of the competition will be announced at the SAT 2003 conference
(Italy). This competition follows the SAT Competition 2002, which started on March
10, 2002 and ran until (and during) the SAT 2002 Symposium
(May 6-9, 2002, Cincinnati, Ohio, USA).
Because a competition needs strict winners, this page describes how we
will find them according to the competition rules we have defined. It
explains these rules, how we will choose the "best" SAT solvers (see the
awards below), and which benchmarks will be considered the hardest
(according to our rules).
For further details, please take a look at the technical report
concerning the SAT 2002 competition.
Month          | Week     | N° | Competition phases
February       | 3 to 9   | 0  |
February       | 10 to 16 | 1  | Submission of benchmarks/solvers (Feb 14th)
February       | 17 to 23 | 2  | Compliance testing
February/March | 24 to 2  | 3  | Buggy solvers returned (Feb 28th)
March          | 3 to 9   | 4  | Buggy solvers resubmitted (March 7th)
March          | 10 to 16 | 5  | First stage starts (March 14th)
March          | 17 to 23 | 6  | First stage
March          | 24 to 30 | 7  | First stage
March/April    | 31 to 6  | 8  | First stage
April          | 7 to 13  | 9  | First stage
April          | 14 to 20 | 10 | Second stage starts (April 14th)
April          | 21 to 27 | 11 | Second stage
April/May      | 28 to 4  | 12 | Second stage
May            | 5 to 11  | 13 | Awards revealed (during conference)
As shown in the calendar above, the competition will be 13 weeks long
and divided into 3 rounds:
- The first one (red/white, weeks 1-4) is devoted to compliance
testing of submissions.
- The second one (green, weeks 5-9) is the first stage of the
competition, where each solver will be run on the same set (or a subset)
of submitted benchmarks.
- The third one (blue, weeks 10-12) is the second stage of the
competition. The top N solvers in each category of benchmarks will be run
on the hardest benchmarks in that category (each category being partitioned
into SAT/UNSAT instances).
First round (red/white, weeks 1-4): before the actual
competition starts, solvers and benchmarks will be checked for compliance
(essentially input/output). For each submitter, an account will be opened
on a Linux machine accessible over the web, where compilation can be checked. Please
note that before week 5, no submitted solver will be tested on any
submitted (new) benchmark.
- Solvers: after the first deadline (week 1), solvers will be
tested for bugs (please read carefully what we call a buggy solver)
on previous (not newly submitted) benchmarks. Buggy solvers will then be returned
to their authors (week 3), who will have a chance to fix the bugs.
After possible re-submission (week 4), the solvers are checked again,
still-buggy solvers are excluded, and the actual competition starts (a
solver found buggy even at a later stage is excluded anyway).
Please read the call for solvers carefully for more
details.
- Benchmarks: as described in the call for
benchmarks, there will be three categories of benchmarks (Industrial,
Handmade and Random). Because of the intrinsic limits of incomplete solvers,
a distinction will be made between SAT and UNSAT instances (submitters
will have to specify, for each benchmark or generator, whether it is SAT, UNSAT
or UNKNOWN). During weeks 1-4, Industrial (pre-generated) benchmarks will
be checked for compliance (please note that our CNF format is very
restrictive). For Handmade (crafted) and Random benchmarks, for which generators
are required, weeks 1-4 will also be used to generate instances according
to their hardness for previous solvers.
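The compliance check on pre-generated benchmarks can be illustrated with a small sketch. The competition's exact format rules are not reproduced here (and are stated to be stricter than the common conventions), so this is only a minimal, hypothetical checker for the usual DIMACS CNF layout: comment lines starting with "c", a single "p cnf <vars> <clauses>" header, and clauses written as non-zero integers terminated by 0.

```python
def check_dimacs_cnf(text):
    """Minimal DIMACS CNF compliance check (sketch only; the
    competition's actual format rules are more restrictive)."""
    # Drop blank lines and comment lines starting with 'c'.
    lines = [l for l in text.splitlines() if l.strip() and not l.startswith("c")]
    if not lines or not lines[0].startswith("p cnf"):
        return False
    try:
        _, _, nvars, nclauses = lines[0].split()
        nvars, nclauses = int(nvars), int(nclauses)
    except ValueError:
        return False  # malformed header
    try:
        nums = [int(tok) for tok in " ".join(lines[1:]).split()]
    except ValueError:
        return False  # non-integer token in the clause section
    # Every literal must reference a declared variable, and the number
    # of clause terminators (zeros) must match the header.
    in_range = all(n == 0 or 1 <= abs(n) <= nvars for n in nums)
    return in_range and nums.count(0) == nclauses
```

For example, a two-clause formula over three variables passes, while a file whose literals exceed the declared variable count is rejected.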
Second round, first stage (green, weeks 5-9): this
is the end of the interactive part of the competition. We will run each
solver on each applicable benchmark (i.e., we will not run incomplete solvers
on benchmarks known to be unsatisfiable). If a solver does not answer
within the time limit (to be determined once we know the number of submissions),
it is interrupted. Randomized solvers will be started several times (at
least 3 times) and the median time will be taken into account (the full
time limit is charged for unsuccessful tries). The time limit will be fixed after
week 1, when the number of submissions is known (this number will be
made public).
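The timing rule for randomized solvers can be sketched as follows. This is only an illustrative reconstruction of the rule stated above (the competition's actual scripts are not public): each unsuccessful run is charged the full time limit, and the median of the charged times is kept.

```python
def median_runtime(times, time_limit):
    """Median over several runs of a randomized solver, with runs that
    hit the time limit (marked None) charged at the full limit.
    Illustrative sketch of the rule described in the text."""
    charged = sorted(time_limit if t is None else t for t in times)
    n = len(charged)
    mid = n // 2
    if n % 2:  # odd number of runs: take the middle value
        return charged[mid]
    return (charged[mid - 1] + charged[mid]) / 2  # even: mean of middle two
```

For instance, with three runs of 4.2 s, a timeout, and 3.1 s under a 600 s limit, the charged times are 3.1, 4.2 and 600, so the median is 4.2 s.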
Third round, second stage (blue, weeks 10-12): for this
last stage of the competition, we select the top N solvers (determined
from the first-stage results by the same method that will be used for determining
the winners after the second stage, considering only
the best variant of a given solver) in each of the 6 categories and run
them on the M smallest unsolved benchmarks (again, separately in
each category); smaller benchmarks come earlier. We assume N =
M = 4 for now, but Laurent Simon may change this right after the deadline if
the number of benchmarks and/or solvers and/or our computing resources is far
from what we predict. Clearly, at this stage the time limit will be substantially
larger.
Results (yellow, week 13): awards will be given during
the SAT 2003 conference. We will also try to give a first analysis of the results (see
below for awards).
There will be 8 prizes: 6 for solvers and 2 for benchmarks.
- Solvers (6 awards): there will be two categories of solvers
(complete and incomplete). Clearly, a complete solver can take
part in the competition in both categories. Because there are three categories
of benchmarks, there will be a total of 2*3 = 6 winners,
the same number of second-place winners, etc. The rules we will use for ranking
solvers are:
- For a given category, a winning solver is determined by the total
number of "hard" (second-stage) benchmarks it was able to solve. If this
figure is the same for several solvers, then we look at the first-stage
results:
- First, the total number of solved series of benchmarks is counted
(a series counts as "solved" if at least one of its benchmarks is solved).
- If this figure is the same again, then the total number of solved
benchmarks is counted.
- If even this figure is the same, the total elapsed CPU time is counted.
All solvers that fall within 1 sec. * (number of benchmarks) of the
winner's time are also declared winners.
However, a solver that solves no (zero) benchmarks in a given category
cannot be declared a winner under any circumstances. Furthermore, if the
winner in a category was already awarded for that category in the previous
competition, then no award will be given for that category.
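The tie-break chain above amounts to a lexicographic ranking, which can be sketched in a few lines. The field names (`hard_solved`, `series_solved`, `solved`, `cpu_time`) are hypothetical; the organizers' actual data layout and scripts are not given.

```python
def ranking_key(s):
    """Tie-break chain from the rules: more hard (second-stage)
    benchmarks solved first, then more solved series, then more solved
    benchmarks, then less total CPU time. Field names are assumed."""
    return (-s["hard_solved"], -s["series_solved"], -s["solved"], s["cpu_time"])

def winners(solvers, n_benchmarks):
    """Co-winners under the rules above: solvers tied on the first
    three criteria whose CPU time is within 1 sec * (number of
    benchmarks) of the best; zero-solve solvers never win."""
    ranked = sorted((s for s in solvers if s["solved"] > 0), key=ranking_key)
    if not ranked:
        return []
    best = ranked[0]
    margin = 1.0 * n_benchmarks
    return [s for s in ranked
            if ranking_key(s)[:3] == ranking_key(best)[:3]
            and s["cpu_time"] <= best["cpu_time"] + margin]
```

With 40 benchmarks, two solvers tied on all solved counts whose total CPU times differ by less than 40 seconds would both be declared winners.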
- Benchmarks (2 awards): the authors of the
smallest benchmarks (in terms of the total number of literal occurrences)
that were not solved by any solver will be declared winners. There will
be two winners: one for a satisfiable benchmark and one for an unsatisfiable
one. Benchmarks submitted as "unknown" are not eligible. The following
benchmarks will be excluded from consideration here:
- Benchmarks based on formulas described in scientific papers
(or other publicly available sources) not authored by the person submitting
these benchmarks, if the description was aimed at presenting hard formulas
for algorithms or proof systems for (UN)SAT or another (co-)NP-hard problem.
- Benchmarks used in the previous SAT competitions (DIMACS, Beijing, Paderborn,
SAT2002) or benchmarks too similar to these.
- Benchmarks that are publicly available and known to be solved by at
least one solver (even not participating in the competition).
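The size measure for the benchmark award, the total number of literal occurrences, can be computed directly from a DIMACS CNF file. This is a sketch under the usual DIMACS conventions; the organizers' exact counting script is not given.

```python
def literal_occurrences(dimacs_text):
    """Count every literal occurrence in a DIMACS CNF file, i.e. every
    non-zero integer in the clause section (comments and the 'p' header
    are skipped, and clause-terminating zeros are not counted)."""
    count = 0
    for line in dimacs_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("c", "p")):
            continue
        count += sum(1 for tok in line.split() if tok != "0")
    return count
```

For example, the two clauses (x1 v -x3) and (x2 v x3 v -x1) contain 2 + 3 = 5 literal occurrences.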
The prize will be a beautiful certificate, unless some people are willing
to provide prizes to motivate the competition (if so, please contact the organizers).
What are the differences from the previous (2002) competition?
Based on some important biases in the SAT 2002 competition (identified
in our SAT 2002 competition report), we made the following decisions:
- No heuristics will be used to prevent useless launches (apart from preventing
incomplete solvers from being launched on known-UNSAT instances). In particular, no
heuristic will be based on the hypothetical relative hardness of benchmarks
within a given family.
- Only one variant of a solver (the best one during the first stage)
will be allowed in the second stage (round 3) of the contest.
- A special session with a time limit for solving a sequence
of benchmarks may be organized, depending on the number of people willing
to participate in that session, and provided that we get a satisfactory automated
framework for it.
- A board of 3 judges will be appointed to check that the competition rules
are respected and take decisions if a problem occurs.
- February 14: deadline for submitting solvers.
- February 21 (UPDATED): deadline for submitting benchmarks.
- February 28: buggy solvers returned to the authors.
- March 7: buggy solvers resubmitted.
- March 14: 1st stage of the competition starts.
- April 14 (or earlier): 2nd stage of the competition starts.
- May: results and awards during the SAT2003 conference.
Last modification: 3 December 2002.