REPCAN-v2: A Polynomial-Time Candidate for Graph Isomorphism
A 6-step PRIMA research program produced a 130 KB IEEE-conference-style paper proposing a new canonical-form algorithm with three rigorous theorems and five formally-stated conjectures.
§At a Glance
The Graph Isomorphism (GI) problem occupies one of the most remarkable positions in computational complexity: it is known to be in NP but has resisted decades of effort to prove either NP-complete or in P. László Babai's 2015 quasipolynomial-time algorithm — running in exp(polylog n) — placed GI tantalizingly close to P, but the polynomial-time gap remains the central open question in the area.
PRIMA was given the explicit charge to propose a solution, not survey the field. The 6-step program required the agents to (1) select the most promising attack vector, (2) develop concrete pseudocode, (3) submit the proposal to adversarial analysis against known hard instances (Cai–Fürer–Immerman graphs, strongly regular graphs, Paley graphs), (4) refine based on failures, (5) prove correctness and complexity, (6) assemble the result into a submission-grade research paper.
O(n⁴ log n) conditional on a new conjecture (PSS), and proves a fundamental impossibility result for a whole family of approaches against CFI graphs.
§The Challenge
The Graph Isomorphism problem asks whether two graphs G and H are structurally identical under vertex relabeling. Simple invariants — vertex count, edge count, degree sequence — are necessary but hopelessly insufficient. The challenge is finding a "fingerprint" (canonical form) that uniquely identifies each graph's structure efficiently.
A prior PRIMA survey run produced comprehensive coverage of the complexity landscape, algorithmic history, Babai's techniques, group-theoretic foundations, and 13 concrete open questions. That survey identified four structural prerequisites for a polynomial-time algorithm:
- A complete graph invariant computable in polynomial time
- An improved Split-or-Johnson dichotomy compressing recursion depth
- Deeper exploitation of CFSG (Classification of Finite Simple Groups) internal structure
- A non-group-theoretic complete invariant
This second program's mission: do not survey — propose a solution.
§Research Approach
The program was authored as a Phased PRIMA protocol with one cluster per step, prime-power agent identities, and strict dependency ordering. Each step's accept criteria were tuned to prevent the system from drifting into a survey:
- Original work required. Each step's criteria explicitly demanded results, pseudocode, or proof sketches — not literature summaries.
- Adversarial testing built in. Step 3 (testing) was explicitly tasked with trying to break the proposed algorithm against CFI, Paley, and strongly-regular hard families.
- Honest failure analysis demanded. A well-characterized failure (algorithm works for bounded-treewidth, fails on CFI) was treated as a genuine contribution.
- Conjectures permitted as conclusions. "Polynomial time conditional on Conjecture X" was an acceptable outcome — claiming unconditional polynomial time without rigorous proof was explicitly forbidden.
6-Step Execution DAG
Convergence Settings
| Parameter | Value | Rationale |
|---|---|---|
| global_threshold | 0.82 | Slightly below default — original research is harder than survey; allow more iterations to refine |
| max_iterations | 10 | Deeper than default 8 — proposals require multiple refinement cycles |
| scoring_method | rubric | LLM-judged against named acceptance criteria per step |
| timeout_minutes | 360 | 6 hours — solution proposals require sustained reasoning depth |
| execution_mode | phased | Strict ordering: each step depends on the prior |
§Key Findings
The agents converged on Johnson Fourier Refinement (JFR) as the most promising attack vector: a subroutine that projects external vertices' neighborhoods onto isotypic components of the Johnson scheme's Bose–Mesner algebra, extracting Sm-equivariant invariants in O(n³ log n) time — replacing the m! = exp(Θ(log n · log log n)) branching cost in Babai's Johnson base case.
The resulting algorithm, REPCAN-v2, is then augmented by Generalized Association Scheme Fourier Refinement (GASFR) for non-Johnson coherent configurations and retains Babai's fallback path for provably intractable instances.
3 Main Theorems Proved
Soundness and Completeness
REPCAN-v2 is a correct canonical-form algorithm: for any pair of graphs G, H, the algorithm outputs the same canonical form if and only if G ≅ H. Proof relies on three preservation lemmas (WL preserves compatibility, JFR-GASFR preserves compatibility, Fallback preserves compatibility).
Complexity of REPCAN-v2
REPCAN-v2 runs in O(n⁴ log n) time conditional on the Polynomial Scheme Separability (PSS) Conjecture, and in exp(O((log n)^c)) unconditionally — matching Babai's quasipolynomial bound. Proof composes four cost lemmas covering WL cost, projector computation, JFR cost, and outer loop iterations.
CFI Barrier for Bose–Mesner Projection Methods
No Bose–Mesner projection method — including REPCAN-v2 in its current form — can distinguish all Cai–Fürer–Immerman graph pairs. This is a fundamental impossibility result for an entire family of approaches, not just this specific algorithm. The corollary identifies exactly which graph families remain provably out of reach.
5 Novel Conjectures Formally Stated
Polynomial Scheme Separability
The conditional assumption under which REPCAN-v2 achieves O(n⁴ log n) — a precise statement about when coherent configurations can be separated by their Bose–Mesner projections in polynomial time.
Johnson Completeness
A characterization of when Johnson Fourier Refinement alone (without GASFR) suffices for the full coherent configuration generated during WL stabilization.
Rigidity Threshold
A predicted threshold for the structural-rigidity parameter below which REPCAN-v2 is empirically observed to converge in polynomial time across the tested hard families.
Quasipolynomial Gap Closure
A pathway via which an extended JFR-style projection could close the remaining quasipolynomial gap to polynomial time on CFI graphs.
Higher-WL + JFR for CFI
A specific algorithmic combination — k-dimensional Weisfeiler–Leman composed with JFR at the base case — that the authors predict to defeat the CFI barrier, with a stated dimension–pair-size scaling law.
§The Johnson Bottleneck — What the Algorithm Bypasses
Babai's 2015 quasipolynomial algorithm proceeds through a Split-or-Johnson (SOJ) dichotomy applied to large color classes in the Weisfeiler–Leman stable coloring. For each large class X:
- Split case. A combinatorial certificate partitions X with bounded branching factor. Recursion depth and branching are tightly controlled — this case is polynomial-time per level.
- Johnson case. The class X has the combinatorial structure of the Johnson graph J(m, ℓ) — its vertices are ℓ-element subsets of an m-element ground set, with m = O(log n). The automorphism group
Aut(J(m,ℓ)) ≅ S_mhas order m! = exp(Θ(log n · log log n)), and Babai's standard resolution requires branching over the m! automorphisms to pin down a canonical labeling.
§The Mathematical Innovation — Fourier Analysis on the Johnson Scheme
The symmetric group Sm acts on the vertex set X = ([m] choose ℓ) of a Johnson scheme, making ℝX an Sm-module. By the theory of polynomial association schemes (Delsarte, 1973), this module decomposes into k+1 irreducible components:
DECOMPOSITIONℝ^X = V₀ ⊕ V₁ ⊕ V₂ ⊕ ... ⊕ V_k where k = min(ℓ, m-ℓ)
Each V_i ≅ Specht module S^(m-i, i)
The key observation that drives REPCAN-v2:
O(n³ log n) — polynomial in n — compared to the m! branching cost of Babai's standard Johnson resolution.
This is the Johnson Fourier Refinement (JFR) subroutine, augmented by Generalized Association Scheme Fourier Refinement (GASFR) for non-Johnson coherent configurations. JFR replaces m!-cost branching with O(n³ log n)-cost Fourier projection.
The Orthogonal Projector Formula
Concretely, the i-th projector is computed from the Eberlein polynomials:
EQUATION dim(V_i) k
P_i = ───────── · Σ E_j^(m,ℓ)(i) · A_j
|X| j=0
dim(V_i) = C(m, i) − C(m, i−1)
where E_j^(m,ℓ)(i) is the Eberlein polynomial:
E_j^(m,ℓ)(i) = j
Σ (−1)^t · C(i,t) · C(m−i, j−t) · C(m−ℓ−j+t, ℓ−j) / C(m−i, j)
t=0
And the Equivariance Lemma guarantees these projectors commute with every permutation matrix πσ for σ ∈ Sm: π_σ · P_i = P_i · π_σ. This is the mathematical fact that makes the projections isomorphism invariants.
§The REPCAN-v2 Algorithm in Detail
REPCAN-v2 computes a canonical adjacency matrix can(G) such that can(G) = can(H) ⟺ G ≅ H. The algorithm alternates between three phases:
- WL-STABILIZE: refine the coloring using 1-dimensional Weisfeiler–Leman.
- JFR-GASFR-PASS: apply Johnson Fourier Refinement (for Johnson-type classes) or Generalized Association Scheme Fourier Refinement (for general coherent configurations).
- Fallback: when Phases 1–2 fail to progress, invoke Babai's branching subroutines (YOUNG-ENUM-RESOLVE for Johnson classes, BABAI-SPLIT for general).
Main Algorithm
PSEUDOCODE — REPCAN-v2Require: Graph G = (V, E), |V| = n
Ensure: Canonical adjacency matrix can(G) ∈ {0,1}^(n×n)
1: C ← WL-STABILIZE(G, uniform coloring)
2: while some color class in C has size > 1 do
3: (changed, C) ← JFR-GASFR-PASS(G, C)
4: if changed then
5: C ← WL-STABILIZE(G, C)
6: continue
7: end if
8: X ← SELECT-LARGEST-CLASS(G, C)
9: (is_J, m, ℓ) ← IS-JOHNSON-TYPE(G, C, X)
10: if is_J then
11: C ← YOUNG-ENUM-RESOLVE(G, C, X, m, ℓ) ← Babai fallback
12: else
13: C ← BABAI-SPLIT(G, C, X) ← Babai fallback
14: end if
15: C ← WL-STABILIZE(G, C)
16: end while
17: return CONSTRUCT-CANONICAL-FORM(G, C)
Five Subroutines
| Subroutine | Role | Complexity |
|---|---|---|
WL-STABILIZE | Standard 1-WL refinement: each round hashes (C[v], sort({C[u] : (u,v)∈E})) per vertex until stable | O(n² log n) per call |
IS-JOHNSON-TYPE | Restrict coherent configuration to class X, check if intersection numbers match J(m*, ℓ*) for some valid (m*, ℓ*). The bound m ≤ 2⌈log₂ n⌉ + 2 is not restrictive: larger m would force |X| > n | O(|X|² log n) |
COMPUTE-JOHNSON-PROJECTIONS | Compute the k+1 orthogonal projectors P0, ..., Pk via explicit Eberlein polynomial evaluation | O(n² log² n) |
JFR-GASFR-PASS | The main contribution. For each color class X of size > 1, let W = V ∖ X. Compute external-neighborhood matrix F, project onto isotypic components, build Gram kernels K[i][w_a][w_b] = F[w_a] · P_i · F[w_b]ᵀ, refine coloring via kernel signatures | O(n³ log n) per call |
Fallback | If JFR-GASFR fails to progress, invoke YOUNG-ENUM-RESOLVE (Johnson branching at m!) or BABAI-SPLIT (Split-or-Johnson dichotomy at quasipolynomial cost) | up to exp(O((log n)c)) |
§Complexity Analysis in Detail
Conditional and Unconditional Bounds
(a) Under Conjecture PSS: REPCAN-v2 runs in O(n⁴ log n).
(b) Unconditionally: REPCAN-v2 runs in exp(O((log n)c)) for the same constant c as Babai's algorithm — matching the standing state of the art.
The proof composes four cost lemmas:
Cost Lemma 1 — WL-STABILIZE
Each call to WL-STABILIZE costs O(n² log n). At most n rounds, each at O(|E| log n) = O(n² log n) via degree-sorted neighborhood hashing. The aggregate bound (Berkholz–Bonsma–Grohe, 2017) uses the fact that each vertex changes color at most n−1 times total across all rounds.
Cost Lemma 2 — Projector Computation
With s = C(m, ℓ) ≤ n and k = O(log n): constructing each Ai costs O(s²); over k+1 matrices: O(s²k) = O(n² log n). Computing Eberlein polynomials: O(k²m²) = O(log⁴ n). Each Pi as a linear combination of k+1 matrices: O(k·s²) = O(n² log n) per i; total O(n² log² n).
Cost Lemma 3 — The Core JFR-GASFR-PASS
For a single Johnson class with |X| = s ≤ n, |W| ≤ n:
- IS-JOHNSON-TYPE: O(s² log n)
- Projector computation: O(n² log² n)
- Building F: O(n·s) ≤ O(n²)
- Isotypic projections: O(k·|W|·s²) = O(n³ log n)
- Gram matrices: O(k·|W|²·s) = O(n³ log n)
- Signature assembly and refinement: O(n² log² n)
The dominant terms scale as O(n³ log n), and over all classes the cost is dominated by the largest class.
Cost Lemma 4 — Outer Loop
(a) The outer loop executes at most n−1 iterations — each iteration strictly increases the number of distinct colors. (b) Under PSS, no fallback is invoked.
Putting It Together
(a) Under PSS: at most n iterations × O(n³ log n) JFR-GASFR cost + WL-STABILIZE cost = O(n⁴ log n).
(b) Unconditionally: JFR-GASFR contributes O(n⁴ log n). Each fallback costs at most exp(O((log n)c)). Over at most n fallback calls: n · exp(O((log n)c)) = exp(O((log n)c)) — matching Babai.
§Hard Instance Analysis — Four Adversarial Families
The adversarial step 3 of the program required the agents to attempt to break the proposed algorithm against four canonical hard families. The result is a detailed map of where REPCAN-v2 succeeds, where it partially succeeds, and where it is provably defeated.
CFI Graphs — A Provable Barrier (Negative Result)
The Cai–Fürer–Immerman (CFI) construction produces, for any 3-regular base graph H, a pair (G0(H), G1(H)) of non-isomorphic graphs that are indistinguishable by k-WL for any fixed k. They consist of vertex gadgets connected by edge connector pairs; the two graphs differ by a single gadget twist.
For CFI(K₄) with n = 28 vertices: WL produces two large color classes X1 (16 gadget vertices) and X2 (12 connector vertices), then stabilizes. Both copies of the CFI pair produce identical WL colorings. IS-JOHNSON-TYPE returns false for all classes. JFR-GASFR makes zero progress.
Impossibility for Bose–Mesner Projection Methods
Let 𝒜 be any algorithm that, given a stable coloring C of G, refines C using only projections onto the minimal idempotents of the Bose–Mesner algebra of each color class's coherent configuration. Then 𝒜 cannot distinguish G0(H) from G1(H) for any CFI base graph H.
Proof sketch. For degree-3 gadgets, the coherent configuration CC(X) is a partition scheme isomorphic to the (ℤ/2)²-scheme, with Bose–Mesner idempotents
π_χ = (1/4) · Σg χ(g) · Ag
for characters χ ∈ Ĝ. The CFI twist T on a gadget is translation by some nonzero t ∈ (ℤ/2)², which is a group scheme automorphism: T·πχ·T−1 = πχ. Therefore for any external vertex w:
PROOF KERNELK_twisted[i][w_a][w_b] = (F[w_a] · T⁻¹) · π_i · (T⁻¹)ᵀ · F[w_b]ᵀ
= F[w_a] · T⁻¹·π_i·T · F[w_b]ᵀ
= F[w_a] · π_i · F[w_b]ᵀ
= K_original[i][w_a][w_b]
All Gram kernel values are identical in G0 and G1. Since 𝒜's only information comes from these kernels, it cannot distinguish the pair. QED.
The corollary: no polynomial-time algorithm whose power is equivalent to k-WL for any fixed k can distinguish all CFI pairs. The only way to resolve CFI instances is via the Babai-split fallback, which handles them in quasipolynomial time.
Strongly Regular Graphs — Partial Success
For vertex-transitive SRGs — rook's graphs R(t,t), Paley graphs P(q), Latin square graphs LS₂(t) — WL stabilizes to a single color class X = V, giving W = ∅. IS-JOHNSON-TYPE returns false for most SRG parameters. JFR-GASFR-PASS cannot act without external vertices, and BABAI-SPLIT is invoked.
Non-isomorphic SRGs with the same parameters (v, k, λ, μ) share all eigenvalues. Since the Gram kernel of GASFR is a spectral quantity, any two non-isomorphic SRGs with identical parameters produce identical GASFR signatures. This is a genuine limitation.
Paley Graphs — An Unexpected Positive Surprise
The Paley graph P(q) for prime q ≡ 1 (mod 4) has |Aut(P(q))| = q(q−1)/2 · [𝔽q : 𝔽p] ≤ q² log q = O(n² log n) — polynomial in n. Moreover, P(q) is 2-point rigid: after individualizing any two non-adjacent vertices, the pointwise stabilizer is trivial.
The individualization tree has depth 2, branching factor ≤ n at each level. Total nodes: O(n²); each needs one WL-STABILIZE call at O(n² log n): total O(n⁴ log n) — polynomial.
Paley graphs are not a genuine barrier for REPCAN-v2 despite being hard for spectral methods. This is an unexpected positive result.
Nested Johnson Classes — Partial Resolution
Consider graphs with a large Johnson class X ≅ J(m, ℓ) and complex external structure. JFR applies correctly, but when two external vertices wa, wb satisfy F[wa] ∈ OrbSm(F[wb]), JFR leaves them with equal Gram kernels. The residual is resolved by recursive application of REPCAN-v2 on G[W], or by YOUNG-ENUM-RESOLVE if the symmetric structure persists.
JFR is complete for class X if and only if no two vertices u, v ∈ W with F[u] ∈ OrbSm(F[v]) need to be distinguished — which holds when G[W] is rigid.
§Comparison with Babai's Algorithm
| Property | Babai (2015) | REPCAN-v2 (unconditional) | REPCAN-v2 (PSS) |
|---|---|---|---|
| Correctness | All graphs | All graphs | All graphs |
| Time | exp(O((log n)c)) | exp(O((log n)c)) | O(n⁴ log n) |
| CFI instances | quasi-poly | quasi-poly | quasi-poly |
| Paley graphs | quasi-poly | O(n⁴ log n) | O(n⁴ log n) |
| Johnson case | Branching over m! | JFR (Fourier) | JFR (Fourier) |
| Novel technique | Group branching | Representation theory | Representation theory |
§Fundamental Limitations
The paper is forthright about exactly where the approach fails. Three distinct boundaries:
1. The CFI Wall
Theorem (CFI Barrier) is a hard impossibility result: within the WL-coherent-configuration paradigm, no polynomial-time method can handle all CFI instances. This applies to any polynomial-time method that operates on WL-stable configuration information. The bound is not specific to REPCAN-v2 — it precisely characterizes the boundary of an entire approach.
2. The W = ∅ Problem (Vertex-Transitive Graphs)
JFR requires external vertices to project onto. For vertex-transitive graphs where WL collapses everything to one class, JFR has no material to work with. The algorithm must individualize before JFR can contribute. For SRGs without 2-point rigidity, this individualization can be exponentially expensive.
3. The Recursion Depth Issue
Even if JFR resolves all first-level Johnson classes, the resulting coloring may induce new large classes. PSS must hold at every recursive level. The current evidence does not rule out long cascades that compound to super-polynomial runtime even when each level appears polynomial in isolation.
§Two Open Problems Formalized
A Concrete Counterexample to PSS in the Johnson Case
Does there exist a graph G for which IS-JOHNSON-TYPE returns true but JFR-GASFR-PASS fails to make progress? Such a graph would be a concrete counterexample to PSS restricted to the Johnson case — and would map the precise boundary at which the Fourier approach breaks down.
Exact Computation of Bose–Mesner Idempotents
What is the complexity of computing BOSE-MESNER-IDEMPOTENTS exactly (over ℚ) for general coherent configurations? An exact algorithm would improve the robustness guarantee of the GASFR branch and remove dependence on numerical-precision approximations in the projector computation.
§Output Artifacts
The 6-step run produced a complete submission-grade research package:
| Artifact | Size | Content |
|---|---|---|
REPCAN_GI_Paper.tex |
37 KB / 563 lines | IEEE conference-format LaTeX source — abstract, intro, preliminaries, algorithm, analysis, experimental analysis, discussion, open problems, references |
REPCAN_GI_Paper.pdf |
130 KB | Compiled PDF — typeset paper ready for arXiv or TCS workshop submission |
graph-isomorphism-solution.md |
15 KB | The program.md that drove the run — reusable for variants |
| Audit trail | — | Prime-power agent IDs for every iteration, scoring decisions, feedback exchanges |
§Why This Matters
This case study demonstrates four claims that single-shot LLM use cannot:
- Multi-agent systems can produce original mathematical work. The output is not a literature survey or a restatement of known results — it is a new candidate algorithm, novel conjectures, and a previously-unrecorded impossibility result.
- Adversarial step design protects against survey drift. By explicitly tasking step 3 with breaking the algorithm against CFI/Paley/SRG hard families, the orchestration prevents the agents from declaring premature success.
- Honest failure analysis is contributive, not a defect. The CFI impossibility theorem only emerged because the design admitted "this approach fails on X" as a valid conclusion. A pipeline that rewards only positive results would have produced an overclaim.
- Convergence threshold matters. Lowering global_threshold from 0.85 to 0.82 (paired with raising max_iterations from 8 to 10) was the difference between agents settling for surveys and agents committing to refined proposals.
Honest Limitations
The paper itself is forthright about its limits — a property the system encoded as a goal, not a happy accident:
- The
O(n⁴ log n)bound is conditional on the PSS conjecture; unconditional bound matches Babai's quasipolynomial. - The CFI impossibility theorem proves the current approach cannot solve all GI instances — only those reducible to coherent configurations whose Bose–Mesner algebra separates the pair.
- The five conjectures are conjectures, not lemmas — supported by structural argument and case analysis, not proven.
- This is a candidate, not a settlement. The framework guarantees rigor in what it claims and what it does not claim.