concensure's comments

concensure · 2026-03-31T03:50:07

The Problem: Most RAG-based coding tools treat code as unstructured text, relying on probabilistic vector search that often misses critical functional dependencies. This leads to the "Edit-Fail-Retry" loop, where the LLM consumes more time and money through repeated failures.

The Solution: Semantic uses a local AST (Abstract Syntax Tree) parser to build a Logical Node Graph of the codebase. Instead of guessing what is relevant, it deterministically retrieves the specific functional skeletons and call-site signatures required for a task.

The Shift: From "Token Savings" to "Step Savings"

Earlier versions of this project focused on minimizing tokens per call. However, our latest benchmarks show that investing more tokens into high-precision context leads to significantly fewer developer intervention steps.

Latest A/B Benchmark (2026-03-27)

    Provider: OpenAI (gpt-4o / o1)

    Suite: 11-task core suite (atomic coding tasks)

    Configuration: autoroute_first=true, single_file_fast_path=false

    Run Variant             Token Delta (per call)   Step Savings (vs Baseline)   Task Success
    Baseline (2026-03-13)   -18.62%                  n/a                          11/11
    Hardened A              +8.07%                   n/a                          11/11
    Enhanced (2026-03-27)   -6.73%                   +27.78%                      11/11

Key Takeaways:

    The ROI of Precision: The "Enhanced" run's per-call token delta was -6.73%, against -18.62% for the baseline run, so it spent more tokens per request than the baseline did. Even so, it required 27.78% fewer steps to reach a successful solution.

    Deterministic Accuracy: By feeding the LLM a "Logical Skeleton" rather than fuzzy similarity-search chunks, we eliminate the "lost in the middle" effect. The agent understands the consequences of an edit before it writes a single line.

    Context Density: We are effectively spending cheap input tokens to save expensive developer time and agent compute cycles.
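
To make the "Logical Skeleton" idea concrete, here is a minimal sketch of deterministic retrieval over an AST. It is not Semantic's implementation, which is not shown here. It assumes Python source and the standard-library `ast` module, and the names `FunctionNode`, `build_graph` and `skeleton_for` are hypothetical. For a target function it returns the signatures of the function itself, the functions it calls, and the functions that call it, with no similarity search involved.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class FunctionNode:
    """One graph node: a function's signature plus the names it calls."""
    name: str
    signature: str
    calls: set[str] = field(default_factory=set)


def build_graph(source: str) -> dict[str, FunctionNode]:
    """Parse source and map each top-level function to a FunctionNode."""
    tree = ast.parse(source)
    graph: dict[str, FunctionNode] = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            params = ", ".join(arg.arg for arg in node.args.args)
            fn = FunctionNode(node.name, f"def {node.name}({params})")
            # Record direct calls to plain names, e.g. helper(x).
            # Attribute calls such as obj.method() are ignored in this sketch.
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    fn.calls.add(child.func.id)
            graph[node.name] = fn
    return graph


def skeleton_for(graph: dict[str, FunctionNode], target: str) -> list[str]:
    """Return sorted signatures of target, its callees and its callers."""
    if target not in graph:
        return []
    related = {target}
    related |= {callee for callee in graph[target].calls if callee in graph}
    related |= {name for name, fn in graph.items() if target in fn.calls}
    return [graph[name].signature for name in sorted(related)]


SAMPLE = '''
def parse(text):
    return text.split()

def count(text):
    return len(parse(text))

def report(text):
    print(count(text))

def unrelated():
    return 0
'''

graph = build_graph(SAMPLE)
context = skeleton_for(graph, "count")
print(context)
```

Asking for `count` pulls in `parse` (a callee) and `report` (a call site) but leaves out `unrelated`, which is the property the comment attributes to graph-based retrieval: the context is derived from the code's structure, so the same query always yields the same result.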

Detailed breakdowns of the task suite and methodology are available in docs/AB_TEST_DEV_RESULTS.md.
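
The exact metric definitions live in the methodology document referenced above. As a rough sketch of how figures like these are commonly computed, the snippet below assumes that "token delta" is the relative change in tokens per call against some reference run and that "step savings" is the relative reduction in steps against the baseline. Both definitions are assumptions, and the raw counts are hypothetical numbers picked only so the arithmetic is easy to follow; they are not taken from the benchmark.

```python
def token_delta_pct(variant_tokens: float, reference_tokens: float) -> float:
    """Relative change in tokens per call; negative means fewer tokens."""
    return (variant_tokens - reference_tokens) / reference_tokens * 100


def step_savings_pct(baseline_steps: int, variant_steps: int) -> float:
    """Relative reduction in steps versus baseline; positive means fewer steps."""
    return (baseline_steps - variant_steps) / baseline_steps * 100


# Hypothetical totals, NOT benchmark data.
savings = round(step_savings_pct(baseline_steps=18, variant_steps=13), 2)
delta = round(token_delta_pct(variant_tokens=9_327, reference_tokens=10_000), 2)

print(f"step savings: {savings:+.2f}%")
print(f"token delta:  {delta:+.2f}%")
```

Note the opposite sign conventions: a positive step savings is an improvement, while a positive token delta means the variant costs more tokens per call.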