
Core Concepts

The Mesh, Merkle-tree hashing, AI Context Builder, Intent Engine, Safety Gates, Decision Engine, Risk Engine, and Auto-Correction — the foundational primitives of Mikk.

The Mesh

The Mesh is Mikk's central data structure — a directed graph where nodes are files, functions, classes, generics, and variables, and edges are imports, calls, containment, inheritance, and access relationships.

Two-pass construction — O(n) build time

Pass 1 — Nodes: Every file, function, class, generic declaration, and variable is registered.

Pass 2 — Edges: Import, call, containment, extends, implements, and access edges are resolved against the full node index.

The result is a pair of adjacency maps, outEdges and inEdges, so a node's dependencies and its dependents can each be looked up in O(1).

auth/login.ts
  → imports → crypto/jwt.ts           (forward: what I depend on)
  ← imported by ← api/routes.ts       (reverse: who depends on me)
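The two passes can be sketched as follows. This is a minimal illustration, not Mikk's implementation: the `MeshEdge`, `RawRef`, and `buildMesh` names and shapes are assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; Mikk's real types may differ.
type EdgeKind = "imports" | "calls" | "contains" | "extends" | "implements" | "accesses";
interface MeshEdge { from: string; to: string; kind: EdgeKind }
interface RawRef { from: string; target: string; kind: EdgeKind }
interface Mesh {
  outEdges: Map<string, MeshEdge[]>; // forward: what a node depends on
  inEdges: Map<string, MeshEdge[]>;  // reverse: who depends on a node
}

function buildMesh(nodeIds: string[], refs: RawRef[]): Mesh {
  const outEdges = new Map<string, MeshEdge[]>();
  const inEdges = new Map<string, MeshEdge[]>();
  // Pass 1: register every node so pass 2 can resolve against the full index.
  for (const id of nodeIds) {
    outEdges.set(id, []);
    inEdges.set(id, []);
  }
  // Pass 2: resolve each reference; targets missing from the index are skipped.
  for (const ref of refs) {
    const out = outEdges.get(ref.from);
    const incoming = inEdges.get(ref.target);
    if (!out || !incoming) continue;
    const edge: MeshEdge = { from: ref.from, to: ref.target, kind: ref.kind };
    out.push(edge);
    incoming.push(edge);
  }
  return { outEdges, inEdges };
}

const mesh = buildMesh(
  ["auth/login.ts", "crypto/jwt.ts", "api/routes.ts"],
  [
    { from: "auth/login.ts", target: "crypto/jwt.ts", kind: "imports" },
    { from: "api/routes.ts", target: "auth/login.ts", kind: "imports" },
    { from: "auth/login.ts", target: "not-indexed.ts", kind: "imports" },
  ],
);
const dependsOn = (mesh.outEdges.get("auth/login.ts") ?? []).map((e) => e.to);
const dependedOnBy = (mesh.inEdges.get("auth/login.ts") ?? []).map((e) => e.from);
```

Each pass touches every node or reference once, which is where the linear build time comes from in this sketch.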

What Gets Parsed

Mikk uses OXC (Rust-backed, significantly faster than the TypeScript Compiler API) for TypeScript and JavaScript, a native Go extractor for Go, and Tree-sitter for all other supported languages.

| Language | Parser | What is extracted |
| --- | --- | --- |
| TypeScript / JavaScript | OXC | Functions, classes, generics, imports, exports, call graph, routes, variables |
| Go | Native | Functions, structs, imports, methods, call graph |
| Python, Java, Kotlin (.kt, .kts), C#, Rust, C++, C, PHP, Ruby, Swift | Tree-sitter | Functions, classes, imports, call graph |

Every function stores its exact file path, line range, call list, hash, and purpose. When your AI calls mikk_get_function_detail, it receives the real body — not a guess.

ID Format

Function IDs are stable and unambiguous. They contain no line numbers, so a function that moves from line 42 to line 50 keeps its ID and its caches stay valid.

fn:<absolute-posix-path>:<FunctionName>
fn:<absolute-posix-path>:<FunctionName>#2   (same-name collision — second occurrence)
class:<absolute-posix-path>:<ClassName>
type:<absolute-posix-path>:<TypeName>
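A small sketch of how IDs in this format could be allocated. The `makeIdAllocator` helper and the example path are hypothetical, and the assumption that a third occurrence would receive `#3` is an extrapolation from the `#2` rule above.

```typescript
type SymbolKind = "fn" | "class" | "type";

// Hypothetical helper: hands out IDs in declaration order and
// suffixes same-name collisions with #2, #3, ... (assumed numbering).
function makeIdAllocator(): (kind: SymbolKind, absPosixPath: string, name: string) => string {
  const seen = new Map<string, number>();
  return (kind, absPosixPath, name) => {
    const base = `${kind}:${absPosixPath}:${name}`;
    const count = (seen.get(base) ?? 0) + 1;
    seen.set(base, count);
    return count === 1 ? base : `${base}#${count}`;
  };
}

const nextId = makeIdAllocator();
const firstId = nextId("fn", "/repo/src/auth/login.ts", "validate");
const secondId = nextId("fn", "/repo/src/auth/login.ts", "validate");
const classId = nextId("class", "/repo/src/auth/login.ts", "Session");
// firstId  is "fn:/repo/src/auth/login.ts:validate"
// secondId is "fn:/repo/src/auth/login.ts:validate#2"
```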

Merkle-Tree Hashing

Mikk computes SHA-256 hashes at every level of the codebase hierarchy:

function hash  →  file hash  →  module hash  →  root hash

One root hash = full drift detection

If the root hash matches the stored hash, nothing changed — zero file reads required. If it doesn't match, the changed subtree is pinpointed by walking down the tree.
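The hierarchy can be illustrated with a short sketch that uses Node's built-in `crypto` module. How Mikk actually serializes children before hashing is not documented here, so the sorted `key=hash` encoding and the `Codebase` shape below are assumptions.

```typescript
import { createHash } from "node:crypto";

const sha256 = (input: string): string =>
  createHash("sha256").update(input).digest("hex");

// Hash one level of the tree: hash each child, then hash the sorted
// "key=childHash" lines so the result does not depend on key order.
function hashLevel<T>(items: Record<string, T>, hashChild: (child: T) => string): string {
  const lines = Object.keys(items).sort().map((key) => `${key}=${hashChild(items[key])}`);
  return sha256(lines.join("\n"));
}

// Hypothetical input shape: module -> file -> function name -> source body.
type FileFns = Record<string, string>;
type ModuleFiles = Record<string, FileFns>;
type Codebase = Record<string, ModuleFiles>;

const hashFile = (fns: FileFns): string => hashLevel(fns, sha256);
const hashModule = (files: ModuleFiles): string => hashLevel(files, hashFile);
const hashRoot = (modules: Codebase): string => hashLevel(modules, hashModule);

const before: Codebase = {
  auth: { "login.ts": { login: "return check(user)" } },
  api: { "routes.ts": { register: "app.get('/')" } },
};
const after: Codebase = {
  auth: { "login.ts": { login: "return check(user, now)" } },
  api: { "routes.ts": { register: "app.get('/')" } },
};

const rootChanged = hashRoot(before) !== hashRoot(after);                 // true
const apiUntouched = hashModule(before.api) === hashModule(after.api);    // true
```

Walking down from a mismatched root, only the `auth` subtree differs, so the `api` subtree never needs to be inspected.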

Incremental Analysis

Hash changed files

Only files modified since last run are re-hashed (mtime + content comparison).

Compare against stored hashes

Each file's hash is looked up in the lock file in O(1); unchanged files need no further disk reads.

Re-parse only mismatches

Only files whose hashes changed are re-parsed, using the parser for their language (OXC for TypeScript and JavaScript).

Propagate up the tree

Hash changes bubble up: function → file → module → root.

Atomic write

The updated mikk.lock.json is written to a temp file and then renamed into place, so a crash mid-write cannot leave a partially written lock file.

ADR updates in mikk.json follow the same atomic-write pattern.
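The temp-file-then-rename pattern looks roughly like this in Node. The `writeJsonAtomically` helper and the temp-file naming are assumptions for illustration; the sketch also omits an fsync, which a durable implementation would likely add.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a sibling temp file, then rename over the target.
// Keeping the temp file in the same directory keeps the rename on one
// filesystem, where POSIX rename replaces the target atomically.
function writeJsonAtomically(targetPath: string, data: unknown): void {
  const tempPath = `${targetPath}.${process.pid}.tmp`;
  writeFileSync(tempPath, JSON.stringify(data, null, 2));
  renameSync(tempPath, targetPath);
}

// Demo in a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "mikk-demo-"));
const lockPath = join(demoDir, "mikk.lock.json");
writeJsonAtomically(lockPath, { rootHash: "abc" });
writeJsonAtomically(lockPath, { rootHash: "def" });
const roundTrip = JSON.parse(readFileSync(lockPath, "utf8")) as { rootHash: string };
```

Readers see either the old file or the new one, never a half-written mix.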

A 100-file change in a 10,000-file project re-parses exactly 100 files.
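The comparison step can be sketched as a map lookup per file. The `HashIndex` type and `filesToReparse` function are hypothetical names; deleted files are ignored to keep the example short.

```typescript
// Hypothetical index: file path -> content hash.
type HashIndex = Map<string, string>;

// Return the files whose current hash differs from (or is missing in) the stored index.
function filesToReparse(stored: HashIndex, current: HashIndex): string[] {
  const changed: string[] = [];
  current.forEach((hash, file) => {
    if (stored.get(file) !== hash) changed.push(file);
  });
  return changed.sort();
}

const storedHashes: HashIndex = new Map([
  ["src/a.ts", "h1"],
  ["src/b.ts", "h2"],
  ["src/c.ts", "h3"],
]);
const currentHashes: HashIndex = new Map([
  ["src/a.ts", "h1"],      // unchanged
  ["src/b.ts", "h2-new"],  // modified
  ["src/c.ts", "h3"],      // unchanged
  ["src/d.ts", "h4"],      // new file
]);
const toReparse = filesToReparse(storedHashes, currentHashes);
// toReparse is ["src/b.ts", "src/d.ts"]
```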


Module Clusters

Mikk groups files into logical modules using greedy agglomeration — analyzing import coupling, directory structure, and naming conventions. Each module has a confidence score.

mikk.json (auto-generated, then edited by you)
{
  "declared": {
    "modules": [
      {
        "id": "auth",
        "name": "Authentication",
        "description": "JWT auth and session management",
        "paths": ["src/auth/**"],
        "entryFunctions": ["login", "validateToken"]
      }
    ],
    "constraints": [
      "auth must not import from payments"
    ]
  }
}

Modules can be auto-detected via mikk init or manually declared in mikk.json.


AI Context Builder

The Context Builder turns a task description into a structured, token-budgeted payload for your LLM using BFS graph traversal.

Seed

Task keywords are matched against function names, purposes, and module intents using BM25 + fuzzy matching.

Walk

BFS traces the call graph outward from seed functions. Configurable maxHops (default: 4, max: 12).

Score

Each function receives a composite score: graph proximity + keyword match + entry-point bonus + module relevance.

Budget

Greedy knapsack: highest-scoring functions packed within the token limit. Token cost pre-computed from source bodies.

Format

Output formatted per provider: XML tags for Claude · plain text for generic · compact for tight budgets.
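The Walk and Budget steps can be sketched as below. The `walk` and `pack` functions and the `Candidate` shape are hypothetical, and the scores are supplied by hand rather than computed from the composite formula described above.

```typescript
interface Candidate { id: string; score: number; tokens: number }

// Walk: breadth-first over the call graph, recording the hop count at which
// each function is first reached and stopping after maxHops levels.
function walk(callGraph: Map<string, string[]>, seeds: string[], maxHops = 4): Map<string, number> {
  const hops = new Map<string, number>();
  for (const seed of seeds) hops.set(seed, 0);
  let frontier = Array.from(hops.keys());
  for (let depth = 1; depth <= maxHops && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const callee of callGraph.get(id) ?? []) {
        if (!hops.has(callee)) {
          hops.set(callee, depth);
          next.push(callee);
        }
      }
    }
    frontier = next;
  }
  return hops;
}

// Budget: greedy packing, highest score first, skipping anything that no longer fits.
function pack(candidates: Candidate[], tokenLimit: number): Candidate[] {
  const picked: Candidate[] = [];
  let used = 0;
  for (const c of candidates.slice().sort((a, b) => b.score - a.score)) {
    if (used + c.tokens <= tokenLimit) {
      picked.push(c);
      used += c.tokens;
    }
  }
  return picked;
}

const callGraph = new Map<string, string[]>([
  ["login", ["validateToken", "logAttempt"]],
  ["validateToken", ["decodeJwt"]],
  ["decodeJwt", ["base64Decode"]],
]);
const reached = walk(callGraph, ["login"], 2); // base64Decode is 3 hops away, so it is excluded
const packed = pack(
  [
    { id: "login", score: 0.9, tokens: 60 },
    { id: "validateToken", score: 0.8, tokens: 50 },
    { id: "decodeJwt", score: 0.5, tokens: 40 },
  ],
  100,
).map((c) => c.id);
// packed is ["login", "decodeJwt"]: validateToken would exceed the 100-token limit
```

Greedy packing is fast but not optimal; here it leaves out the second-highest score because it arrives after the budget is mostly spent.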

Strict Mode

mikk_query_context supports a strict flag that restricts results to exact keyword matches. With autoFallback: true (the default), the query is retried in balanced mode when strict mode returns nothing.


Intent Engine

The Intent Engine is the safety layer that validates and guards every edit before it lands. It has five cooperating components.


1. Intent Understanding

Analyzes commit messages, branch names, and change patterns to determine whether breaking changes are intentional. Confidence is derived from:

  • Explicit markers in commit messages: BREAKING:, REFACTOR:, MIGRATION:
  • Branch naming patterns: refactor/, breaking/, v2/
  • Change pattern analysis: systematic renames, signature changes, new exports
  • Migration ADRs declared in mikk.json

# Intentional: explicit marker present
git commit -m "BREAKING: rename verifyToken to validateJwt"

# Risky — exported API changed without explicit intent
# → IntentUnderstanding flags this as unintentional breaking change
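A simplified sketch of the first two signals, commit markers and branch prefixes. The `detectIntent` function and its boolean result are assumptions; the real component derives a confidence value and also considers change patterns and migration ADRs, which are left out here.

```typescript
const COMMIT_MARKERS = ["BREAKING:", "REFACTOR:", "MIGRATION:"];
const BRANCH_PREFIXES = ["refactor/", "breaking/", "v2/"];

interface IntentSignal { intentional: boolean; reasons: string[] }

// Hypothetical check: any explicit marker or matching branch prefix counts as intent.
function detectIntent(commitMessage: string, branch: string): IntentSignal {
  const reasons: string[] = [];
  for (const marker of COMMIT_MARKERS) {
    if (commitMessage.includes(marker)) reasons.push(`commit marker ${marker}`);
  }
  for (const prefix of BRANCH_PREFIXES) {
    if (branch.startsWith(prefix)) reasons.push(`branch prefix ${prefix}`);
  }
  return { intentional: reasons.length > 0, reasons };
}

const marked = detectIntent("BREAKING: rename verifyToken to validateJwt", "main");
const unmarked = detectIntent("rename verifyToken to validateJwt", "feature/login");
const byBranch = detectIntent("tidy helpers", "refactor/auth");
// marked.intentional is true, unmarked.intentional is false, byBranch.intentional is true
```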

2. Safety Gates

Six enforced gates run before any edit is applied. Each gate returns canProceed, a severity (BLOCKING or WARNING), and a bypass command when applicable.

| Gate | Default behaviour | Bypassable |
| --- | --- | --- |
| RISK_SCORE | BLOCK at risk ≥ 90; WARN when risk > maxRiskScore policy | Yes (except ≥ 90) |
| IMPACT_SCALE | BLOCK when impacted functions > maxImpactNodes × 2; WARN when > maxImpactNodes | Yes |
| PROTECTED_MODULE | BLOCK if any protected module is touched | Never |
| BREAKING_CHANGE | BLOCK exported API changes without BREAKING: commit marker | Yes |
| TEST_COVERAGE | BLOCK high-risk changes without test file modifications | Yes |
| DOCUMENTATION | BLOCK significant API changes without doc file updates | Yes |

Configure thresholds in mikk.json under policies:

{
  "policies": {
    "maxRiskScore": 70,
    "maxImpactNodes": 10,
    "protectedModules": ["auth", "billing"],
    "enforceStrictBoundaries": true,
    "requireTestsForChangedFiles": true,
    "requireDocumentationForApiChanges": false
  }
}
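To show how these thresholds interact, here is a sketch of the RISK_SCORE and IMPACT_SCALE gates under the policy values above. The `GateResult` shape is hypothetical, and treating a WARNING as `canProceed: true` is an assumption.

```typescript
type Severity = "BLOCKING" | "WARNING";
interface GateResult { gate: string; canProceed: boolean; severity?: Severity }
interface GatePolicies { maxRiskScore: number; maxImpactNodes: number }

function riskScoreGate(risk: number, p: GatePolicies): GateResult {
  // Risk of 90 or more always blocks, regardless of policy.
  if (risk >= 90) return { gate: "RISK_SCORE", canProceed: false, severity: "BLOCKING" };
  if (risk > p.maxRiskScore) return { gate: "RISK_SCORE", canProceed: true, severity: "WARNING" };
  return { gate: "RISK_SCORE", canProceed: true };
}

function impactScaleGate(impacted: number, p: GatePolicies): GateResult {
  // More than double the policy limit blocks; above the limit warns.
  if (impacted > p.maxImpactNodes * 2) return { gate: "IMPACT_SCALE", canProceed: false, severity: "BLOCKING" };
  if (impacted > p.maxImpactNodes) return { gate: "IMPACT_SCALE", canProceed: true, severity: "WARNING" };
  return { gate: "IMPACT_SCALE", canProceed: true };
}

const gatePolicies: GatePolicies = { maxRiskScore: 70, maxImpactNodes: 10 };
const riskWarn = riskScoreGate(75, gatePolicies);      // WARNING: above 70, below 90
const riskBlock = riskScoreGate(95, gatePolicies);     // BLOCKING
const impactWarn = impactScaleGate(20, gatePolicies);  // WARNING: above 10, not above 20
const impactBlock = impactScaleGate(21, gatePolicies); // BLOCKING
```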

3. Decision Engine

Evaluates impact analysis results against policies and returns a three-state verdict:

| Status | Meaning |
| --- | --- |
| APPROVED | Risk score within policy, no violations |
| WARNING | Risk or impact exceeds a policy threshold; review recommended |
| BLOCKED | Critical risk (≥ 90), protected module touched, strict boundaries violated |

The Decision Engine checks:

  1. Absolute risk threshold (≥ 90 always blocks)
  2. Impact node count vs maxImpactNodes policy
  3. Protected modules — uses riskScore to identify high/critical nodes, then matches against file paths
  4. Strict boundary enforcement for critical cross-module hits
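The four checks can be folded into a single verdict function, sketched below. The `ImpactSummary` fields are hypothetical, and the protected-module check is simplified to a module-name match rather than the riskScore and file-path matching described above.

```typescript
type Verdict = "APPROVED" | "WARNING" | "BLOCKED";

interface ImpactSummary {
  riskScore: number;
  impactedNodes: number;
  touchedModules: string[];
  criticalBoundaryHits: number;
}
interface DecisionPolicies {
  maxRiskScore: number;
  maxImpactNodes: number;
  protectedModules: string[];
  enforceStrictBoundaries: boolean;
}

function decide(impact: ImpactSummary, policies: DecisionPolicies): Verdict {
  const touchesProtected = impact.touchedModules.some(
    (m) => policies.protectedModules.indexOf(m) !== -1,
  );
  const boundaryBlocked = policies.enforceStrictBoundaries && impact.criticalBoundaryHits > 0;
  // Blocking conditions are checked first so they cannot be downgraded to a warning.
  if (impact.riskScore >= 90 || touchesProtected || boundaryBlocked) return "BLOCKED";
  if (impact.riskScore > policies.maxRiskScore || impact.impactedNodes > policies.maxImpactNodes) {
    return "WARNING";
  }
  return "APPROVED";
}

const decisionPolicies: DecisionPolicies = {
  maxRiskScore: 70,
  maxImpactNodes: 10,
  protectedModules: ["auth", "billing"],
  enforceStrictBoundaries: true,
};
const safeEdit = decide(
  { riskScore: 40, impactedNodes: 3, touchedModules: ["ui"], criticalBoundaryHits: 0 },
  decisionPolicies,
); // "APPROVED"
const wideEdit = decide(
  { riskScore: 40, impactedNodes: 14, touchedModules: ["ui"], criticalBoundaryHits: 0 },
  decisionPolicies,
); // "WARNING"
const authEdit = decide(
  { riskScore: 40, impactedNodes: 3, touchedModules: ["auth"], criticalBoundaryHits: 0 },
  decisionPolicies,
); // "BLOCKED"
```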

4. Risk Engine

Computes a quantitative risk score (0–100) per function based on its structural position in the graph:

score = (connectedNodes × 1.5) + (dependencyDepth × 2)
      + 30 (if auth/security keywords in name or file)
      + 20 (if database/state keywords in name or file)
      + 15 (if function is exported / public API)

Score is clamped to [0, 100]. The engine is instantiated once per compile run — not per function — to avoid O(n) BFS repetition.
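A worked version of the formula, assuming the keyword and export checks have already been reduced to booleans (the `RiskInput` field names are hypothetical):

```typescript
interface RiskInput {
  connectedNodes: number;
  dependencyDepth: number;
  hasAuthKeyword: boolean;   // auth/security keywords in name or file
  hasDataKeyword: boolean;   // database/state keywords in name or file
  isExported: boolean;       // exported / public API
}

function riskScore(input: RiskInput): number {
  const raw =
    input.connectedNodes * 1.5 +
    input.dependencyDepth * 2 +
    (input.hasAuthKeyword ? 30 : 0) +
    (input.hasDataKeyword ? 20 : 0) +
    (input.isExported ? 15 : 0);
  // Clamp to [0, 100].
  return Math.min(100, Math.max(0, raw));
}

// 10 × 1.5 + 3 × 2 + 30 + 0 + 15 = 66
const moderate = riskScore({
  connectedNodes: 10, dependencyDepth: 3,
  hasAuthKeyword: true, hasDataKeyword: false, isExported: true,
});

// 40 × 1.5 + 5 × 2 + 30 + 20 + 15 = 135, clamped to 100
const saturated = riskScore({
  connectedNodes: 40, dependencyDepth: 5,
  hasAuthKeyword: true, hasDataKeyword: true, isExported: true,
});
```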


5. Auto-Correction Engine

Detects and auto-fixes common issues without AI involvement:

| Issue type | Detection | Fix |
| --- | --- | --- |
| Broken references | Call targets absent from lock | Finds nearest Levenshtein-distance match (threshold: 3); replaces whole-word occurrences in source |
| Missing imports | Resolved import path absent from lock | Finds moved file by basename lookup; replaces path in import statement |
| Boundary violations | Cross-module calls that violate constraints | Prepends a // TODO [mikk]: Boundary violation comment with adapter suggestion |

All file writes are path-traversal guarded — only files inside projectRoot are modified. The engine returns a full CorrectionResult with appliedFixes, failedFixes, and filesModified.
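The broken-reference lookup can be illustrated with a standard Levenshtein distance and a nearest-match search. The `nearestMatch` helper is hypothetical, and treating the threshold as inclusive (distance ≤ 3) is an assumption.

```typescript
// Classic dynamic-programming edit distance, keeping only two rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the closest known name within the threshold, or undefined if none qualifies.
function nearestMatch(missing: string, known: string[], threshold = 3): string | undefined {
  let best: string | undefined;
  let bestDistance = threshold + 1;
  for (const candidate of known) {
    const distance = levenshtein(missing, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

const knownFunctions = ["validateToken", "login", "refreshSession"];
const fixed = nearestMatch("validateTokn", knownFunctions); // "validateToken" (distance 1)
const unfixable = nearestMatch("sendEmail", knownFunctions); // undefined: nothing within 3 edits
```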


Confidence Engine

The Confidence Engine computes path-level confidence for impact analysis. Each edge in the graph carries a confidence score (1.0 for direct AST-confirmed calls, 0.6–0.8 for resolved imports). Path confidence is the product of edge confidences along the traversal path.

confidence(path) = edge₁.confidence × edge₂.confidence × … × edgeₙ.confidence

The engine checks both outEdges and inEdges when resolving path connections, ensuring correctness regardless of traversal direction.
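As a quick worked example of the product rule, using edge values from the ranges quoted above (the `pathConfidence` name and the choice to return 1 for an empty path are assumptions):

```typescript
// Multiply edge confidences along a path; an empty path is treated as fully confident.
function pathConfidence(edgeConfidences: number[]): number {
  return edgeConfidences.reduce((product, c) => product * c, 1);
}

const allDirect = pathConfidence([1.0, 1.0]);       // stays at 1
const mixedPath = pathConfidence([1.0, 0.8, 0.6]);  // roughly 0.48
```

Because every factor is at most 1, confidence can only fall as a path gets longer, so distant impacts are naturally weighted below direct ones.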


AI Context Files

Mikk generates two context files that give AI agents an architecture-aware view of the codebase:

| File | For | Contents |
| --- | --- | --- |
| claude.md | Claude Desktop/Cursor | Modules, functions, call graph, routes |
| AGENTS.md | OpenCode/OpenClaw/Copilot | Same content with different formatting |

Key differences from RAG:

  • Graph-traced: Follows call edges from relevant functions
  • Token-budgeted: Respects your token limit
  • Auto-updated: Regenerates on mikk analyze
