Core Concepts
The Mesh, Merkle-tree hashing, AI Context Builder, Intent Engine, Safety Gates, Decision Engine, Risk Engine, and Auto-Correction — the foundational primitives of Mikk.
The Mesh
The Mesh is Mikk's central data structure — a directed graph where nodes are files, functions, classes, generics, and variables, and edges are imports, calls, containment, inheritance, and access relationships.
Two-pass construction — O(n) build time
Pass 1 — Nodes: Every file, function, class, generic declaration, and variable is registered.
Pass 2 — Edges: Import, call, containment, extends, implements, and access edges are resolved against the full node index.
The result is a pair of adjacency maps, outEdges and inEdges, so a node's neighbours can be looked up in O(1) in either direction.
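As a rough illustration of the two passes, here is a minimal sketch. The MeshNode, MeshEdge, and buildMesh names are hypothetical and only the outEdges and inEdges maps come from the description above; Mikk's real types may differ.

```typescript
// Hypothetical shapes for illustration; not Mikk's actual types.
type MeshNode = { id: string; kind: "file" | "function" | "class" | "generic" | "variable" };
type MeshEdge = { from: string; to: string; kind: "imports" | "calls" | "contains" | "extends" | "implements" | "accesses" };

interface Mesh {
  nodes: Map<string, MeshNode>;
  outEdges: Map<string, MeshEdge[]>; // forward: what I depend on
  inEdges: Map<string, MeshEdge[]>; // reverse: who depends on me
}

function buildMesh(nodes: MeshNode[], candidateEdges: MeshEdge[]): Mesh {
  const mesh: Mesh = { nodes: new Map(), outEdges: new Map(), inEdges: new Map() };

  // Pass 1: register every node so edge resolution sees the full index.
  for (const node of nodes) {
    mesh.nodes.set(node.id, node);
    mesh.outEdges.set(node.id, []);
    mesh.inEdges.set(node.id, []);
  }

  // Pass 2: keep only edges whose endpoints both exist, indexed in both directions.
  for (const edge of candidateEdges) {
    if (!mesh.nodes.has(edge.from) || !mesh.nodes.has(edge.to)) continue;
    mesh.outEdges.get(edge.from)!.push(edge);
    mesh.inEdges.get(edge.to)!.push(edge);
  }
  return mesh;
}

const mesh = buildMesh(
  [
    { id: "auth/login.ts", kind: "file" },
    { id: "crypto/jwt.ts", kind: "file" },
    { id: "api/routes.ts", kind: "file" },
  ],
  [
    { from: "auth/login.ts", to: "crypto/jwt.ts", kind: "imports" },
    { from: "api/routes.ts", to: "auth/login.ts", kind: "imports" },
  ],
);
```

Each node and each edge is visited once, which is where the linear build time comes from. Looking up mesh.inEdges.get("auth/login.ts") answers "who depends on me" without scanning the whole graph.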
auth/login.ts
→ imports → crypto/jwt.ts (forward: what I depend on)
← imported by ← api/routes.ts (reverse: who depends on me)
What Gets Parsed
Mikk uses OXC (Rust-backed, significantly faster than the TypeScript Compiler API) for TypeScript and JavaScript, a native Go extractor for Go, and Tree-sitter for all other supported languages.
| Language | Parser | What is extracted |
|---|---|---|
| TypeScript / JavaScript | OXC | Functions, classes, generics, imports, exports, call graph, routes, variables |
| Go | Native | Functions, structs, imports, methods, call graph |
| Python, Java, Kotlin (.kt, .kts), C#, Rust, C++, C, PHP, Ruby, Swift | Tree-sitter | Functions, classes, imports, call graph |
Every function stores its exact file path, line range, call list, hash, and purpose. When your AI calls mikk_get_function_detail, it receives the real body — not a guess.
ID Format
Function IDs are stable and unambiguous. They contain no line numbers, so moving a function from line 42 to line 50 does not invalidate caches.
fn:<absolute-posix-path>:<FunctionName>
fn:<absolute-posix-path>:<FunctionName>#2 (same-name collision — second occurrence)
class:<absolute-posix-path>:<ClassName>
type:<absolute-posix-path>:<TypeName>
Merkle-Tree Hashing
Mikk computes SHA-256 hashes at every level of the codebase hierarchy:
function hash → file hash → module hash → root hash
One root hash = full drift detection
If the root hash matches the stored hash, nothing changed — zero file reads required. If it doesn't match, the changed subtree is pinpointed by walking down the tree.
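The hierarchy can be sketched as a recursive hash over a nested tree. This is a minimal sketch under assumptions: the HashTree shape and the merkleHash name are hypothetical, and the way child hashes are combined (sorted "name:hash" lines) is one reasonable choice, not necessarily the one Mikk uses.

```typescript
import { createHash } from "node:crypto";

const sha256 = (input: string): string => createHash("sha256").update(input).digest("hex");

// Hypothetical tree: a string leaf is a function body; an object is a file, module, or root.
type HashTree = string | { [key: string]: HashTree };

function merkleHash(tree: HashTree): string {
  if (typeof tree === "string") return sha256(tree); // leaf: function hash

  // Sort children by name so the parent hash does not depend on insertion order.
  const lines = Object.entries(tree)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, child]) => `${key}:${merkleHash(child)}`);
  return sha256(lines.join("\n"));
}

// module -> file -> function -> body
const before = merkleHash({ auth: { "login.ts": { login: "return 1" } } });
const after = merkleHash({ auth: { "login.ts": { login: "return 2" } } });
console.log(before === after); // false: one changed function body changes the root hash
```

Because every parent hash is derived from its children, comparing two root hashes is enough to know whether anything below them differs.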
Incremental Analysis
Hash changed files
Only files modified since last run are re-hashed (mtime + content comparison).
Compare against stored hashes
Lock file hashes compared in O(1) — no disk reads for unchanged files.
Re-parse only mismatches
Only files with changed hashes are re-parsed, using the parser for their language (OXC for TypeScript and JavaScript).
Propagate up the tree
Hash changes bubble up: function → file → module → root.
Atomic write
Updated mikk.lock.json is written via temp file → rename, so a crash mid-write leaves the previous lock file intact.
ADR updates in mikk.json follow the same atomic-write pattern.
A 100-file change in a 10,000-file project re-parses exactly 100 files.
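The compare step can be sketched as a lookup against the stored hashes. The HashIndex shape and the filesToReparse name are hypothetical; the actual layout of mikk.lock.json is not shown here.

```typescript
// Hypothetical index: file path -> content hash.
type HashIndex = Map<string, string>;

function filesToReparse(stored: HashIndex, current: HashIndex): string[] {
  const changed: string[] = [];
  current.forEach((hash, filePath) => {
    // Map lookups are O(1). A new file has no stored hash, so it counts as changed.
    if (stored.get(filePath) !== hash) changed.push(filePath);
  });
  return changed;
}

const stored: HashIndex = new Map([
  ["src/a.ts", "h1"],
  ["src/b.ts", "h2"],
]);
const current: HashIndex = new Map([
  ["src/a.ts", "h1"],
  ["src/b.ts", "h3"], // modified
  ["src/c.ts", "h4"], // new
]);
console.log(filesToReparse(stored, current)); // [ 'src/b.ts', 'src/c.ts' ]
```

Deleted files (present in stored but absent from current) would need a second pass in the other direction; that is left out of this sketch.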
Module Clusters
Mikk groups files into logical modules using greedy agglomeration — analyzing import coupling, directory structure, and naming conventions. Each module has a confidence score.
{
  "declared": {
    "modules": [
      {
        "id": "auth",
        "name": "Authentication",
        "description": "JWT auth and session management",
        "paths": ["src/auth/**"],
        "entryFunctions": ["login", "validateToken"]
      }
    ],
    "constraints": [
      "auth must not import from payments"
    ]
  }
}
Modules can be auto-detected via mikk init or manually declared in mikk.json.
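To show how a constraint such as "auth must not import from payments" could be checked against module-level import edges, here is a minimal sketch. It assumes constraints follow the single pattern "<module> must not import from <module>"; the full constraint grammar is not documented here, and parseConstraint and findViolations are hypothetical names.

```typescript
type ModuleImport = { fromModule: string; toModule: string };

// Assumed grammar: "<module> must not import from <module>". Anything else is ignored.
function parseConstraint(text: string): ModuleImport | null {
  const match = /^(\S+) must not import from (\S+)$/.exec(text.trim());
  if (!match) return null;
  const [, fromModule = "", toModule = ""] = match;
  return { fromModule, toModule };
}

function findViolations(constraints: string[], imports: ModuleImport[]): ModuleImport[] {
  const forbidden = constraints
    .map((text) => parseConstraint(text))
    .filter((c): c is ModuleImport => c !== null);
  return imports.filter((imp) =>
    forbidden.some((f) => f.fromModule === imp.fromModule && f.toModule === imp.toModule),
  );
}

const violations = findViolations(
  ["auth must not import from payments"],
  [
    { fromModule: "auth", toModule: "payments" }, // violates the constraint
    { fromModule: "payments", toModule: "auth" }, // allowed: direction matters
    { fromModule: "auth", toModule: "crypto" },
  ],
);
console.log(violations.length); // 1
```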
AI Context Builder
The Context Builder turns a task description into a structured, token-budgeted payload for your LLM using BFS graph traversal.
Seed
Task keywords are matched against function names, purposes, and module intents using BM25 + fuzzy matching.
Walk
BFS traces the call graph outward from seed functions. Configurable maxHops (default: 4, max: 12).
Score
Each function receives a composite score: graph proximity + keyword match + entry-point bonus + module relevance.
Budget
Greedy knapsack: highest-scoring functions packed within the token limit. Token cost pre-computed from source bodies.
Format
Output formatted per provider: XML tags for Claude · plain text for generic · compact for tight budgets.
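The Walk, Score, and Budget steps can be sketched together. This is a simplified sketch, not Mikk's implementation: collectContext is a hypothetical name, seeds are passed in rather than found by BM25, and the score uses graph proximity only, leaving out the keyword, entry-point, and module terms.

```typescript
function collectContext(
  callGraph: Map<string, string[]>,
  seeds: string[],
  tokenCost: Map<string, number>,
  tokenBudget: number,
  maxHops = 4,
): string[] {
  // Walk: breadth-first from the seeds, recording the hop count of first discovery.
  const hopsById = new Map<string, number>();
  for (const seed of seeds) hopsById.set(seed, 0);
  let frontier = [...seeds];
  for (let hop = 1; hop <= maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const callee of callGraph.get(id) ?? []) {
        if (hopsById.has(callee)) continue;
        hopsById.set(callee, hop);
        next.push(callee);
      }
    }
    frontier = next;
  }

  // Score: proximity only in this sketch; closer functions rank higher.
  const ranked = Array.from(hopsById, ([id, hops]) => ({
    id,
    score: 1 / (1 + hops),
    tokens: tokenCost.get(id) ?? 0,
  })).sort((a, b) => b.score - a.score);

  // Budget: greedy packing in score order, skipping anything that does not fit.
  const picked: string[] = [];
  let used = 0;
  for (const candidate of ranked) {
    if (used + candidate.tokens > tokenBudget) continue;
    picked.push(candidate.id);
    used += candidate.tokens;
  }
  return picked;
}
```

Greedy packing is not guaranteed to be optimal for a knapsack problem, but it is fast and keeps the highest-scoring functions, which is usually what matters for a prompt.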
Strict Mode
mikk_query_context supports a strict flag that restricts results to exact keyword matches only, with autoFallback: true (default) retrying balanced mode when strict returns nothing.
Intent Engine
The Intent Engine is the safety layer that validates and guards every edit before it lands. It has five cooperating components.
1. Intent Understanding
Analyzes commit messages, branch names, and change patterns to determine whether breaking changes are intentional. Confidence is derived from:
- Explicit markers in commit messages: BREAKING:, REFACTOR:, MIGRATION:
- Branch naming patterns: refactor/, breaking/, v2/
- Change pattern analysis: systematic renames, signature changes, new exports
- Migration ADRs declared in mikk.json
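The first two signals are simple string checks, sketched below. The detectIntentSignals name is hypothetical, and matching markers anywhere in the message is an assumption. How Mikk weights these signals into a confidence value is not documented here, so the sketch only reports which signals are present.

```typescript
const COMMIT_MARKERS = ["BREAKING:", "REFACTOR:", "MIGRATION:"];
const BRANCH_PREFIXES = ["refactor/", "breaking/", "v2/"];

interface IntentSignals {
  commitMarker: boolean;
  branchPattern: boolean;
}

function detectIntentSignals(commitMessage: string, branchName: string): IntentSignals {
  return {
    // Assumption: a marker counts wherever it appears in the message.
    commitMarker: COMMIT_MARKERS.some((marker) => commitMessage.includes(marker)),
    branchPattern: BRANCH_PREFIXES.some((prefix) => branchName.startsWith(prefix)),
  };
}

console.log(detectIntentSignals("BREAKING: rename verifyToken to validateJwt", "main"));
// { commitMarker: true, branchPattern: false }
```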
# Intentional — explicit marker present
git commit -m "BREAKING: rename verifyToken to validateJwt"
# Risky — exported API changed without explicit intent
# → IntentUnderstanding flags this as unintentional breaking change
2. Safety Gates
Six enforced gates run before any edit is applied. Each gate returns canProceed, a severity (BLOCKING or WARNING), and a bypass command when applicable.
| Gate | Default behaviour | Bypassable |
|---|---|---|
| RISK_SCORE | BLOCK at risk ≥ 90; WARN when risk > maxRiskScore policy | Yes (except ≥ 90) |
| IMPACT_SCALE | BLOCK when impacted functions > maxImpactNodes × 2; WARN when > maxImpactNodes | Yes |
| PROTECTED_MODULE | BLOCK if any protected module is touched | Never |
| BREAKING_CHANGE | BLOCK exported API changes without BREAKING: commit marker | Yes |
| TEST_COVERAGE | BLOCK high-risk changes without test file modifications | Yes |
| DOCUMENTATION | BLOCK significant API changes without doc file updates | Yes |
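The two threshold-based gates can be sketched directly from the table. The GateResult shape is a guess built around the documented canProceed and severity fields; the bypass command is modelled as a plain boolean, and treating a WARNING as canProceed: true is an assumption.

```typescript
type Severity = "BLOCKING" | "WARNING";

// Hypothetical result shape; only canProceed and severity are named in the docs.
interface GateResult {
  gate: string;
  canProceed: boolean;
  severity?: Severity;
  bypassable: boolean;
}

interface GatePolicies {
  maxRiskScore: number;
  maxImpactNodes: number;
}

function riskScoreGate(risk: number, policies: GatePolicies): GateResult {
  // Risk of 90 or more always blocks and cannot be bypassed.
  if (risk >= 90) return { gate: "RISK_SCORE", canProceed: false, severity: "BLOCKING", bypassable: false };
  if (risk > policies.maxRiskScore) return { gate: "RISK_SCORE", canProceed: true, severity: "WARNING", bypassable: true };
  return { gate: "RISK_SCORE", canProceed: true, bypassable: true };
}

function impactScaleGate(impacted: number, policies: GatePolicies): GateResult {
  if (impacted > policies.maxImpactNodes * 2) return { gate: "IMPACT_SCALE", canProceed: false, severity: "BLOCKING", bypassable: true };
  if (impacted > policies.maxImpactNodes) return { gate: "IMPACT_SCALE", canProceed: true, severity: "WARNING", bypassable: true };
  return { gate: "IMPACT_SCALE", canProceed: true, bypassable: true };
}

const gatePolicies: GatePolicies = { maxRiskScore: 70, maxImpactNodes: 10 };
console.log(riskScoreGate(75, gatePolicies).severity); // "WARNING"
console.log(impactScaleGate(21, gatePolicies).canProceed); // false
```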
Configure thresholds in mikk.json under policies:
{
  "policies": {
    "maxRiskScore": 70,
    "maxImpactNodes": 10,
    "protectedModules": ["auth", "billing"],
    "enforceStrictBoundaries": true,
    "requireTestsForChangedFiles": true,
    "requireDocumentationForApiChanges": false
  }
}
3. Decision Engine
Evaluates impact analysis results against policies and returns a three-state verdict:
| Status | Meaning |
|---|---|
| APPROVED | Risk score within policy, no violations |
| WARNING | Risk or impact exceeds a policy threshold; review recommended |
| BLOCKED | Critical risk (≥ 90), protected module touched, strict boundaries violated |
The Decision Engine checks:
- Absolute risk threshold (≥ 90 always blocks)
- Impact node count vs maxImpactNodes policy
- Protected modules: uses riskScore to identify high/critical nodes, then matches against file paths
- Strict boundary enforcement for critical cross-module hits
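The precedence between the three verdicts can be sketched as a short function. ImpactSummary and decide are hypothetical names, and the protected-module and boundary checks are assumed to have been computed already and passed in as booleans.

```typescript
type Verdict = "APPROVED" | "WARNING" | "BLOCKED";

// Hypothetical summary of an impact analysis run.
interface ImpactSummary {
  maxRisk: number; // highest risk score among impacted functions
  impactedCount: number;
  touchesProtectedModule: boolean;
  strictBoundaryViolation: boolean;
}

interface DecisionPolicies {
  maxRiskScore: number;
  maxImpactNodes: number;
  enforceStrictBoundaries: boolean;
}

function decide(impact: ImpactSummary, policies: DecisionPolicies): Verdict {
  // Blocking conditions are checked first so they always win over warnings.
  if (impact.maxRisk >= 90 || impact.touchesProtectedModule) return "BLOCKED";
  if (policies.enforceStrictBoundaries && impact.strictBoundaryViolation) return "BLOCKED";
  if (impact.maxRisk > policies.maxRiskScore || impact.impactedCount > policies.maxImpactNodes) return "WARNING";
  return "APPROVED";
}

const decisionPolicies: DecisionPolicies = { maxRiskScore: 70, maxImpactNodes: 10, enforceStrictBoundaries: true };
console.log(decide({ maxRisk: 40, impactedCount: 12, touchesProtectedModule: false, strictBoundaryViolation: false }, decisionPolicies));
// "WARNING"
```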
4. Risk Engine
Computes a quantitative risk score (0–100) per function based on its structural position in the graph:
score = (connectedNodes × 1.5) + (dependencyDepth × 2)
+ 30 (if auth/security keywords in name or file)
+ 20 (if database/state keywords in name or file)
+ 15 (if function is exported / public API)
Score is clamped to [0, 100]. The engine is instantiated once per compile run, not per function, to avoid O(n) BFS repetition.
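The formula translates almost line for line into code. In this sketch the keyword lists are placeholders, since the actual auth/security and database/state keywords are not listed here, and RiskInput is a hypothetical input shape.

```typescript
// Placeholder keyword lists; the real lists are not documented here.
const AUTH_KEYWORDS = ["auth", "login", "token", "password", "security"];
const DATA_KEYWORDS = ["db", "database", "store", "state"];

interface RiskInput {
  name: string;
  file: string;
  connectedNodes: number;
  dependencyDepth: number;
  exported: boolean;
}

function riskScore(fn: RiskInput): number {
  const haystack = `${fn.name} ${fn.file}`.toLowerCase();
  const mentions = (words: string[]): boolean => words.some((word) => haystack.includes(word));

  let score = fn.connectedNodes * 1.5 + fn.dependencyDepth * 2;
  if (mentions(AUTH_KEYWORDS)) score += 30;
  if (mentions(DATA_KEYWORDS)) score += 20;
  if (fn.exported) score += 15;
  return Math.min(100, Math.max(0, score)); // clamp to [0, 100]
}

console.log(
  riskScore({ name: "validateToken", file: "src/auth/jwt.ts", connectedNodes: 10, dependencyDepth: 3, exported: true }),
); // 66
```

For the example above: 10 × 1.5 + 3 × 2 = 21, plus 30 for the auth keyword and 15 for being exported, gives 66.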
5. Auto-Correction Engine
Detects and auto-fixes common issues without AI involvement:
| Issue type | Detection | Fix |
|---|---|---|
| Broken references | Call targets absent from lock | Finds nearest Levenshtein-distance match (threshold: 3); replaces whole-word occurrences in source |
| Missing imports | Resolved import path absent from lock | Finds moved file by basename lookup; replaces path in import statement |
| Boundary violations | Cross-module calls that violate constraints | Prepends a // TODO [mikk]: Boundary violation comment with adapter suggestion |
All file writes are path-traversal guarded — only files inside projectRoot are modified. The engine returns a full CorrectionResult with appliedFixes, failedFixes, and filesModified.
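The broken-reference fix can be sketched as a nearest-match search followed by a whole-word replacement. The function names are hypothetical, and reading "threshold: 3" as "accept matches at distance 3 or less" is an assumption.

```typescript
// Levenshtein edit distance, computed one table row at a time.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest known symbol within the threshold, or null if none is close enough.
function nearestSymbol(broken: string, known: string[], threshold = 3): string | null {
  let best: string | null = null;
  let bestDistance = threshold + 1;
  for (const candidate of known) {
    const distance = levenshtein(broken, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// Replaces whole-word occurrences only, so longer identifiers are left alone.
function replaceWholeWord(source: string, from: string, to: string): string {
  const escaped = from.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return source.replace(new RegExp(`\\b${escaped}\\b`, "g"), () => to);
}

const fix = nearestSymbol("verifyTokn", ["verifyToken", "validateJwt", "login"]);
console.log(fix); // "verifyToken"
```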
Confidence Engine
The Confidence Engine computes path-level confidence for impact analysis. Each edge in the graph carries a confidence score (1.0 for direct AST-confirmed calls, 0.6–0.8 for resolved imports). Path confidence is the product of edge confidences along the traversal path.
confidence(path) = edge₁.confidence × edge₂.confidence × … × edgeₙ.confidence
The engine checks both outEdges and inEdges when resolving path connections, ensuring correctness regardless of traversal direction.
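As a small worked example of the product, here is a sketch with a hypothetical pathConfidence helper and illustrative edge values taken from the ranges above.

```typescript
// Multiplies the confidence of every edge along a path; an empty path has confidence 1.
function pathConfidence(edgeConfidences: number[]): number {
  return edgeConfidences.reduce((product, confidence) => product * confidence, 1);
}

// One direct AST-confirmed call followed by two resolved imports.
const confidence = pathConfidence([1.0, 0.8, 0.6]);
console.log(confidence.toFixed(2)); // "0.48"
```

Because every factor is at most 1.0, confidence can only stay flat or shrink as a path gets longer, so distant impacts are reported with less certainty than direct ones.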
AI Context Files
Mikk generates two AI context files that give your AI agents architecture-aware context:
| File | For | Contents |
|---|---|---|
| claude.md | Claude Desktop/Cursor | Modules, functions, call graph, routes |
| AGENTS.md | OpenCode/OpenClaw/Copilot | Same content with different formatting |
Key differences from RAG:
- Graph-traced: Follows call edges from relevant functions
- Token-budgeted: Respects your token limit
- Auto-updated: Regenerates on mikk analyze