# Building SCYTHE: From Android Plugin to Living Intelligence Organism
> *How we turned a tactical RF sensor tool into a self-healing, graph-cognitive threat intelligence engine — one late night at a time.*

---

## The Origin: A Map, a Radio, and a Crazy Idea
It started with a simple enough premise: take [ATAK (Android Tactical Assault Kit)](https://github.com/deptofdefense/AndroidTacticalAssaultKit-CIV), the U.S. Department of Defense’s open-source situational awareness platform, and wire it up to a real-time RF signal intelligence backend.
The name: **SCYTHE** — a system for cross-domain signals, hypergraph analytics, and threat environment mapping.
What followed was 23 engineering checkpoints, hundreds of log lines, three complete architectural pivots, and a system that now does things we didn’t plan for when we started.
Here’s the full story.

---

## Phase 1 — The Plugin That Almost Wasn’t
The original approach was straightforward: build an ATAK plugin that streams CoT (Cursor-on-Target) entity events into a Python analytics backend. Four tabs, RF signal dots on a map, swarm detection.
We built it. It worked.
Then we deployed to a Pixel 7 Pro running **Android 16**.
```
FATAL: libsqlite3.so not found
Process: com.atakmap.app.civ, Signal: SIGABRT
```
ATAK CIV 4.6.0 was compiled against pre-bionic-hardening ABI. Android 16 removed `libsqlite3.so` from the system image. The app crashed before it could load a single plugin.
**The plugin worked perfectly. The host app was the problem.**
Rather than wait for TAK.gov to ship ATAK 5.x, we pivoted to a **standalone WebView shell** — `ScytheCommandApp` — loading the full 35K-line `command-ops-visualization.html` directly from the orchestrator. The Cesium 3D globe, RF hypergraph visualization, and recon entity rendering all carried over immediately. Zero GDAL fragility.
*The lesson: don’t fight the platform. Route around it.*

---

## Phase 2 — The Intelligence Backend: Hypergraph Analytics Engine
While the Android story was playing out, the Python backend was growing into something much larger.
The core data model is a **hypergraph**: nodes are entities (hosts, sessions, RF signatures, network flows), and edges encode *relationships* — not just “A connected to B” but typed, confidence-weighted assertions like `INFERRED_LATERAL_MOVEMENT`, `TRIANGULATED_FROM`, `CO_MOVED_WITH`.
```
HGNode(id="172.234.197.23", kind="ip_host")
  └─ INFERRED_FLOW_IN_SERVICE → HGNode("10.0.0.44", kind="ip_host")
      └─ INFERRED_LATERAL_MOVEMENT → HGNode("SESSION-911b32eb", kind="session")
          └─ CO_MOVED_WITH → HGNode("RF-A4:B1:…", kind="rf_emitter")
```
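A minimal sketch of that data model looks like this (illustrative only; the real `HGNode`/`HGEdge` classes carry far more state):

```python
from dataclasses import dataclass, field

@dataclass
class HGNode:
    id: str
    kind: str                  # "ip_host" | "session" | "rf_emitter" | ...

@dataclass
class HGEdge:
    src: str                   # node id
    dst: str                   # node id
    kind: str                  # typed assertion, e.g. "INFERRED_LATERAL_MOVEMENT"
    confidence: float = 1.0    # weighted, not binary

@dataclass
class Hypergraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.id] = node

    def add_edge(self, edge):
        """Only accept edges whose endpoints already exist in the graph."""
        if edge.src in self.nodes and edge.dst in self.nodes:
            self.edges.append(edge)
            return True
        return False
```

The endpoint check in `add_edge` is exactly the strictness that the shadow graph (below) was built to soften.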
On top of this we built:
- **ScytheDuckStore** — DuckDB event store, 54ms bulk insert / 10ms scrub
- **ParquetPipeline** — ZSTD Parquet cold storage, 3× compression
- **SpaceTimeCube** — 0.01° geo × 5s time voxels, 0.14ms radius+time queries
- **ClusterSwarmEngine** — geo-bucket cluster detection → CoT swarm events
- **TakSwarmEmitter** — PyTAK CoT emitter back into the ATAK ecosystem
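To make the voxel idea concrete, here is a toy space-time index with the same 0.01° × 5 s bucketing; the class and method names are invented for this post, not SCYTHE's actual API:

```python
from collections import defaultdict

GEO_RES = 0.01   # degrees per voxel
TIME_RES = 5.0   # seconds per voxel

class SpaceTimeIndex:
    """Toy voxel index: buckets events by (lat, lon, time) cell."""
    def __init__(self):
        self._cells = defaultdict(list)

    def _key(self, lat, lon, ts):
        return (int(lat // GEO_RES), int(lon // GEO_RES), int(ts // TIME_RES))

    def insert(self, lat, lon, ts, event):
        self._cells[self._key(lat, lon, ts)].append(event)

    def query(self, lat, lon, ts, radius_cells=1, window_cells=1):
        """Return events within +/- radius_cells geo voxels and +/- window_cells time voxels."""
        ci, cj, ck = self._key(lat, lon, ts)
        out = []
        for di in range(-radius_cells, radius_cells + 1):
            for dj in range(-radius_cells, radius_cells + 1):
                for dk in range(-window_cells, window_cells + 1):
                    out.extend(self._cells.get((ci + di, cj + dj, ck + dk), ()))
        return out
```

A radius+time query never scans the full event table: it touches a constant number of voxel cells, which is where the sub-millisecond query times come from.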
This ran inside an **AVF (Android Virtualization Framework)** Debian VM — 8-core Tensor G2, 3.8GB RAM, 70GB disk — provisioned inside the Pixel itself via crosvm.

---

## Phase 3 — The Intelligence Layer: Gemma + EmbeddingGemma
Here’s where things got interesting.
SCYTHE doesn’t just store events. It *reasons about them*. We wired in two local Ollama models:
- **Gemma** (inference model) — given a graph node and a set of rule prompts, produce new inferred edges
- **EmbeddingGemma** — produce semantic embeddings for edge kind repair and entity clustering
### The Rule Prompt Architecture
Each inference cycle works like this:
1. Pick a graph node (typically a recently-active host or session)
2. Build a structured prompt: “Here are 12 known edge kinds. Here is what we know about this entity. What edges should exist?”
3. Gemma returns structured JSON — a list of `{ rule_id, should_fire, inferred_edges: […] }`
4. Validator checks each edge against the schema
5. Valid edges → committed to hypergraph via `WriteBus`
6. Invalid edges → shadow graph (more on this later)
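Stripped to its skeleton, the cycle looks roughly like this (the prompt wording, the `EDGE_KINDS` subset, and the function names are illustrative, not SCYTHE's actual code):

```python
import json

EDGE_KINDS = ["INFERRED_FLOW_IN_SERVICE", "INFERRED_LATERAL_MOVEMENT",
              "INFERRED_HOST_ROLE", "CO_MOVED_WITH"]  # subset for illustration

def build_rule_prompt(node_summary: str) -> str:
    """Assemble a structured prompt like the one described above."""
    return (
        "Known edge kinds: " + ", ".join(EDGE_KINDS) + "\n"
        "Entity context:\n" + node_summary + "\n"
        'Respond with JSON: [{"rule_id": ..., "should_fire": ..., "inferred_edges": [...]}]'
    )

def validate_items(raw: str) -> list[dict]:
    """Keep only well-formed items whose edges use known kinds."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    valid = []
    for item in items:
        if not isinstance(item, dict) or "rule_id" not in item:
            continue
        edges = [e for e in item.get("inferred_edges", [])
                 if isinstance(e, dict) and e.get("kind") in EDGE_KINDS]
        if edges:
            valid.append({"rule_id": item["rule_id"], "inferred_edges": edges})
    return valid
```

Everything that follows in this section is about what happens when the model's output does *not* fit this happy path.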
### The Silent Killers We Fixed
After running this under real load, we discovered four concurrent failure modes that were destroying pipeline throughput silently:
**1. Model Hallucination as Edge Kinds**
Gemma was emitting things like:
```
kind: "session_observed_SESSION-8abf8b865c15fce3"
```
The model was hallucinating *session IDs as edge schema types*. The validator was correctly rejecting these, but then wasting compute cycles trying to semantically repair them.
Fix: added a regex pattern match in `normalize_edge_kind` — anything matching `SESSION_OBSERVED_<hex>` gets a schema-policy drop (empty string return, logged at DEBUG, no repair attempt):
```python
if re.match(r'^SESSION_OBSERVED_[0-9A-F]{8,}', kind_upper):
    return ""  # schema-policy drop — model hallucinated a session ID as a kind
```

(Note the uppercase hex class: `kind_upper` has already been uppercased at this point, so `[0-9a-f]` would never match.)
**2. JSON-String-Wrapped Dicts**
Sometimes Gemma would return a valid JSON structure, but one of the items in the list would itself be a *JSON-encoded string* rather than a dict:
```python
# What we got:
["{\"rule_id\": \"R-0001\", \"inferred_edges\": […]}", {…}, …]
# What we expected:
[{"rule_id": "R-0001", …}, …]
```
Fix: coerce in `_validate_gemma_output_inner`:
```python
if isinstance(item, str):
    try:
        item = json.loads(item)
    except json.JSONDecodeError:
        continue  # drop malformed string
```
**3. Missing `rule_id` / `should_fire`**
Timeout-corrupted responses would sometimes have `inferred_edges` but no `rule_id`. The validator discarded the entire item.
Fix: synthesize a recovery stub if edges are present:
```python
if "rule_id" not in item and "inferred_edges" in item:
    item["rule_id"] = "R-RECOVERED"
    item["should_fire"] = True
```
**4. The 15-Minute Gemma Blackout**
The worst one. With the GPU in a degraded state:
- Timeout: 300s per attempt
- Retries: 3
- **Worst case: 900 seconds (15 minutes) of blocked inference per node**
The entire pipeline would stall. No logs, no errors — just silence.
**Fix: Circuit Breaker**
```python
import time

class GemmaCircuitBreaker:
    """Opens after 3 consecutive failures. Cooldown: 60s. Half-open probe on recovery."""

    def __init__(self, threshold=3, cooldown=60):
        self.threshold = threshold
        self.cooldown = cooldown
        self._failures = 0
        self._opened_at = None
        self._state = "closed"  # closed | open | half_open

    def allow_request(self) -> bool:
        if self._state == "closed":
            return True
        if self._state == "open":
            if time.time() - self._opened_at > self.cooldown:
                self._state = "half_open"
                return True  # probe
            return False
        return True  # half_open: allow probe

    def record_success(self):
        self._failures = 0
        self._state = "closed"

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.threshold:
            self._state = "open"
            self._opened_at = time.time()
```
New worst case: with the per-attempt timeout dropped to 45s and retries capped at 2, a failing node costs 90 seconds before the circuit opens for 60s. The pipeline continues processing other nodes.
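In the pipeline loop, the breaker wraps each inference call along these lines (`call_gemma` here is a stand-in for the real Ollama call, and the wrapper itself is a sketch, not the production code):

```python
def infer_with_breaker(breaker, call_gemma, node):
    """Skip inference while the breaker is open; record the outcome otherwise."""
    if not breaker.allow_request():
        return None  # circuit open: move on to the next node
    try:
        result = call_gemma(node)
    except Exception:
        breaker.record_failure()
        return None
    breaker.record_success()
    return result
```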

---

## Phase 4 — The Shadow Graph: Pre-Reality Buffer
One of the most interesting architectural pieces we built is the **Shadow Graph**.
The problem: the validator was too strict. Edges with `unknown_src`, `unknown_dst`, or `invalid_kind` were simply dropped. But many of these were *actually valid* — the referenced nodes just hadn’t been materialized into the graph yet.
We were deleting real intelligence because the graph wasn’t ready to receive it.
**The Shadow Graph holds these rejected edges in a probabilistic pre-reality buffer:**
```python
from dataclasses import dataclass

@dataclass
class ShadowEdge:
    src: str
    dst: str
    kind: str
    confidence: float
    rejection_reason: str  # "unknown_src" | "unknown_dst" | "invalid_kind"
    context_node_id: str
    created_at: float
    ttl: float = 300.0  # 5 minutes
    attempt_count: int = 0
```
After every materialization pass (when new stub nodes are added to the graph), the shadow graph calls `re_evaluate(known_node_ids)` — any edge whose `src` AND `dst` are now in the graph gets **promoted**:
```
[shadow] PROMOTED session→host edge kind=flow_observed
         after 1 attempts, age=2.3s
```
When you see this in the logs, the system isn’t just processing data.
**It’s beginning to believe its own hypotheses.**
The shadow graph exposes `/api/shadow/summary` and `/api/shadow/edges` so the UI can visualize what the system *suspects but hasn’t confirmed yet*.
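A minimal sketch of what that promotion pass can look like, assuming a simple list-backed buffer (every name here except `re_evaluate` is invented for the example):

```python
import time

class ShadowBuffer:
    """Toy pre-reality buffer: holds rejected edges until both endpoints exist."""
    def __init__(self):
        self._edges = []

    def hold(self, src, dst, kind, ttl=300.0):
        self._edges.append({"src": src, "dst": dst, "kind": kind,
                            "created_at": time.time(), "ttl": ttl, "attempts": 0})

    def re_evaluate(self, known_node_ids):
        """Promote edges whose endpoints now exist; expire the stale ones."""
        now = time.time()
        promoted, remaining = [], []
        for e in self._edges:
            e["attempts"] += 1
            if e["src"] in known_node_ids and e["dst"] in known_node_ids:
                promoted.append(e)
            elif now - e["created_at"] < e["ttl"]:
                remaining.append(e)  # keep waiting for materialization
            # else: the edge's TTL expired and it is silently dropped
        self._edges = remaining
        return promoted
```

The TTL is what keeps this from becoming a landfill: a hypothesis that stays unconfirmable for five minutes is quietly forgotten.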

---

## Phase 5 — Semantic Edge Repair
Not every invalid edge kind is a hallucination. Sometimes the model produces a semantically correct edge in a non-canonical form.
- `FLOW_OBSERVED` → should be `INFERRED_FLOW_IN_SERVICE`
- `HOST_OBSERVED` → should be `INFERRED_HOST_ROLE`
- `OBSERVED` → should be `INFERRED_NODE_EXISTS`
We handle two layers of repair:
**Layer 1: Static Aliases** (`rule_prompt.py`)
```python
EDGE_KIND_ALIASES = {
    "FLOW_OBSERVED": "INFERRED_FLOW_IN_SERVICE",
    "FLOW_OBSERVED_PORT": "INFERRED_FLOW_IN_SERVICE",
    "HOST_OBSERVED": "INFERRED_HOST_ROLE",
    "HOST_OBSERVED_SERVICE": "INFERRED_HOST_OFFERS_SERVICE",
    "OBSERVED": "INFERRED_NODE_EXISTS",
    # … ~40 total aliases
}
```
**Layer 2: Semantic Repair** (`semantic_edge_repair.py`)
For anything that doesn’t hit a static alias, we compute cosine similarity against EmbeddingGemma embeddings of all known canonical kinds. If similarity ≥ 0.65 (tuned down from 0.82 after live log analysis showed valid repairs scoring around 0.65 were being rejected), the kind is repaired:
```
SemanticEdgeRepair: FLOW_OBSERVED_SERVICE → INFERRED_HOST_OFFERS_SERVICE
  cosine_similarity=0.71, threshold=0.65 ✓
```
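Stripped of the embedding calls, the repair decision is a nearest-neighbor lookup over precomputed vectors. This sketch assumes embeddings are already available as plain lists; the helper names are illustrative:

```python
import math

SIM_THRESHOLD = 0.65  # tuned down from 0.82, per the log analysis above

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def repair_kind(bad_vec, canonical_vecs):
    """Return the closest canonical kind, or None if nothing clears the threshold.

    canonical_vecs: {kind_name: embedding_vector}
    """
    best_kind, best_sim = None, SIM_THRESHOLD
    for kind, vec in canonical_vecs.items():
        sim = cosine(bad_vec, vec)
        if sim >= best_sim:
            best_kind, best_sim = kind, sim
    return best_kind
```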

---

## Phase 6 — Multi-Instance Orchestrator
A single SCYTHE instance processes one investigation context. We extended this into a full **multi-instance orchestrator**:
```
ScytheOrchestrator (port 5001)
├── scythe-a43e657c (port 45574) — Investigation A
│   └── HypergraphEngine (own sqlite, own Gemma queue)
├── scythe-b7f21e90 (port 51283) — Investigation B
│   └── HypergraphEngine (own sqlite, own Gemma queue)
└── …
```
*”Each instance: one sovereign hypergraph. No shared memory. No shared state. No accidental cross-contamination.”*
Each instance registers with the orchestrator, gets assigned a random high port, and maintains its own:
- SQLite hypergraph store
- Gemma inference queue
- Stream relay WebSocket connection
- PCAP/FTP ingest pipeline
The orchestrator provides `/api/scythe/instances`, `/api/scythe/instances/new`, and health endpoints. The UI shows all active instances and allows spawning new ones directly from the command panel.
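The registration handshake itself is small. This toy in-memory registry is illustrative only; the real orchestrator's internals are not shown here:

```python
import random

class InstanceRegistry:
    """Toy version of an orchestrator instance table with random high-port assignment."""
    def __init__(self, port_range=(40000, 60000)):
        self._instances = {}
        self._port_range = port_range

    def register(self, instance_id):
        """Assign a random unused high port and record the instance."""
        taken = {info["port"] for info in self._instances.values()}
        port = random.randint(*self._port_range)
        while port in taken:
            port = random.randint(*self._port_range)
        self._instances[instance_id] = {"port": port, "healthy": True}
        return port

    def list_instances(self):
        return [{"id": iid, **info} for iid, info in self._instances.items()]
```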

---

## Phase 7 — The Geolocation Fusion Engine
By checkpoint 19 we had enough RTT and traceroute data flowing to attempt real **timing-based geolocation**:
- **`/api/timing/probe`** — server-side ICMP ping RTT → distance estimate (62.5 km/ms, speed-of-light in fiber)
- **`/api/timing/traceroute`** — real traceroute with per-hop RTT + cumulative km
- **`/api/timing/tdoa`** — multi-observer TDoA (Time Difference of Arrival) trilateration via gradient descent least-squares
```python
def tdoa_trilateration(observers: list[dict]) -> dict:
    """
    Gradient descent least-squares TDoA fix.
    observers: [{"lat": …, "lon": …, "rtt_ms": …}, …]
    returns: {"lat": est_lat, "lon": est_lon, "confidence": 0-1}
    """
```
The IPv6 audit that preceded this revealed 302 entities with IPv6 addresses that weren’t being resolved correctly — their geolocation was defaulting to Null Island in the Atlantic. Fixed by extending the geolocation fusion pipeline to handle `::ffff:`-prefixed IPv4-mapped IPv6 addresses and dual-stack resolution.
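The IPv4-mapped case is easy to get wrong by hand, but Python's standard library handles it directly. A minimal sketch of the normalization step (the function name is ours, not SCYTHE's):

```python
import ipaddress

def normalize_ip(raw: str) -> str:
    """Collapse ::ffff:-prefixed IPv4-mapped IPv6 addresses to plain IPv4."""
    addr = ipaddress.ip_address(raw)
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        return str(addr.ipv4_mapped)  # e.g. ::ffff:10.0.0.44 -> 10.0.0.44
    return str(addr)
```

Run every ingested address through a step like this *before* the geolocation lookup and the mapped-IPv6 entities resolve like the IPv4 hosts they really are.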

---

## Phase 8 — MIMO Hop Classifier
One of the more technically specific modules: a **MIMO (Multiple-Input Multiple-Output) hop classifier** for analyzing RF signal paths through intermediate relay nodes.
Real RF environments have multi-hop paths. A signal doesn’t go directly from emitter to receiver — it bounces off terrain, refracts through layers, or routes through intentional relay infrastructure. The MIMO classifier:
1. Takes a set of observed signal characteristics (timing, frequency offset, phase coherence)
2. Classifies the likely hop count and relay topology
3. Produces `INFERRED_RF_RELAY_HOP` edges in the hypergraph
This feeds directly into the **geo-stability score** — if an entity’s IP is shifting but its RF fingerprint (latency profile + frequency drift + phase coherence) is stable, the *physical location is anchored* even if the identity layer is churning.
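As a rough illustration of that scoring idea (the weights and feature names below are invented for this example, not SCYTHE's actual formula):

```python
def geo_stability(ip_churn_rate, latency_var, freq_drift_var, phase_coherence):
    """Rough geo-stability heuristic: high when the RF fingerprint is steady,
    regardless of how fast the identity layer (IP) is churning.
    All inputs are normalized to [0, 1]; higher variance means less stable.
    """
    rf_stability = (
        0.4 * (1.0 - latency_var)
        + 0.3 * (1.0 - freq_drift_var)
        + 0.3 * phase_coherence
    )
    # IP churn barely matters if the physical-layer fingerprint holds steady
    return rf_stability * (1.0 - 0.2 * ip_churn_rate)
```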

---

## Phase 9 — The Threat Gravity Map 🌌
The most visually dramatic piece we’ve built so far.
The problem with graph visualization at scale: 1,000+ nodes is unreadable as a traditional force-directed graph. We needed a physics metaphor that *reveals structure* rather than obscuring it.
**The insight: treat threat level as mass.**
```python
from math import log

def compute_mass(node):
    return (
        0.30 * log(node.degree + 1) +          # connectivity
        0.25 * log(node.flow_count + 1) +      # traffic volume
        0.20 * node.persistence_score +        # temporal stability
        0.15 * node.anomaly_score +            # behavioral anomaly
        0.10 * node.shadow_promotion_count     # early-indicator weight
    )
```
High-mass (high-threat) nodes *pull nearby nodes toward them*. Clusters form naturally. Infrastructure that coordinates — botnets, C2 beacons, proxy chains — compresses into visible clumps without any rules.
**Frontend: Canvas-based force simulation**
No D3, no WebGL — pure canvas 2D with a custom Verlet integration loop:
```javascript
function applyThreatGravity(alpha) {
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const a = nodes[i], b = nodes[j];
      if (a.mass < 1 || b.mass < 1) continue;
      const dx = b.x - a.x, dy = b.y - a.y;
      const dist = Math.sqrt(dx*dx + dy*dy) + 0.1;
      const force = (a.mass * b.mass) / (dist * dist) * alpha * 0.3;
      a.vx += force * dx / dist;
      a.vy += force * dy / dist;
      b.vx -= force * dx / dist;
      b.vy -= force * dy / dist;
    }
  }
}
```
**Visual encoding:**
| Property | Visual |
|----------|--------|
| Node size | Mass (threat weight) |
| Glow intensity | Anomaly score |
| Pulse rate | Temporal activity / recency |
| Purple ring | Shadow promotions pending |
| Color: blue | Low threat (benign) |
| Color: yellow | Uncertain (in shadow graph) |
| Color: red | High-confidence threat |
The panel auto-refreshes every 10 seconds, reheats the simulation on new data, and supports drag, pan (mouse drag on background), and zoom (scroll wheel). Hover tooltip shows ID, kind, mass, anomaly score, and pending shadow promotions.
### The Circular Reference Crash
Before the gravity map was live, we hit a fun one:
```
ValueError: Circular reference detected
  File "rf_scythe_api_server.py", line 10299, in pcap_session_subgraph
    return jsonify({…})
```
The hypergraph’s `HGNode` and `HGEdge` objects had back-references to each other — `node.edges → edge.nodes → node…`. Python’s `json.dumps` detected the cycle and threw.
The fix: replace `_as_dict` with a cycle-safe serializer:
```python
def _safe_serial(obj, _seen=None):
    if _seen is None:
        _seen = set()
    oid = id(obj)
    if oid in _seen:
        return "__circular_ref__"
    _seen.add(oid)
    try:
        if isinstance(obj, dict):
            return {k: _safe_serial(v, _seen) for k, v in obj.items()}
        elif isinstance(obj, list):
            return [_safe_serial(v, _seen) for v in obj]
        elif hasattr(obj, '__dict__'):
            return _safe_serial(vars(obj), _seen)
        else:
            return obj
    finally:
        _seen.discard(oid)  # allow same object in different branches
```
The key insight: `_seen.discard(oid)` in `finally` — the same object can appear in *different branches* of the tree legitimately. Only true cycles (where we’d loop forever) get the `__circular_ref__` marker. The UI receives valid JSON with a string sentinel at cycle points.

---

## Where We Are Now
After 23 checkpoints, here’s the current stack:
```
Pixel 7 Pro
└── ScytheCommandApp (WebView shell)
    └── http://192.168.1.185:5001/command-ops-visualization.html
        ├── Cesium 3D globe (live entity markers)
        ├── RF Hypergraph visualization
        ├── 🌌 Threat Gravity Map (force simulation)
        ├── Recon Panel (1,185+ entities, IPv6-aware)
        ├── MIMO Hop Classifier panel
        ├── GraphOps LLM chat (Gemma-backed)
        └── COMMAND CONSOLE, OPERATORS, and more

NerfEngine Server (192.168.1.185)
├── Scythe Orchestrator (port 5001)
│   └── Multi-instance manager (dynamic port assignment)
├── rf_scythe_api_server.py (per-instance, dynamic port)
│   ├── HypergraphEngine (SQLite-WAL, incremental)
│   ├── Gemma inference pipeline (circuit breaker, shadow graph)
│   ├── EmbeddingGemma semantic repair
│   ├── FTP/PCAP ingest pipeline
│   ├── Timing geolocation (probe/traceroute/TDoA)
│   ├── /api/gravity/* (Threat Gravity Map backend)
│   └── /api/shadow/* (Shadow Graph inspection)
├── stream_relay.py (WebSocket relay, ws://…:8765/ws)
├── shadow_graph.py (pre-reality edge buffer, 5-min TTL)
└── tak_ml_gemma_runner.py (Gemma + circuit breaker)
```

---

## What’s Coming
The system is starting to show signs of *autonomous behavior*. The shadow graph is promoting edges. The gravity map is revealing clusters we didn’t know existed. The circuit breaker is keeping the pipeline alive under GPU degradation.
**The next phase is temporal identity threads** — stable entity identities that persist across IP churn, session turnover, and proxy cycling. If a threat actor rotates their infrastructure, their behavioral fingerprint remains. The system will know it’s the same actor even before any rule fires.
After that: **edge confidence pressure** — edges won’t be binary valid/invalid. They’ll have a confidence score that decays over time, builds with corroborating evidence, and promotes automatically when enough sensors agree.
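A first approximation of that decay-and-corroborate dynamic might look like this (entirely speculative; none of this exists yet, and every name and constant is invented):

```python
import math

def edge_confidence(base, age_s, corroborations, half_life_s=3600.0, boost=0.15):
    """Confidence decays exponentially with age and climbs with each
    corroborating observation, capped at 1.0. Speculative sketch only.
    """
    decayed = base * math.exp(-math.log(2) * age_s / half_life_s)
    return min(1.0, decayed + boost * corroborations)
```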
The end state isn’t a tool.
It’s a living graph that notices things before humans know what to look for.

---

## Acknowledgements
Built with:
- [Android Tactical Assault Kit (ATAK CIV)](https://github.com/deptofdefense/AndroidTacticalAssaultKit-CIV) — DoD open-source situational awareness platform
- [Ollama](https://ollama.com) — local LLM inference (Gemma 3, EmbeddingGemma)
- [Cesium](https://cesium.com) — 3D geospatial rendering
- [DuckDB](https://duckdb.org) — in-process analytical SQL
- [Flask](https://flask.palletsprojects.com) + [Socket.IO](https://socket.io) — API server and real-time streams
- Python 3.12, Android API 36, Pixel 7 Pro (Tensor G2)

---

*SCYTHE is a research and development project. All capabilities described operate on self-hosted infrastructure with locally-run models. No data leaves the local network.*

---

**Tags:** `#cybersecurity` `#OSINT` `#SIGINT` `#hypergraph` `#threatintelligence` `#android` `#ATAK` `#machinelearning` `#Gemma` `#RF` `#realtime` `#Python` `#infosec` `#ClaudeAI` `#ChatGPT` `#GeminiAI` `#GrokAI`