# Graph RAG Status

Graph RAG is now wired into the deployed Worker as a compact evidence graph.

Current implementation:

- `POST /api/analyze` loads `./data/permit_graph_index.json` from the Pages origin.
- The Worker retrieves matching ECHO, RBLC, TCEQ, and emissions evidence nodes using project text, pollutant cues, program cues, and state cues.
- Retrieved evidence is passed to DeepSeek JSON mode together with the deterministic rule baseline.
- The API response includes `engine`, `retrieval`, and `citations`.
- If DeepSeek fails, the API still returns a graph-grounded deterministic fallback with `engine: "graph_rag_rules"`.
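
The cue-based retrieval step above might look roughly like the following. This is a minimal sketch, not the Worker's actual code: the `EvidenceNode` fields, the `retrieveNodes` name, and the scoring weights are assumptions for illustration, and the real index may be structured differently.

```typescript
// Hypothetical node shape; the real fields in permit_graph_index.json may differ.
interface EvidenceNode {
  id: string;
  type: string;
  pollutants: string[];
  programs: string[];
  states: string[];
}

// Lowercase, collapse punctuation to spaces, and pad so cues match whole words only.
function normalizeText(text: string): string {
  return ` ${text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim()} `;
}

// Count the cues that appear as whole words or phrases in the normalized project text.
function countCueHits(normalizedProject: string, cues: string[]): number {
  return cues.filter((cue) => normalizedProject.includes(normalizeText(cue))).length;
}

// Score each node by cue overlap and return the best matches (weights are assumed).
function retrieveNodes(projectText: string, nodes: EvidenceNode[], limit = 5): EvidenceNode[] {
  const project = normalizeText(projectText);
  return nodes
    .map((node) => ({
      node,
      score:
        2 * countCueHits(project, node.pollutants) +
        2 * countCueHits(project, node.programs) +
        countCueHits(project, node.states),
    }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.node);
}
```

Whole-word matching is used so that a short cue such as `CO` does not match inside a word like "compressor". Nodes with no cue overlap are dropped rather than returned with a zero score.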

This is not yet a full-scale live Vectorize/D1 knowledge graph. It is a deployed, auditable static evidence graph built from the locally processed ECHO/TCEQ samples plus RBLC control-search guidance.

## Evidence Graph

Public graph file:

- `./data/permit_graph_index.json`

Node types:

- `ECHO_AGGREGATE`: processed ECHO bulk-data summaries
- `ECHO_PIPELINE`: CAA pipeline enforcement-pattern summaries
- `ECHO_ANALOG_CASE`: analog cases drawn from locally processed ECHO outputs
- `STATE_EVENT_SAMPLE`: TCEQ air-emission-event sample evidence
- `RBLC_CONTROL_SEARCH`: RBLC control-technology search plans for permit review
- `EMISSIONS_AGGREGATE`: processed pollutant/emissions crosswalk summaries
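
For code that consumes the graph, the node types listed above can be captured as a string-literal union with a type guard. This is a sketch only: the `NODE_TYPES`, `isNodeType`, and `groupIdsByType` names are hypothetical, as is the assumption that each node record carries `id` and `type` fields.

```typescript
// The six node types listed above, as a readonly tuple.
const NODE_TYPES = [
  "ECHO_AGGREGATE",
  "ECHO_PIPELINE",
  "ECHO_ANALOG_CASE",
  "STATE_EVENT_SAMPLE",
  "RBLC_CONTROL_SEARCH",
  "EMISSIONS_AGGREGATE",
] as const;

type NodeType = (typeof NODE_TYPES)[number];

// Type guard: narrows an arbitrary string to a known node type.
function isNodeType(value: string): value is NodeType {
  return (NODE_TYPES as readonly string[]).includes(value);
}

// Group node ids by type, skipping records whose type is not recognized.
function groupIdsByType(nodes: { id: string; type: string }[]): Map<NodeType, string[]> {
  const groups = new Map<NodeType, string[]>();
  for (const node of nodes) {
    const type = node.type;
    if (!isNodeType(type)) continue;
    const ids = groups.get(type) ?? [];
    ids.push(node.id);
    groups.set(type, ids);
  }
  return groups;
}
```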

Response shape:

```json
{
  "engine": "deepseek_graph_rag",
  "retrieval": {
    "mode": "static_graph_rag",
    "sources": [],
    "nodes": []
  },
  "citations": ["echo_pipeline_voc_titlev"]
}
```
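
One way to assemble these fields, including the switch to the deterministic fallback, is sketched below. The element types of `sources` and `nodes`, the choice to cite every retrieved node, the use of the same `mode` value on the fallback path, and the `buildEnvelope` and `analyzeWithFallback` names are all assumptions. Only the field names and the two `engine` values come from this document.

```typescript
// Hypothetical element shape for retrieval.nodes.
interface RetrievedNode {
  id: string;
  type: string;
}

interface AnalyzeEnvelope {
  engine: "deepseek_graph_rag" | "graph_rag_rules";
  retrieval: { mode: "static_graph_rag"; sources: string[]; nodes: RetrievedNode[] };
  citations: string[];
}

// Build the response envelope; the engine label records which path produced the answer.
function buildEnvelope(nodes: RetrievedNode[], modelSucceeded: boolean): AnalyzeEnvelope {
  return {
    engine: modelSucceeded ? "deepseek_graph_rag" : "graph_rag_rules",
    retrieval: {
      mode: "static_graph_rag",
      // Assumed: sources are the distinct node types that were retrieved.
      sources: Array.from(new Set(nodes.map((n) => n.type))),
      nodes,
    },
    // Assumed: every retrieved node id is cited.
    citations: nodes.map((n) => n.id),
  };
}

// Try the model call; on any failure, fall back to the rules-only envelope.
async function analyzeWithFallback(
  nodes: RetrievedNode[],
  callModel: () => Promise<unknown>,
): Promise<AnalyzeEnvelope> {
  try {
    await callModel();
    return buildEnvelope(nodes, true);
  } catch {
    return buildEnvelope(nodes, false);
  }
}
```

Because the retrieval result is computed before the model call, the fallback response still carries the same evidence nodes and citations.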

## Local LLM Status

The deployed page does not call the user's local LLM or localhost.

The deployed page calls DeepSeek server-side through the Cloudflare Worker. Secrets and prompts stay outside browser JavaScript.

To use a local model later, add an explicit local backend connector, for example:

- Ollama or LM Studio running on the user's machine.
- A local API proxy that accepts project text and returns normalized scenario JSON.
- CORS configured deliberately for the web origin.
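
As an illustration of such a connector, the sketch below builds a request for Ollama's `/api/generate` endpoint and parses the reply into scenario JSON. The default port `11434`, the model name `llama3`, the prompt wording, and the `buildLocalRequest` and `parseLocalReply` helpers are assumptions, and the scenario schema is left open because this document does not define it.

```typescript
// Assumed default: Ollama's standard local address.
const OLLAMA_URL = "http://localhost:11434/api/generate";

interface LocalRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a non-streaming, JSON-constrained generate request for a local Ollama server.
function buildLocalRequest(projectText: string, model = "llama3"): LocalRequest {
  return {
    url: OLLAMA_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        prompt: `Return scenario JSON for this project:\n${projectText}`,
        stream: false,
        format: "json",
      }),
    },
  };
}

// Ollama returns the generated text in a `response` string; parse it as a JSON object.
function parseLocalReply(reply: unknown): Record<string, unknown> {
  if (typeof reply !== "object" || reply === null) {
    throw new Error("Unexpected reply shape");
  }
  const text = (reply as { response?: unknown }).response;
  if (typeof text !== "string") {
    throw new Error("Missing `response` string");
  }
  const parsed: unknown = JSON.parse(text);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Scenario must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}

// Usage (requires a running local server):
//   const { url, init } = buildLocalRequest(projectText);
//   const scenario = parseLocalReply(await (await fetch(url, init)).json());
```

Keeping request construction and reply parsing separate from the network call makes the connector testable without a local model running.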

Do not put model keys or private prompts directly in browser JavaScript.

## Next Scale-Up

A full-scale Graph RAG implementation should move the compact JSON graph into durable backend storage.

Recommended backend shape:

- Vectorize: semantic retrieval over permit evidence, control technology, and event narratives.
- D1: graph records for facility, process, pollutant, program, control, event, and permit-condition nodes.
- R2: source files and generated evidence snapshots.
- Worker API: normalized retrieval, citations, model prompting, and deterministic fallback.
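
To make the split between semantic retrieval and graph records concrete, the sketch below expands a set of seed node ids (for example, ids returned by a semantic search) by one hop over edge rows. The row shapes, column names, and the `neighborsOf` helper are hypothetical. No schema has been chosen yet, and the sketch works on plain arrays rather than real Vectorize or D1 bindings.

```typescript
// Hypothetical D1 row shapes; table and column names are assumptions.
type NodeKind =
  | "facility"
  | "process"
  | "pollutant"
  | "program"
  | "control"
  | "event"
  | "permit_condition";

interface GraphNodeRow {
  id: string;
  kind: NodeKind;
  label: string;
}

interface GraphEdgeRow {
  source_id: string;
  target_id: string;
  relation: string;
}

// Return the seed nodes plus every node one edge away, in either direction.
function neighborsOf(
  seedIds: string[],
  nodes: GraphNodeRow[],
  edges: GraphEdgeRow[],
): GraphNodeRow[] {
  const seeds = new Set(seedIds);
  const wanted = new Set(seedIds);
  for (const edge of edges) {
    if (seeds.has(edge.source_id)) wanted.add(edge.target_id);
    if (seeds.has(edge.target_id)) wanted.add(edge.source_id);
  }
  // Preserve the order of the node table so results are deterministic.
  return nodes.filter((node) => wanted.has(node.id));
}
```

In a real deployment the same expansion would more likely be a SQL join in D1, with the in-memory version serving as a reference for expected results.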

The current compact graph keeps the public demo useful without exposing secrets in the browser or scraping heavy government databases on every request.
