Product thesis

Why Conversational Factory exists. The egress problem, the one-way architectural commitment, and the user outcomes that follow from it.

Problem

A working plant is already full of data — pools of it: Modbus registers, controller context, alarm and batch logs, much of it never surfaced. The ingest data plane collects what matters into a historian inside the trusted network. The hard part is not having the data — it is getting questions answered against it without putting the plant at risk.

The people and the AI that need answers are outside the trusted boundary — by design, and for good reason.
Every outbound path is an inbound risk. A query API, a database port, a VPN — anything that can answer a question from outside is something an attacker can reach the plant through.
Copying data to the cloud or to IT creates a new attack surface that still has a route home. A breach of the copy becomes a breach of the plant if any return path exists.
So the answer goes through a sysadmin in another building (slow, manual, gated), because every convenient path has always been a two-way path.

The result is a factory whose data is reachable only through people, because every machine path that could reach it could also be reached through.

Thesis

Conversational Factory is the open, local product for secure one-way egress of industrial data — the sovereign, source-available deployment you run on-site. The defensible value is not the collection of data and not the AI on top of it — it is the architecture (Industrial Independence) that lets data leave without letting anything back in. The concept is the moat; the product is one coherent, auditable way to run it. Coming soon.

The commitment is a single structural property: the only thing that ever crosses the boundary is a copy, travelling outward over a transport that has no return path. Everything else follows from that.

In practice:

The authoritative historian stays inside the trusted network and never depends on anything outside it.
A one-way sync mirrors it outward — datagrams out, no acknowledgement, no return socket — optionally enforced by a hardware data diode so the one-wayness is physical, not configured.
The outside copy is expendable: standardized, and worth nothing to an attacker beyond the historical data it holds.
Questions are asked against the copy, never against the plant.
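
The one-way sync in the list above can be sketched as a sender that only ever transmits. This is a minimal illustration, not the product's wire format: the frame layout (sequence number, length, CRC32), the names encode_frame, decode_frame and send_one_way, and the destination address are all assumptions made for the example.

```python
import socket
import struct
import zlib

# Hypothetical frame layout (an assumption, not the product's wire format):
# 8-byte sequence number, 4-byte payload length, 4-byte CRC32, then the payload.
HEADER = struct.Struct("!QII")


def encode_frame(seq: int, payload: bytes) -> bytes:
    """Wrap a payload so the receiver can verify it without ever replying."""
    return HEADER.pack(seq, len(payload), zlib.crc32(payload)) + payload


def decode_frame(frame: bytes) -> tuple[int, bytes]:
    """Return (seq, payload); raise ValueError if the frame is damaged."""
    if len(frame) < HEADER.size:
        raise ValueError("frame shorter than header")
    seq, length, crc = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:]
    if len(payload) != length or zlib.crc32(payload) != crc:
        raise ValueError("length or checksum mismatch")
    return seq, payload


def send_one_way(records: list[bytes], dest: tuple[str, int], start_seq: int = 0) -> int:
    """Send each record as one UDP datagram. The socket is never read from."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        seq = start_seq
        for record in records:
            sock.sendto(encode_frame(seq, record), dest)
            seq += 1
        return seq  # next sequence number to use
    finally:
        sock.close()
```

The sender has no receive call anywhere, so loss is tolerated rather than repaired over the link. A hardware diode would enforce the same property below the software.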

Product definition

Conversational Factory is composed of roles — what must be filled for secure read-only access, not a particular codebase. Reference software exists for some of them, but it is illustrative; the roles and constraints are canonical.

Ingest data plane — taps the plant’s existing pools of data (Modbus registers, controller context, logs) into the standardized record, read-only with no process side effects. Passive (observe only — example: the witness) or active (poll/query/discover, read-only — example: discovery); a factory may run both. The specific feed is site-dependent; the role is not.
Inside historian — the authoritative system of record on the trusted network, in a standardized schema. Self-sufficient; the product begins where the data needs to leave.
One-way sync — a one-way transport (custom UDP, optionally a hardware data diode) that mirrors the inside historian outward with no return path. This is the moat.
Outside historian — a standardized, expendable copy on the outside. Owning it completely yields historical data and no route to anything. Optionally forwarded to a cloud or off-site server over MQTT in realtime.
Query plane — the source-agnostic component that turns natural-language questions into bounded, read-only, audited queries and composes grounded answers. Source placement is its own concern; the copy is just one source. Example: modelpond. Inference is model-agnostic — small models on-prem at the edge, or any frontier model when the site allows it.
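
To make "bounded, read-only, audited" concrete, here is a minimal sketch of the query plane's last step, assuming the outside copy happens to be a SQLite database. The function run_bounded_query, the MAX_ROWS cap and the audit record fields are hypothetical; translating a natural-language question into SQL is out of scope for the sketch.

```python
import sqlite3
import time

MAX_ROWS = 1000  # hypothetical hard cap on rows returned per query


def run_bounded_query(conn, sql, params=(), max_rows=MAX_ROWS, audit_log=None):
    """Run one SELECT against the copy, capped in size and recorded."""
    statement = sql.strip().rstrip(";").strip()
    # Crude guard (an assumption of this sketch): a single SELECT only.
    # It also rejects semicolons inside string literals, which is acceptable here.
    if not statement.lower().startswith("select") or ";" in statement:
        raise PermissionError("only a single SELECT statement is allowed")

    # SQLite then refuses any write attempted on this connection.
    conn.execute("PRAGMA query_only = ON")

    cursor = conn.execute(statement, params)
    rows = cursor.fetchmany(max_rows)            # bounded result size
    truncated = cursor.fetchone() is not None    # more rows existed than the cap

    if audit_log is not None:
        audit_log.append({
            "ts": time.time(),
            "sql": statement,
            "params": list(params),
            "rows": len(rows),
            "truncated": truncated,
        })
    return rows, truncated
```

The guard and the pragma are deliberately redundant: the first rejects obvious writes before they reach the database, the second makes the connection itself refuse them.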

What makes it different

This is not another historian, dashboard, or AI wrapper.

A historian replacement competes on storage and query. An AI wrapper competes on the model. Conversational Factory competes on the boundary: the guarantee that no question — and no compromise of the thing answering it — can travel back to the plant. The moat is the secure design itself, and it lives in the transport and the topology, not in policy, configuration, or a firewall rule someone can fat-finger.

It also differs from a flat-network Unified Namespace (UNS) by treating segmentation as a first-class architectural constraint. It delivers the interoperability people want from a UNS without assuming a single shared network plane and without a bidirectional broker at the boundary.

Architectural principles

The dangerous direction doesn’t exist

The seam is a one-way transport: datagrams out, no acknowledgement, no return socket. There is nothing to harden against an inbound exploit because there is no inbound. Optionally a hardware diode makes that physical.
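
One consequence of having no acknowledgement is that the receiving side cannot ask for a retransmission; it can only notice what never arrived. The sketch below assumes each datagram carries a sequence number that increases by one, which the text does not specify. GapTracker is a hypothetical name.

```python
class GapTracker:
    """Track which sequence numbers never arrived on a one-way link."""

    def __init__(self) -> None:
        self.next_expected = 0
        self.missing: set[int] = set()

    def observe(self, seq: int) -> None:
        if seq >= self.next_expected:
            # Every number skipped between the expected and the received one is a gap.
            self.missing.update(range(self.next_expected, seq))
            self.next_expected = seq + 1
        else:
            # A late or duplicated datagram closes its gap if one was open.
            self.missing.discard(seq)
```

The tracker only records gaps. Whatever the outside does about them (flag the interval, wait for a later resend the inside chose to make) happens without sending anything back.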

Built to be lost

The outside copy is meant to be lost. Compromise it completely and the blast radius stops at a copy of historical data — no route, no reach, no plant.

The plant never depends on the outside

The inside historian is authoritative and self-sufficient. Egress is additive. Cut the link entirely and the plant is unaffected; the only thing lost is the outside view.
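
"Egress is additive" can be read as an ordering rule: commit locally first, then attempt egress, and never let an egress failure reach the caller. A minimal sketch under that reading follows; the class and attribute names are hypothetical, and an in-memory list stands in for real storage.

```python
class InsideHistorian:
    """Authoritative store; egress is an optional, best-effort side channel."""

    def __init__(self, egress=None):
        self.records = []
        self.egress = egress        # optional callable; None means the link is cut
        self.egress_failures = 0

    def write(self, record) -> None:
        self.records.append(record)  # the authoritative commit happens first
        if self.egress is None:
            return
        try:
            self.egress(record)
        except Exception:
            # Counted for observability, never propagated to the plant side.
            self.egress_failures += 1
```

With the egress callable removed or failing, write behaves identically from the plant's point of view; only the outside view goes stale.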

Standardized, not bespoke

The copy speaks a standard schema and read API. Swap clients, models, dashboards, or clouds without touching the boundary. No lock-in on either side of the seam.
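
As an illustration of what a standard record could look like, the sketch below defines one record type and a deterministic JSON encoding. The field set (tag, ts, value, unit, source) is invented for the example and is not the product's schema; the point is only that any client able to parse the agreed shape can be swapped in.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical standardized record; the fields are illustrative only."""
    tag: str      # e.g. a point name
    ts: float     # seconds since the Unix epoch
    value: float
    unit: str
    source: str   # which ingest feed produced the record


def to_wire(record: Record) -> str:
    """Serialize with sorted keys so equal records yield identical text."""
    return json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))


def from_wire(text: str) -> Record:
    """Parse one record; unknown or missing fields raise TypeError."""
    return Record(**json.loads(text))
```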

Model-agnostic inference

Run specialized small models on-prem at the edge when the site is air-gapped, or point any frontier or preferred model at the larger external dataset when it isn’t. Same data, same surface — only the model placement changes.

Optional reach, not required reach

Forwarding the copy to a cloud or off-site server over MQTT is opt-in. The system is fully useful with zero outbound connectivity. Sovereignty is the default; reach is a choice.

Sovereign per zone

Every zone appliance stands alone and complete for its scope. The inside / one-way / outside pattern is fractal across the plant hierarchy — cell, line, area, site.
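
The text does not say how zones feed one another. One way to picture the fractal pattern, assuming each zone's outside copy is a source for the zone that encloses it, is a tree whose edges all point outward. Zone and egress_edges are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    level: str  # "cell", "line", "area" or "site"
    children: list["Zone"] = field(default_factory=list)

    def egress_edges(self):
        """Yield (from_zone, to_zone) pairs; every edge points outward, child to parent."""
        for child in self.children:
            yield (child.name, self.name)
            yield from child.egress_edges()
```

Under this assumption no edge ever points from a parent to a child, so compromising an outer zone yields copies and no route inward.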

User outcomes

Operators can ask direct questions about their equipment while no path back into the plant exists for anyone, including them.
Security teams get an egress they can reason about: one direction, no socket home, a copy that is safe to lose.
Engineers get a queryable operational record reachable from where they actually work — IT, a workstation, the cloud — without flattening the network.
Enterprise consumers get structured OT context through a standard surface without owning the plant-floor integration problem or assuming connectivity.

Non-goals for the first iteration

closed-loop control or any write path to a plant device
a return channel of any kind across the seam
perfect ontology coverage across all industrial vendors
cloud-first assumptions
dependence on a single LLM provider