# How It Works
The features section describes what Persona does. This section explains how it does the parts that aren't obvious.
You don't need to read any of this to use the app. It's here for people who want to understand the mechanism — what's actually running on your machine, why memory holds up over months, why a character's face doesn't drift between photos.
## What's covered
- Local-First & Privacy — what data lives where, what encryption is in place, and what "local-first" actually rules out.
- The Memory System — how raw chat becomes searchable long-term memory, and why the two-tier design matters.
- The Media Pipeline — face consistency, the willingness check, and how a photo gets from "I'd love to see where you are" to a generated image.
- The Scene System — persistent scene vs live scene state, and why image and video generation reads from the live layer.
## A short mental model
A useful sketch of Persona, from top to bottom:
- A chat layer where you interact with characters.
- A scene layer that tracks where they are, what they're wearing, and what's happening right now.
- A memory layer that turns raw conversation into searchable long-term knowledge per character.
- A media pipeline that takes the current scene plus a character's reference shots and produces photos, voice, or video.
- A storage layer that holds all of it, encrypted, on your machine.
- A provider layer where actual inference (LLM, image, video, voice) happens via APIs you've configured.
Every section in this part of the docs zooms in on one of these layers.
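To make the layer stack concrete, here is a minimal sketch of how the pieces fit together, written as TypeScript types. Every name in it (`SceneState`, `buildMediaRequest`, the field names) is hypothetical and chosen for illustration — this is not Persona's actual API, just the shape of the idea: the media pipeline reads the live scene plus a character's reference shots and turns them into a request for a configured provider.

```typescript
// Hypothetical sketch of the layer stack — not Persona's real API.

// Scene layer: what's true right now.
interface SceneState {
  location: string;
  outfit: string;
  activity: string;
}

// Media pipeline input: the live scene plus reference shots,
// which is what keeps a character's face consistent across photos.
interface MediaRequest {
  scene: SceneState;
  referenceShots: string[];
  kind: "photo" | "voice" | "video";
}

// The pipeline combines the live scene with the character's
// reference shots before handing off to a provider you've configured.
function buildMediaRequest(
  scene: SceneState,
  referenceShots: string[],
  kind: MediaRequest["kind"],
): MediaRequest {
  return { scene, referenceShots, kind };
}

const req = buildMediaRequest(
  { location: "a cafe", outfit: "red coat", activity: "reading" },
  ["ref-front.png", "ref-profile.png"],
  "photo",
);
console.log(req.kind, req.referenceShots.length); // photo 2
```

The point of the sketch is the data flow: generation never starts from a blank prompt — it starts from the live scene state and the same fixed set of reference shots every time, which is why output stays consistent from one photo to the next.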