| 39006 | 1442 | 34 | 2026-05-14T06:32:42.914332+00:00 | /Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740362914_m2.jpg... | Firefox | Screenpipe — Archive — Personal | 1 | app.screenpipe.lakylak.xyz | monitor_2 | NULL | NULL | NULL | NULL |
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
- System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
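The "breaks the continuous audio into manageable chunks" step can be sketched in a few lines. The chunk size and the sample values below are made up for illustration, not ScreenPipe's real parameters:

```python
# Toy illustration of slicing a continuous stream into fixed-size chunks,
# the way a 24/7 recorder hands work to the transcription stage. The chunk
# size and "samples" here are made up, not ScreenPipe's real parameters.
def chunk(samples: list, size: int) -> list:
    return [samples[i:i + size] for i in range(0, len(samples), size)]

stream = list(range(10))   # stand-in for a continuous audio stream
print(chunk(stream, 4))    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```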
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
- The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
- Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
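The FTS5 mechanism can be demonstrated with Python's built-in sqlite3 module, assuming your SQLite build ships the FTS5 extension (most modern builds do). The table and column names below are hypothetical stand-ins, not ScreenPipe's actual schema:

```python
import sqlite3

# Minimal sketch of FTS5-backed transcript search. The table and column
# names are hypothetical stand-ins, not ScreenPipe's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(transcription, file_path, ts)"
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    ("let's move the deadline to Friday",
     "audio_2026-05-12.mp4", "2026-05-12T10:00:00"),
)
# MATCH runs a full-text search over the indexed columns.
hits = conn.execute(
    "SELECT file_path FROM transcripts WHERE transcripts MATCH 'deadline'"
).fetchall()
print(hits)  # [('audio_2026-05-12.mp4',)]
```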
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
- Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
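The buffer, queue, and finalize stages can be sketched as a toy producer/consumer. Real ScreenPipe internals differ; this only illustrates the shape of the flow:

```python
import queue
import threading

# Toy sketch of the buffer -> queue -> finalize flow described above.
# Real ScreenPipe internals differ; this only illustrates the shape.
chunks = queue.Queue()   # raw audio chunks waiting for the engine
database = []            # stands in for the SQLite commit

def transcriber():
    while True:
        item = chunks.get()
        if item is None:                      # sentinel: recording stopped
            break
        # pretend Whisper turned the chunk into text here
        database.append(f"transcript of {item}")

worker = threading.Thread(target=transcriber)
worker.start()
for name in ("chunk-001", "chunk-002"):       # recorder enqueues chunks
    chunks.put(name)
chunks.put(None)
worker.join()                                 # backlog drained
print(database)
```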
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
- The SQLite Database (…): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
- The … or … folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- Temp Files: If you see rapidly changing files, temporary … chunks, or locked database journals (like …), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
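One quick way to act on that hint is to look for SQLite's -wal/-shm journal sidecars next to the database. The db.sqlite file name below is an assumption; adjust it to whatever your install actually uses:

```python
from pathlib import Path

def journal_status(data_dir: Path) -> dict:
    """Report whether SQLite journal sidecars suggest in-flight writes.

    Looks for the -wal/-shm files SQLite keeps next to the database while
    it is being written. The db.sqlite name is an assumption; adjust it
    to match your install.
    """
    return {
        name: (data_dir / name).exists()
        for name in ("db.sqlite-wal", "db.sqlite-shm")
    }

# Example against the default install location (path is illustrative):
print(journal_status(Path.home() / ".screenpipe"))
```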
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
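A hedged sketch of that seek logic: if a recording's start time is encoded in its filename (as in the timestamped names above) and the database stores when a phrase was spoken, the playback offset is just the difference between the two:

```python
from datetime import datetime

# Hypothetical seek calculation: the recording's start time comes from the
# filename (e.g. "System Audio (output)_2026-05-11_06-17-14.mp4") and the
# phrase's timestamp from the database row; the offset is the difference.
def seek_offset(file_start: str, spoken_at: str) -> float:
    fmt = "%Y-%m-%d %H:%M:%S"
    return (datetime.strptime(spoken_at, fmt)
            - datetime.strptime(file_start, fmt)).total_seconds()

print(seek_offset("2026-05-11 06:17:14", "2026-05-11 06:19:44"))  # 150.0
```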
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
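Building on that soft-foreign-key idea, here is a hypothetical integrity check that lists referenced media files missing from disk. The audio_transcriptions table and file_path column are assumed names, not the confirmed schema:

```python
import sqlite3
from pathlib import Path

# Hypothetical integrity check built on those soft foreign keys: list every
# media path the database references that no longer exists on disk. The
# audio_transcriptions table and file_path column are assumed names.
def missing_media(db_path: str, data_dir: Path) -> list:
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT DISTINCT file_path FROM audio_transcriptions"
    ).fetchall()
    conn.close()
    return [p for (p,) in rows if not (data_dir / p).exists()]
```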
What happens if you delete them?
If you manually...
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-1771164907786598729
|
8636063624999889877
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
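The FTS5 search step can be sketched with Python's built-in sqlite3 module, assuming FTS5 is compiled in (it is in most modern builds). The table name and columns here are illustrative, not ScreenPipe's confirmed schema:

```python
import sqlite3

# In-memory stand-in for ScreenPipe's database. The real schema may differ;
# the table name "audio_transcriptions_fts" is illustrative, not confirmed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE audio_transcriptions_fts "
    "USING fts5(transcription, timestamp)"
)
conn.execute(
    "INSERT INTO audio_transcriptions_fts VALUES (?, ?)",
    ("let's move the standup to Thursday", "2026-05-12T07:41:02"),
)

# FTS5's MATCH operator is what makes "find a phrase from three weeks ago"
# instant, even across a very large transcript table.
rows = conn.execute(
    "SELECT transcription, timestamp FROM audio_transcriptions_fts "
    "WHERE audio_transcriptions_fts MATCH ?",
    ("standup",),
).fetchall()
print(rows)
```

Unlike a LIKE scan, the MATCH query hits an inverted index, so search time stays flat as the archive grows.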
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
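The three stages above reduce to a queue feeding a slow worker that commits to SQLite. A minimal sketch, with a stub standing in for the local Whisper call:

```python
import queue
import sqlite3
from datetime import datetime, timezone

# transcribe() is a stub standing in for the local Whisper invocation.
def transcribe(chunk: bytes) -> str:
    return f"<{len(chunk)} bytes of speech>"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE transcripts (ts TEXT, text TEXT)")

work = queue.Queue()       # the WIP stage: chunks waiting for CPU time
work.put(b"\x00" * 16000)  # one fake second of 16 kHz 8-bit audio

while not work.empty():
    chunk = work.get()
    text = transcribe(chunk)  # the slow step; a backlog forms here
    db.execute(
        "INSERT INTO transcripts VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), text),
    )
db.commit()

done = [row[0] for row in db.execute("SELECT text FROM transcripts")]
print(done)
```

If chunks arrive faster than `transcribe()` drains them, the queue depth is exactly the backlog the folder structure makes visible.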
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder (e.g. ~/.screenpipe/data/): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (such as -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
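The journal-file heuristic above is easy to automate. A small sketch (the ~/.screenpipe path is assumed from the layout described; the demo runs against a throwaway directory instead):

```python
import tempfile
from pathlib import Path

def looks_busy(db_path: Path) -> bool:
    """True if SQLite journal files sit next to the database (mid-write)."""
    return any(
        db_path.with_name(db_path.name + suffix).exists()
        for suffix in ("-wal", "-shm", "-journal")
    )

# Demo against a throwaway directory; on a real install you would point this
# at the database under ~/.screenpipe (a path assumed from the layout above).
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "db.sqlite"
    db.touch()
    idle = looks_busy(db)                  # no journal yet -> False
    (Path(tmp) / "db.sqlite-wal").touch()
    busy = looks_busy(db)                  # -wal present -> True
print(idle, busy)
```

Note this is only a heuristic: in WAL mode the -wal file can persist briefly even when the writer is idle, so treat it as "probably busy", not proof.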
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
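A re-processing pass is just a walk over the archive feeding each .mp4 to whichever newer backend you have. The backend is injected here so the sketch runs with a stub; on a real install it could wrap a local Whisper call. The filename follows the pattern quoted above:

```python
import tempfile
from pathlib import Path

# Walk the raw archive and hand every .mp4 to an injected transcription
# backend. Nothing ScreenPipe-specific is assumed beyond "a folder of .mp4s".
def reprocess(archive: Path, transcribe) -> dict[str, str]:
    return {mp4.name: transcribe(mp4) for mp4 in sorted(archive.glob("*.mp4"))}

# Demo with a stub backend against a throwaway archive directory.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp)
    (archive / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()
    out = reprocess(archive, transcribe=lambda p: f"re-transcribed {p.name}")
print(out)
```

Because the backend is a parameter, swapping in a better model later means changing one argument, not the walk logic.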
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
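The "soft foreign key" point can be made concrete: the column stores a plain path string, and nothing in the schema enforces that the file still exists on disk. Column names below are illustrative (the source only names the audio_transcriptions table):

```python
import sqlite3

# A plain TEXT column holds the media path; SQLite never validates it against
# the filesystem, which is exactly what "soft" foreign key means here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_transcriptions (transcription TEXT, file_path TEXT)"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    ("testing one two", "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4"),
)

# The UI would resolve this path against ~/.screenpipe/data/ for playback.
path, = conn.execute(
    "SELECT file_path FROM audio_transcriptions WHERE transcription LIKE ?",
    ("%one two%",),
).fetchone()
print(path)
```

If the referenced .mp4 is deleted, this query still succeeds; only the playback step that opens the path would fail.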
What happens if you delete them?
If you manually...
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
193450813362110442
|
9212448452481571735
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
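The chunking step described above can be sketched in a few lines. This is an illustrative model only, not ScreenPipe's actual code; the sample rate and chunk length are assumptions chosen for the example.

```python
# Illustrative sketch (not ScreenPipe's implementation): split a continuous
# stream of audio samples into fixed-duration chunks for transcription.
SAMPLE_RATE = 16_000     # samples per second (a common rate for speech models)
CHUNK_SECONDS = 30       # hypothetical chunk length

def chunk_stream(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Yield successive fixed-size chunks from a flat sequence of samples."""
    size = sample_rate * chunk_seconds
    for start in range(0, len(samples), size):
        yield samples[start:start + size]

# Example: 65 seconds of audio becomes three chunks (30 s + 30 s + 5 s).
stream = [0.0] * (SAMPLE_RATE * 65)
chunks = list(chunk_stream(stream))
```

The last chunk is simply shorter; a real recorder would also attach device name and timestamp metadata to each chunk.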
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
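The FTS5 idea can be shown with a minimal in-memory sketch. The table and column names here are hypothetical, not ScreenPipe's actual schema; the point is only how MATCH turns the transcript store into a searchable index.

```python
import sqlite3

# Hypothetical schema for illustration; ScreenPipe's real tables differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcriptions USING fts5(text, timestamp UNINDEXED)"
)
conn.execute(
    "INSERT INTO transcriptions VALUES (?, ?)",
    ("let's move the deadline to Friday", "2026-05-12T10:00:00Z"),
)
conn.execute(
    "INSERT INTO transcriptions VALUES (?, ?)",
    ("unrelated chatter about lunch", "2026-05-12T11:00:00Z"),
)
# MATCH performs a full-text search over the indexed column.
rows = conn.execute(
    "SELECT text, timestamp FROM transcriptions WHERE transcriptions MATCH ?",
    ("deadline",),
).fetchall()
```

Only the row containing "deadline" comes back, along with its timestamp, which is exactly the "find a phrase from three weeks ago" use case.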
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
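The three stages above amount to a producer/consumer queue. The toy model below (not ScreenPipe's implementation) shows how a backlog forms when chunks arrive faster than the transcriber drains them, and then clears once the engine catches up.

```python
from collections import deque

# Toy model of the WIP stage; names and behavior are illustrative only.
queue = deque()
transcribed = []

def record(chunk):
    queue.append(chunk)           # buffering: the raw chunk waits its turn

def transcribe_one():
    if queue:                     # finalization: commit one chunk's text
        transcribed.append(f"text-of-{queue.popleft()}")

# Rapid conversation: three chunks arrive while only one gets processed.
for c in ("chunk1", "chunk2", "chunk3"):
    record(c)
transcribe_one()
backlog = len(queue)              # two chunks are still "work in progress"

# The engine catches up once the conversation pauses.
while queue:
    transcribe_one()
```

At the snapshot point `backlog` is 2; after the drain loop the queue is empty and every chunk's text is committed, in order.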
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The raw data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
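Reading the folder state this way can be automated. The sketch below classifies a directory's contents into finished artifacts versus in-progress markers; the `-wal`/`-shm` suffixes are standard SQLite write-ahead-log journal files, while the `classify` helper, the demo directory, and its filenames are illustrative (the media name mirrors the pattern ScreenPipe writes under ~/.screenpipe/data).

```python
from pathlib import Path
import tempfile

def classify(data_dir: Path):
    """Split a directory's files into 'done' artifacts and WIP markers."""
    done, in_progress = [], []
    for f in data_dir.iterdir():
        # SQLite creates <db>-wal / <db>-shm while writes are in flight.
        if f.name.endswith(("-wal", "-shm", "-journal")):
            in_progress.append(f.name)   # active journals => work in progress
        else:
            done.append(f.name)          # committed db / finished media
    return sorted(done), sorted(in_progress)

# Demo against a throwaway directory, not the real ~/.screenpipe.
tmp = Path(tempfile.mkdtemp())
(tmp / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()
(tmp / "db.sqlite").touch()
(tmp / "db.sqlite-wal").touch()
done, wip = classify(tmp)
```

A `-wal` file alongside the database simply means SQLite has uncheckpointed writes; it disappears (or shrinks) once the engine checkpoints, matching the "catches up" behavior described above.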
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as "LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4" or "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4" or "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4" or "System Audio (output)_2026-05-11_06-17-14.mp4". Are these used for anything after transcribing and storing in the SQLite db?
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
8234219729478526510
|
9133637656415718295
|
click
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
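The chunking step above can be illustrated with a toy sketch. The 30-second chunk length and the 16 kHz mono 16-bit PCM format here are assumptions chosen for illustration, not ScreenPipe's actual capture settings.

```python
# Toy illustration of splitting a continuous PCM byte stream into
# fixed-length chunks, as a capture layer does before transcription.
# Chunk length and audio format are illustrative assumptions.

SAMPLE_RATE = 16_000          # samples per second (assumed)
BYTES_PER_SAMPLE = 2          # 16-bit PCM (assumed)
CHUNK_SECONDS = 30            # assumed chunk length
CHUNK_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_SECONDS

def split_into_chunks(stream: bytes) -> list[bytes]:
    """Break a raw PCM byte stream into fixed-size chunks; the final
    partial chunk is kept so no audio is dropped."""
    return [stream[i:i + CHUNK_BYTES]
            for i in range(0, len(stream), CHUNK_BYTES)]

# 65 seconds of silence -> two full 30 s chunks plus a 5 s remainder
stream = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE * 65)
chunks = split_into_chunks(stream)
print(len(chunks))  # 3
```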
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
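A minimal demo of the FTS5 phrase search the storage step relies on. The table and column names here are invented for illustration; ScreenPipe's actual schema differs.

```python
import sqlite3

# Build an in-memory FTS5 index and run a phrase search against it.
# Table/column names are illustrative, not ScreenPipe's real schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text)")
db.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [("me", "let's ship the quarterly report on Friday"),
     ("colleague", "the report deadline moved to Monday")],
)
# MATCH searches the full-text index; a quoted query matches the exact
# phrase, so only the first row is returned here.
rows = db.execute(
    "SELECT speaker, text FROM transcripts WHERE transcripts MATCH ?",
    ('"quarterly report"',),
).fetchall()
print(rows)  # one matching row, spoken by "me"
```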
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
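The three steps above have the shape of a classic producer/consumer queue. This toy model stands in for the real pipeline (which is written in Rust): capture pushes chunks into a queue, a worker "transcribes" them (stubbed here with upper-casing), and results are committed to a list standing in for the database.

```python
import queue
import threading

work = queue.Queue()   # the processing queue where backlog lines up
database = []          # stand-in for the SQLite database

def worker():
    """Consume chunks until the None sentinel arrives."""
    while True:
        chunk = work.get()
        if chunk is None:
            break
        # Stand-in for Whisper transcription + DB insert
        database.append(chunk.upper())

t = threading.Thread(target=worker)
t.start()
for chunk in ("chunk-a", "chunk-b", "chunk-c"):
    work.put(chunk)    # buffered audio waiting its turn
work.put(None)         # signal end of stream
t.join()
print(database)  # ['CHUNK-A', 'CHUNK-B', 'CHUNK-C']
```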
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
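A sketch of inspecting a ScreenPipe-style data directory to separate the permanent archive (.mp4 media) from work-in-progress markers. The journal suffixes below are standard SQLite conventions; the exact temp-file names ScreenPipe uses are an assumption, and the demo runs against a throwaway directory rather than the real ~/.screenpipe.

```python
import os
import tempfile

# Suffixes that signal in-flight work: SQLite journal conventions plus a
# generic .tmp. These are assumptions, not ScreenPipe's documented names.
WIP_SUFFIXES = (".sqlite-wal", ".sqlite-shm", ".sqlite-journal", ".tmp")

def classify(directory: str) -> dict[str, list[str]]:
    """Split directory contents into permanent media vs WIP markers."""
    result = {"archive": [], "in_progress": []}
    for name in sorted(os.listdir(directory)):
        if name.endswith(".mp4"):
            result["archive"].append(name)
        elif name.endswith(WIP_SUFFIXES):
            result["in_progress"].append(name)
    return result

# Demo against a temporary directory with one archived recording and one
# active WAL journal.
with tempfile.TemporaryDirectory() as d:
    for f in ("System Audio (output)_2026-05-11_06-17-14.mp4",
              "db.sqlite-wal"):
        open(os.path.join(d, f), "w").close()
    result = classify(d)
print(result)
```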
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text. It uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
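The lookup pattern described above (match a phrase, then follow the stored path and timestamp to start playback) can be sketched against a throwaway in-memory database. The column names file_path and timestamp are assumptions for illustration; only the audio_transcriptions table name appears elsewhere in this conversation.

```python
import sqlite3

# Illustrative only: column names are assumptions, not ScreenPipe's
# confirmed schema. The point is the pattern: text search returns a
# media path plus a timestamp, which drives playback.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE audio_transcriptions '
    '(transcription TEXT, file_path TEXT, "timestamp" TEXT)'
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    ("let's talk about the RAID array",
     "/Users/you/.screenpipe/data/System Audio (output)_2026-05-11_06-17-14.mp4",
     "2026-05-11T06:19:02"),
)
# Find the media file and moment where a keyword was spoken.
row = conn.execute(
    'SELECT file_path, "timestamp" FROM audio_transcriptions '
    "WHERE transcription LIKE ?",
    ("%RAID%",),
).fetchone()
```

The real ScreenPipe schema will differ in details, but the search-then-resolve flow is the same idea.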
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
You said
yes I will do that. Is there a way to setup languages to transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO 639-1 language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
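A config edit along the lines of the CLI/Config route above might look like the following. This is hypothetical: the "language" key name is an assumption based on this description, so check your actual pipe.json for the real key before using it.

```python
import json

def set_transcription_language(config_path, iso_code):
    """Force a specific transcription language in a JSON config file.

    Hypothetical sketch: assumes a top-level "language" key holding an
    ISO 639-1 code (e.g. 'sk', 'bg', 'en'); verify against your real
    ~/.screenpipe/pipe.json before relying on this.
    """
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["language"] = iso_code  # setting "auto" would restore auto-detect
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)
```

After editing, restart the daemon so the new language setting takes effect.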
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
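That cron-job idea can be sketched in a few lines of Python. Everything here is illustrative: the paths and 30-day cutoff are placeholders, and the symlink trick assumes the NAS share stays mounted at the same path.

```python
# Move old recordings to an archive location and leave symlinks behind so the
# file paths stored in db.sqlite keep resolving. Illustrative sketch only.
import shutil
import time
from pathlib import Path

def archive_old_media(data_dir: Path, archive_dir: Path, max_age_days: int = 30) -> int:
    cutoff = time.time() - max_age_days * 86_400
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = 0
    for f in sorted(data_dir.glob("*.mp4")):
        if f.is_symlink() or f.stat().st_mtime >= cutoff:
            continue                      # already archived, or still fresh
        target = archive_dir / f.name
        shutil.move(str(f), str(target))  # works across filesystems (NAS mounts)
        f.symlink_to(target)              # keep the database reference valid
        moved += 1
    return moved
```

Run something like this from cron or launchd against ~/.screenpipe/data and your NAS mount point, after testing it on a few sacrificial files.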
You said
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes: sk (Slovak), bg (Bulgarian), en (English).
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
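As a hypothetical illustration of the CLI/config route: the "language" key and the pipe.json location are assumptions for illustration, not a confirmed part of ScreenPipe's config schema, so verify against your actual install before relying on this.

```python
# Pin transcription to one ISO 639-1 language code in a JSON config file.
# The "language" key name is an assumption made for this sketch.
import json
from pathlib import Path

ISO_CODES = {"Slovak": "sk", "Bulgarian": "bg", "English": "en"}

def force_language(config_path: Path, language: str) -> dict:
    """Rewrite a JSON config so transcription is pinned to one language."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config["language"] = ISO_CODES[language]
    config_path.write_text(json.dumps(config, indent=2))
    return config
```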
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? And how can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
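The chunking step can be pictured with a toy sketch; the 16 kHz sample rate and 30-second chunk length are illustrative assumptions, not ScreenPipe's actual parameters.

```python
# Illustrative only: slice a continuous stream of samples into fixed-length
# chunks, the way the recorder hands audio to the transcriber in pieces.
SAMPLE_RATE = 16_000      # samples per second; common for speech models
CHUNK_SECONDS = 30        # assumed chunk length, not ScreenPipe's real value

def chunk_stream(samples: list[float]) -> list[list[float]]:
    """Split a flat list of samples into consecutive fixed-duration chunks."""
    size = SAMPLE_RATE * CHUNK_SECONDS
    return [samples[i:i + size] for i in range(0, len(samples), size)]
```

A 65-second stream would come out as two full chunks plus a 5-second remainder, each handed to the transcriber independently.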
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
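The FTS5 mechanism can be demonstrated with Python's built-in sqlite3 module; the table and column names below are illustrative, not ScreenPipe's actual schema.

```python
# Minimal FTS5 demo: index some transcripts, then search them with MATCH.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, file_path)")
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("let's move the deadline to Friday",
         "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"),
        ("the RAID rebuild finished overnight",
         "System Audio (output)_2026-05-11_06-17-14.mp4"),
    ],
)
# MATCH walks the full-text index, so this stays fast over months of audio.
rows = conn.execute(
    "SELECT file_path FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",),
).fetchall()
```

Whatever ScreenPipe's real schema looks like, a MATCH query against an FTS5 virtual table is the mechanism being described here.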
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
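The buffer, queue, and finalization steps above can be sketched as a tiny producer/consumer pipeline. Everything here is hypothetical shorthand for the flow, not ScreenPipe's actual code:

```python
import queue
import threading

committed = []                 # stands in for rows in the SQLite database
audio_chunks = queue.Queue()   # the WIP stage: chunks waiting on Whisper

def transcribe(chunk):
    # placeholder for the local Whisper call, which is the slow step
    return f"transcript of {chunk}"

def worker():
    while True:
        chunk = audio_chunks.get()
        if chunk is None:      # sentinel: capture has stopped
            break
        # finalization: text committed; the raw audio would be kept for playback
        committed.append((chunk, transcribe(chunk)))

t = threading.Thread(target=worker)
t.start()
for c in ("chunk-001.mp4", "chunk-002.mp4"):  # capture layer producing chunks
    audio_chunks.put(c)
audio_chunks.put(None)
t.join()
print(committed)
```

If chunks arrive faster than `transcribe` can run, they simply accumulate in the queue, which is exactly the backlog behavior described above.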
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary audio chunks, or locked database journals (such as SQLite's -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
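One practical way to apply this: compare the file paths the database already references against what is actually on disk. A hedged sketch on plain path sets, since column names vary:

```python
from pathlib import PurePosixPath

def unprocessed(db_paths, disk_paths):
    """Files on disk that the database does not reference yet (still WIP or orphaned)."""
    done = {PurePosixPath(p).name for p in db_paths}
    return sorted(p for p in disk_paths if PurePosixPath(p).name not in done)

pending = unprocessed(
    db_paths={"/Users/lukas/.screenpipe/data/System Audio (output)_2026-05-11_06-17-14.mp4"},
    disk_paths=[
        "/Users/lukas/.screenpipe/data/System Audio (output)_2026-05-11_06-17-14.mp4",
        "/Users/lukas/.screenpipe/data/MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4",
    ],
)
print(pending)  # only the microphone file is still pending
```

Comparing by filename rather than full path keeps the check working even if the data directory is later moved or symlinked.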
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
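A minimal illustration of that "soft foreign key" idea, using a made-up table shape rather than ScreenPipe's real schema:

```python
import sqlite3

# Illustrative schema only; ScreenPipe's real audio_transcriptions table differs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_transcriptions (transcription TEXT, file_path TEXT, offset_seconds REAL)"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    ("are these files used for anything", "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4", 128.5),
)

# The path is just text: nothing in SQLite enforces that the .mp4 still exists on disk.
row = conn.execute(
    "SELECT file_path, offset_seconds FROM audio_transcriptions "
    "WHERE transcription LIKE '%anything%'"
).fetchone()
print(row)
```

Because the path column carries no real foreign-key constraint, deleting the media file leaves the row intact, which is exactly why playback fails silently rather than the database erroring.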
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has and hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
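The chunking step above can be sketched as follows. The sample rate and chunk length here are assumed placeholder values, not ScreenPipe's actual parameters:

```python
# Hypothetical sketch: splitting a continuous audio stream into
# fixed-length chunks for downstream transcription. SAMPLE_RATE and
# CHUNK_SECONDS are illustrative assumptions, not the app's real values.
SAMPLE_RATE = 16_000      # samples per second (typical for speech models)
CHUNK_SECONDS = 30        # assumed chunk length
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples):
    """Yield successive fixed-size chunks from a flat sample buffer."""
    for start in range(0, len(samples), CHUNK_SAMPLES):
        yield samples[start:start + CHUNK_SAMPLES]

# Example: 65 seconds of audio becomes three chunks (30s, 30s, 5s).
stream = [0] * (SAMPLE_RATE * 65)
chunks = list(chunk_stream(stream))
print([len(c) // SAMPLE_RATE for c in chunks])  # [30, 30, 5]
```

The last, shorter chunk is kept rather than discarded, so no captured audio is lost at the boundary.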
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
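To see how an FTS5 index makes that search instant, here is a minimal, self-contained example. The table and column names are illustrative, not ScreenPipe's actual schema:

```python
# Minimal sketch of SQLite FTS5 full-text search, the mechanism the
# answer describes. Table/column names are assumptions for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("let's move the deadline to Friday", "2026-05-12T09:00:00"),
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("nothing relevant here", "2026-05-12T10:00:00"),
)

# MATCH uses the FTS index rather than scanning every row, so this
# stays fast even across weeks of transcribed audio.
rows = conn.execute(
    "SELECT timestamp FROM transcripts WHERE transcripts MATCH 'deadline'"
).fetchall()
print(rows)  # [('2026-05-12T09:00:00',)]
```

The timestamp column is what lets the UI jump from a matched phrase back to the moment it was spoken.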
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
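The buffering and queue stages above can be sketched as a simple FIFO, with a stand-in for the transcription step (the real engine would call Whisper here):

```python
# Sketch of the WIP stage: capture can produce chunks faster than
# transcription consumes them, so chunks wait in a FIFO queue.
# The transcribe step below is a placeholder, not the real engine.
from collections import deque

queue = deque()

def capture(chunk):
    queue.append(chunk)              # buffering: chunk waits its turn

def transcribe_next():
    if not queue:
        return None                  # engine has caught up, nothing WIP
    chunk = queue.popleft()          # oldest chunk is processed first
    return f"transcript of {chunk}"  # stand-in for Whisper + DB commit

for name in ("chunk_a", "chunk_b", "chunk_c"):
    capture(name)

print(len(queue))         # 3 chunks backlogged
print(transcribe_next())  # transcript of chunk_a
print(len(queue))         # 2 remaining in the queue
```

The size of this queue is exactly the "backlog" you can observe as temporary files on disk.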
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: if you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
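A crude version of that done-vs-pending check can be written by comparing files on disk against paths referenced from the database. The table name `audio_chunks` and column `file_path` are assumptions about the schema, not verified against ScreenPipe's actual tables:

```python
# Sketch: files referenced from the database count as processed;
# files on disk with no database row are still waiting in the queue.
# Schema names here (audio_chunks, file_path) are hypothetical.
import os
import sqlite3
import tempfile

def pending_files(conn, data_dir):
    """Files on disk that no database row references yet (the WIP set)."""
    done = {row[0] for row in conn.execute("SELECT file_path FROM audio_chunks")}
    on_disk = {os.path.join(data_dir, f) for f in os.listdir(data_dir)}
    return sorted(on_disk - done)

# Demo with a throwaway database and folder.
data_dir = tempfile.mkdtemp()
for name in ("old.mp4", "fresh.mp4"):
    open(os.path.join(data_dir, name), "w").close()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_chunks (file_path TEXT)")
conn.execute("INSERT INTO audio_chunks VALUES (?)",
             (os.path.join(data_dir, "old.mp4"),))

print(pending_files(conn, data_dir))  # only fresh.mp4 is still pending
```

Anything the function returns is, by this definition, still in the WIP stage.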
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
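The file names you listed embed the recording's start time, so the seek offset for playback is just the utterance timestamp minus the file's start. A sketch of that lookup, assuming the naming convention holds for all files (the parsing logic is my assumption, not ScreenPipe's documented behavior):

```python
# Hypothetical sketch: compute the playback offset within an .mp4
# from its name (which embeds the recording start time) and the
# database timestamp of the matched phrase.
from datetime import datetime

def playback_offset(file_name, spoken_at):
    """Seconds into the recording at which the matched phrase occurs."""
    # e.g. "System Audio (output)_2026-05-11_06-17-14.mp4"
    parts = file_name.rsplit("_", 2)          # device, date, time.mp4
    stamp = "_".join(parts[1:]).removesuffix(".mp4")
    start = datetime.strptime(stamp, "%Y-%m-%d_%H-%M-%S")
    return (spoken_at - start).total_seconds()

offset = playback_offset(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    datetime(2026, 5, 11, 6, 19, 44),  # when the keyword was spoken
)
print(offset)  # 150.0 → seek 2m30s into the file
```

The player then only needs to open the referenced file and seek to that offset.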
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth for verifying what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
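You can audit for exactly that failure mode with a small script: walk the stored paths and flag rows whose media file no longer exists. The table and column names follow the `audio_transcriptions` example above but are still assumptions about the real schema:

```python
# Sketch: find database rows whose referenced media file has been
# deleted from disk (dangling soft foreign keys). Schema names are
# assumed, not verified against ScreenPipe's actual database.
import os
import sqlite3

def orphaned_references(conn):
    """Stored file paths that no longer resolve on disk."""
    return [path for (path,) in
            conn.execute("SELECT file_path FROM audio_transcriptions")
            if not os.path.exists(path)]

# Demo with a throwaway database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")
conn.execute("INSERT INTO audio_transcriptions VALUES ('/nonexistent/a.mp4')")
print(orphaned_references(conn))  # ['/nonexistent/a.mp4']
```

Rows listed here will still show up in text search but can no longer be played back.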
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
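A minimal sketch of that archiving job, assuming the move-then-symlink approach described above (directory paths and the 30-day cutoff are placeholders; note symlinks require no special privileges on macOS/Linux):

```python
# Sketch: move media older than a cutoff to archive storage and leave
# a symlink behind, so database file-path references keep resolving.
# Paths and the age threshold are placeholder assumptions.
import os
import shutil
import time

def archive_old_media(data_dir, nas_dir, max_age_days=30):
    cutoff = time.time() - max_age_days * 86400
    for name in os.listdir(data_dir):
        src = os.path.join(data_dir, name)
        if not os.path.isfile(src) or os.path.islink(src):
            continue                      # skip dirs and already-archived links
        if os.path.getmtime(src) > cutoff:
            continue                      # still recent, keep it local
        dst = os.path.join(nas_dir, name)
        shutil.move(src, dst)             # relocate to archive storage
        os.symlink(dst, src)              # path stays valid for the database
```

Run from cron (or launchd on macOS), this keeps the local folder bounded while playback continues to work through the symlinks, as long as the NAS is mounted.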
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.

ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:

System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.

Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
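The capture-and-chunk step can be sketched in a few lines. The 30-second chunk length and 16 kHz sample rate below are illustrative assumptions, not ScreenPipe's documented values.

```python
# Sketch of slicing a continuous audio stream into fixed-length chunks.
# The chunk length and sample rate are illustrative assumptions only.
SAMPLE_RATE = 16_000       # samples per second (assumed)
CHUNK_SECONDS = 30         # chunk length (assumed)
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples: list[float]) -> list[list[float]]:
    """Split a flat list of samples into fixed-size chunks;
    the final partial chunk is kept so no audio is dropped."""
    return [samples[i:i + CHUNK_SAMPLES]
            for i in range(0, len(samples), CHUNK_SAMPLES)]

# 65 seconds of silence -> three chunks: 30 s, 30 s, 5 s
chunks = chunk_stream([0.0] * (SAMPLE_RATE * 65))
print([len(c) // SAMPLE_RATE for c in chunks])  # -> [30, 30, 5]
```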
2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
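A minimal sketch of the kind of FTS5 search this enables, using Python's built-in sqlite3. The table and column names (`audio_transcriptions`, `transcription`) are assumptions for illustration; ScreenPipe's real schema may differ.

```python
import sqlite3

# Illustrative FTS5 demo: table/column names are assumed, not
# ScreenPipe's actual schema.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE VIRTUAL TABLE audio_transcriptions "
    "USING fts5(transcription, timestamp)"
)
con.executemany(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "2026-04-21T10:03:00"),
        ("the RAID array rebuild finished overnight", "2026-04-22T09:15:00"),
    ],
)
# MATCH uses the FTS5 index, so this stays fast over months of audio.
rows = con.execute(
    "SELECT timestamp FROM audio_transcriptions "
    "WHERE audio_transcriptions MATCH 'deadline'"
).fetchall()
print(rows)  # -> [('2026-04-21T10:03:00',)]
```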
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe.
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.

4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

The SQLite Database (db.sqlite): This is the master ledger.
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media folders (under ~/.screenpipe/data/): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
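The "what's done versus what's pending" check described above can be sketched by cross-referencing the media files on disk against the paths recorded in the database. The table and column names (`audio_transcriptions`, `file_path`) are assumptions; adapt them to the actual schema.

```python
import sqlite3

# Sketch: a file is "done" if the database references it, "pending"
# otherwise. Table/column names are assumptions for illustration.
def split_done_pending(con: sqlite3.Connection, media_files: list[str]):
    done = {row[0] for row in con.execute(
        "SELECT DISTINCT file_path FROM audio_transcriptions")}
    return ([f for f in media_files if f in done],
            [f for f in media_files if f not in done])

# Tiny in-memory stand-in for the real database:
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE audio_transcriptions (file_path TEXT, transcription TEXT)")
con.execute("INSERT INTO audio_transcriptions VALUES ('a.mp4', 'hello world')")
done, pending = split_done_pending(con, ["a.mp4", "b.mp4"])
print(done, pending)  # -> ['a.mp4'] ['b.mp4']
```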
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?

Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete.
They are not temporary processing chunks; they act as your permanent media archive.

Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.

When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly.
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise.
If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.

How the Database Views Them

If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend.
The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.

What happens if you delete them?

If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text.
However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
|
5033023110818293923
|
8492090256744131541
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
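The FTS5 search described above can be sketched with Python's built-in sqlite3 module. Note this is an illustrative toy schema, not ScreenPipe's real one: the table and column names here are made up for the example.

```python
import sqlite3

# Toy FTS5 table: "transcripts" and its columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("let's move the deadline to next Friday", "2026-05-12T09:15:00Z"),
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("reviewing the quarterly budget numbers", "2026-05-12T10:02:00Z"),
)

# MATCH goes through the full-text index, so lookups stay fast even
# with months of accumulated transcripts.
rows = conn.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",),
).fetchall()
print(rows)  # one hit: the 09:15 chunk
```

This is exactly why a phrase from a meeting weeks ago comes back instantly: the query hits an inverted index rather than scanning every row.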
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
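The buffer, queue, and commit steps can be sketched as a toy pipeline. The transcription call is a placeholder standing in for Whisper, and the schema and queue internals are invented for the example; ScreenPipe's real implementation differs.

```python
import queue
import sqlite3
import threading

# Toy pipeline: capture side enqueues chunks, a worker "transcribes"
# them and commits the result. The schema is illustrative only.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE transcriptions (ts TEXT, path TEXT, text TEXT)")
work = queue.Queue()  # raw chunks waiting here = the WIP stage

def worker():
    while True:
        chunk = work.get()
        if chunk is None:  # sentinel: shut down
            break
        ts, path, audio = chunk
        text = f"transcript of {len(audio)} samples"  # stand-in for Whisper
        # Finalization: text committed alongside timestamp + file path.
        db.execute("INSERT INTO transcriptions VALUES (?, ?, ?)", (ts, path, text))
        db.commit()

t = threading.Thread(target=worker)
t.start()
# Capture side: during a busy conversation, chunks can queue up
# faster than the worker drains them.
work.put(("2026-05-12T06:49:17Z", "mic_chunk_001.mp4", [0.0] * 16000))
work.put(("2026-05-12T06:49:47Z", "mic_chunk_002.mp4", [0.0] * 16000))
work.put(None)
t.join()
count = db.execute("SELECT COUNT(*) FROM transcriptions").fetchone()[0]
print(count)  # 2
```

The depth of that queue at any moment is the "backlog" the next section says you can sometimes see reflected in the folder structure.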
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media folders (e.g., ~/.screenpipe/data/): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
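One concrete signal of an in-flight write worth knowing: SQLite in write-ahead-log (WAL) mode keeps -wal and -shm sidecar files next to the database while a connection is active. A minimal sketch (whether ScreenPipe actually runs in WAL mode is an assumption here; this just shows what those sidecars look like):

```python
import os
import sqlite3
import tempfile

# Demonstrate the WAL sidecar files that appear next to a SQLite
# database while a writer is active.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "db.sqlite")
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE t (x)")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.commit()
    # While the connection is open, the -wal and -shm files coexist
    # with the main database file.
    sidecars = sorted(os.listdir(d))
    conn.close()
print(sidecars)  # ['db.sqlite', 'db.sqlite-shm', 'db.sqlite-wal']
```

So db.sqlite-wal sitting next to your database is normal operation, not corruption; SQLite folds it back into the main file when it checkpoints.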
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model (or if a radically better speaker diarization algorithm is released), having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
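Auditing for such dangling references can be sketched as follows. The audio_transcriptions table name comes from the description above, but the file_path column name is an assumption; check the actual schema before running this against your own db.sqlite.

```python
import os
import sqlite3

# Toy stand-in for db.sqlite; in practice you would connect to the
# real file and inspect its schema first (.schema in the sqlite3 CLI).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_transcriptions (file_path TEXT, text TEXT)")
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    ("/tmp/definitely-missing-chunk.mp4", "some transcript"),
)

# List every referenced media file that no longer exists on disk.
missing = [
    p for (p,) in conn.execute("SELECT file_path FROM audio_transcriptions")
    if not os.path.exists(p)
]
print(missing)  # transcript survives, but playback for these would fail
```

Anything this turns up is searchable text whose audio is gone for good.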
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
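That cron-job idea can be sketched in Python. The paths here are temporary stand-ins for ~/.screenpipe/data and a mounted NAS share, and the trick assumes the database stores the original local paths (so the symlinks keep resolving); treat it as a starting point, not a turnkey script.

```python
import os
import shutil
import tempfile
import time

def offload(local_dir, nas_dir, max_age_days=30):
    """Move files older than the cutoff to nas_dir, leaving symlinks
    behind so existing database path references still resolve."""
    cutoff = time.time() - max_age_days * 86400
    for name in os.listdir(local_dir):
        src = os.path.join(local_dir, name)
        if os.path.islink(src) or not os.path.isfile(src):
            continue  # already offloaded, or a subdirectory
        if os.path.getmtime(src) < cutoff:
            dst = os.path.join(nas_dir, name)
            shutil.move(src, dst)
            os.symlink(dst, src)  # playback keeps working via the link

# Demo with throwaway directories standing in for SSD and NAS.
with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as nas:
    old = os.path.join(local, "System Audio (output)_2026-05-11_06-17-14.mp4")
    open(old, "wb").close()
    os.utime(old, (0, 0))  # backdate the file well past the cutoff
    offload(local, nas, max_age_days=30)
    result = (os.path.islink(old), os.path.exists(old))
print(result)  # (True, True): it's a symlink, and it still resolves
```

Scheduled nightly (e.g., from cron), this keeps the SSD footprint bounded while the NAS holds the permanent archive.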
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
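The chunking idea can be sketched as a simple fixed-length splitter. Everything here is illustrative: the 30-second default and the absence of voice-activity detection are assumptions, not ScreenPipe's actual recorder logic.

```python
from typing import Iterator, Sequence

def chunk_stream(samples: Sequence[int], sample_rate: int, seconds: int = 30) -> Iterator[Sequence[int]]:
    """Split a continuous PCM sample stream into fixed-length chunks.

    A sketch only: the chunk length and lack of voice-activity detection
    are assumptions, not ScreenPipe's documented behavior.
    """
    size = sample_rate * seconds
    for start in range(0, len(samples), size):
        # Each slice is a self-contained unit that can be transcribed independently.
        yield samples[start:start + size]
```

Each yielded chunk can then be queued for transcription on its own, which is what makes a 24/7 stream tractable.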
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
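The FTS5 mechanics can be demonstrated with an in-memory database. The table and column names below are made up for illustration; they are not ScreenPipe's actual schema.

```python
import sqlite3

# Illustrative FTS5 index: table/column names are invented, not ScreenPipe's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, ts UNINDEXED)")
conn.executemany(
    "INSERT INTO transcripts (text, ts) VALUES (?, ?)",
    [
        ("let us move the deadline to Friday", "2026-05-12T10:04:00"),
        ("the quarterly budget looks fine", "2026-05-12T10:05:30"),
    ],
)

# MATCH uses the full-text index, so the search stays fast over months of audio.
hits = conn.execute(
    "SELECT text, ts FROM transcripts WHERE transcripts MATCH ?", ("deadline",)
).fetchall()
# hits -> [('let us move the deadline to Friday', '2026-05-12T10:04:00')]
```

This index-backed lookup is why searching a phrase from weeks ago is effectively instant, where a plain LIKE scan would crawl through every row.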
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
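The three steps above can be modeled as a toy producer-consumer pipeline. The fake transcribe() stands in for a real Whisper call; the whole sketch is illustrative rather than ScreenPipe's implementation.

```python
import queue

def transcribe(chunk: bytes) -> str:
    # Stand-in for a real Whisper call; just describes the chunk.
    return f"<{len(chunk)} bytes of speech>"

def drain(work: "queue.Queue[tuple[float, bytes]]") -> list[tuple[float, str]]:
    """Pull buffered chunks off the queue and 'commit' (timestamp, text) pairs,
    mirroring buffering -> processing queue -> finalization."""
    committed = []
    while not work.empty():
        ts, chunk = work.get()
        committed.append((ts, transcribe(chunk)))
    return committed
```

During a fast conversation the queue fills quicker than drain() empties it, which is exactly the backlog the WIP stage describes.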
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
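Following that logic, one way to spot the backlog is to diff the media files on disk against the paths the database already references. This is a sketch; the audio_chunks table and file_path column are assumptions about the schema, so inspect your own db.sqlite before relying on it.

```python
import sqlite3
from pathlib import Path

def unprocessed_media(data_dir: str, db_path: str) -> list[str]:
    """Media files present on disk but not yet referenced by the database.

    Assumes an `audio_chunks` table with a `file_path` column; ScreenPipe's
    real table/column names may differ, so verify against your own schema.
    """
    conn = sqlite3.connect(db_path)
    known = {row[0] for row in conn.execute("SELECT file_path FROM audio_chunks")}
    conn.close()
    # Anything on disk that the DB does not know about is presumably still queued.
    on_disk = {str(p) for p in Path(data_dir).rglob("*.mp4")}
    return sorted(on_disk - known)
```

An empty result suggests the pipeline has caught up; a long list of recent files suggests it is still chewing through the queue.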
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
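The seek itself is simple arithmetic: the filename encodes when the recording started, the database stores when the phrase was spoken, and the player jumps to the difference. A sketch with made-up timestamps:

```python
from datetime import datetime

def seek_offset_seconds(file_started_at: datetime, spoken_at: datetime) -> float:
    """Seconds into the media file at which a transcribed phrase begins."""
    return (spoken_at - file_started_at).total_seconds()

# e.g. a recording that started at 06:49:17 and a keyword spoken at 06:50:00
offset = seek_offset_seconds(
    datetime(2026, 5, 12, 6, 49, 17), datetime(2026, 5, 12, 6, 50, 0)
)
# offset -> 43.0
```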
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
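A query against that mapping might look like the following sketch. The table name audio_transcriptions comes from the conversation above; the column names (transcription, timestamp, file_path) are assumptions, so check your actual schema with .schema in the sqlite3 shell first:

```python
import sqlite3

def files_for_phrase(db_path: str, phrase: str) -> list[tuple[str, str]]:
    """Return (timestamp, file_path) rows whose transcript mentions phrase.

    Column names are assumed, not confirmed against the real ScreenPipe schema.
    """
    con = sqlite3.connect(db_path)
    try:
        return con.execute(
            "SELECT timestamp, file_path FROM audio_transcriptions "
            "WHERE transcription LIKE ?",
            (f"%{phrase}%",),
        ).fetchall()
    finally:
        con.close()
```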
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
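Before deleting, you can audit which transcripts would lose their audio. A sketch under the same schema assumptions as above (audio_transcriptions with a file_path column):

```python
import os
import sqlite3

def orphaned_transcripts(db_path: str) -> list[str]:
    """List referenced media paths that no longer exist on disk.

    Assumes an audio_transcriptions table with a file_path column;
    adjust the names to your real schema.
    """
    con = sqlite3.connect(db_path)
    try:
        paths = [r[0] for r in con.execute(
            "SELECT DISTINCT file_path FROM audio_transcriptions")]
    finally:
        con.close()
    return [p for p in paths if p and not os.path.exists(p)]
```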
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in garbage collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older...
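Such an archiving job might look like the following sketch: move media older than a cutoff out of the data directory to slower bulk storage. The paths and the 30-day cutoff are illustrative, not ScreenPipe defaults, and note that playback of moved files will break unless the database paths are updated too:

```python
import os
import shutil
import time

def archive_old_media(src_dir: str, dest_dir: str, max_age_days: int = 30) -> int:
    """Move .mp4 files older than max_age_days from src_dir to dest_dir."""
    cutoff = time.time() - max_age_days * 86400
    os.makedirs(dest_dir, exist_ok=True)
    moved = 0
    for name in os.listdir(src_dir):
        path = os.path.join(src_dir, name)
        if name.endswith(".mp4") and os.path.getmtime(path) < cutoff:
            shutil.move(path, os.path.join(dest_dir, name))
            moved += 1
    return moved
```

Run it from cron (or launchd on macOS) with src_dir pointing at ~/.screenpipe/data/ and dest_dir on your NAS or external drive.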
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
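The chunking step can be sketched as a generator that slices a continuous sample stream into fixed-length pieces. The 16 kHz sample rate and 30-second chunk length here are illustrative assumptions; this conversation does not state ScreenPipe's actual values:

```python
from typing import Iterable, Iterator

SAMPLE_RATE = 16_000  # samples/sec; a common speech-to-text rate (assumption)

def chunk_audio(samples: Iterable[float], chunk_seconds: int = 30) -> Iterator[list[float]]:
    """Yield fixed-length chunks of a continuous audio stream."""
    size = SAMPLE_RATE * chunk_seconds
    buf: list[float] = []
    for s in samples:
        buf.append(s)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:  # flush the final partial chunk
        yield buf
```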
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure cloud providers like Deepgram for faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
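FTS5 phrase search can be demonstrated against a toy table. The virtual-table name transcripts_fts is illustrative; ScreenPipe's own FTS table name is not given in this conversation:

```python
import sqlite3

# Build a tiny FTS5 index and search it, illustrating why a phrase
# from three weeks ago is an instant lookup rather than a full scan.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts_fts USING fts5(text)")
con.executemany(
    "INSERT INTO transcripts_fts(text) VALUES (?)",
    [("let's ship the quarterly report on Friday",),
     ("remember to renew the TLS certificate",)],
)
hits = con.execute(
    "SELECT text FROM transcripts_fts WHERE transcripts_fts MATCH ?",
    ('"quarterly report"',),  # exact-phrase match syntax
).fetchall()
print(hits)
```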
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM, or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
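The buffer → queue → finalize flow described above is a classic producer/consumer pattern, sketched here with the standard library. The fake_transcribe function is a stand-in, not ScreenPipe's actual API:

```python
import queue
import threading

chunk_queue: "queue.Queue[bytes | None]" = queue.Queue()
results: list[str] = []

def fake_transcribe(chunk: bytes) -> str:
    # Stand-in for the Whisper call; returns placeholder text.
    return f"transcript of {len(chunk)} bytes"

def worker() -> None:
    # Consumer: drains the queue, "commits" text per chunk.
    while True:
        chunk = chunk_queue.get()
        if chunk is None:  # sentinel: no more audio
            break
        results.append(fake_transcribe(chunk))

t = threading.Thread(target=worker)
t.start()
for chunk in (b"aa", b"bbbb"):  # producer: buffered audio chunks
    chunk_queue.put(chunk)
chunk_queue.put(None)
t.join()
print(results)  # ['transcript of 2 bytes', 'transcript of 4 bytes']
```

If transcription is slower than recording, the queue simply grows: that backlog is the WIP stage.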
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): this is the master ledger. If text exists inside this database, the audio has been fully transcribed, diarized, and is "done."
The data folders: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: if you see rapidly changing files, temporary chunks, or locked database journals (like -wal or -journal files), the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
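You can spot that in-flight state from a script by checking for SQLite's sidecar files next to the database. The -wal/-shm/-journal suffixes are standard SQLite behavior, not ScreenPipe-specific:

```python
import os

def db_activity_markers(db_path: str) -> list[str]:
    """Return SQLite sidecar files present next to db_path.

    A -wal or -journal file usually means a writer is (or recently was)
    active, i.e. the database is still catching up on a backlog.
    """
    return [db_path + suffix
            for suffix in ("-wal", "-shm", "-journal")
            if os.path.exists(db_path + suffix)]
```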
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-2208571265329611096
|
8635993256255716245
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
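That chunking step can be sketched as follows. The 30-second chunk length and the `chunk_bounds` helper are illustrative assumptions, not ScreenPipe's actual configuration:

```python
# Illustrative only: cutting a continuous capture into fixed-size
# chunks for transcription. The 30 s length is an assumption.
CHUNK_SECONDS = 30

def chunk_bounds(total_seconds: float) -> list[tuple[float, float]]:
    """Return (start, end) pairs covering the recording."""
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + CHUNK_SECONDS, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds

print(chunk_bounds(75.0))  # [(0.0, 30.0), (30.0, 60.0), (60.0, 75.0)]
```

Each pair of bounds then maps to one unit of work for the transcription queue described below.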
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
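A minimal sketch of what FTS5-backed search looks like in practice. The `transcripts` table and its columns are illustrative, not ScreenPipe's real schema:

```python
import sqlite3

# In-memory database with an FTS5 virtual table; the text column is
# indexed for full-text search, the timestamp is stored but unindexed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("let's move the standup to Thursday", "2026-05-12T09:00:00Z"),
)

# MATCH uses the FTS index, so this stays fast even over months of audio.
rows = conn.execute(
    "SELECT text, timestamp FROM transcripts WHERE transcripts MATCH 'standup'"
).fetchall()
print(rows[0][0])
```

The MATCH operator is what makes "find a phrase from three weeks ago" an index lookup rather than a scan over every stored transcript.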
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal or db.sqlite-journal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
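If you want to check for that WIP state programmatically, here is a hypothetical helper. The -wal/-shm/-journal patterns come from standard SQLite naming conventions, and `wip_indicators` is an invented name, not part of ScreenPipe:

```python
from pathlib import Path

def wip_indicators(root: str) -> list[str]:
    """Return paths of SQLite journal files under root.

    A non-empty result suggests the database is mid-write, i.e. the
    transcription backlog (the WIP stage) is still being processed.
    """
    patterns = ("*.sqlite-wal", "*.sqlite-shm", "*.sqlite-journal")
    hits: list[str] = []
    for pattern in patterns:
        hits.extend(str(p) for p in Path(root).expanduser().rglob(pattern))
    return sorted(hits)

# Usage sketch: wip_indicators("~/.screenpipe")
```

An empty list is only a hint that the engine has caught up; journal files can also appear briefly during any normal write.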
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
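As a sketch, the playback start point can be derived from the filename's timestamp (using the same naming format as the files you listed) plus the stored moment a keyword was spoken. The parsing rule here is an assumption about how the filenames are structured:

```python
from datetime import datetime

def playback_offset(filename: str, spoken_at: datetime) -> float:
    """Seconds into the .mp4 where playback should start.

    Assumes filenames like "System Audio (output)_2026-05-11_06-17-14.mp4",
    where the suffix after ")_" is the recording start time.
    """
    stamp = filename.rsplit(")_", 1)[1].removesuffix(".mp4")
    start = datetime.strptime(stamp, "%Y-%m-%d_%H-%M-%S")
    return (spoken_at - start).total_seconds()

offset = playback_offset(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    datetime(2026, 5, 11, 6, 19, 44),
)
print(offset)  # 150.0
```

In practice the database row already stores both the file path and the timestamp, so the UI only needs this subtraction to seek to the right second.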
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the...
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
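The chunking step described above can be sketched in a few lines of Python. This is an illustration only: the function name, sample rate, and chunk length are invented for the example, not ScreenPipe's actual settings or implementation.

```python
def chunk_samples(samples, sample_rate=16_000, chunk_seconds=30):
    """Split a continuous stream of audio samples into fixed-duration chunks.

    The 16 kHz rate and 30 s window are illustrative values, not
    ScreenPipe's real configuration.
    """
    size = sample_rate * chunk_seconds
    for start in range(0, len(samples), size):
        yield samples[start:start + size]

# a fake 65-second mono "recording" at 16 kHz
stream = [0.0] * (16_000 * 65)
chunks = list(chunk_samples(stream))
# 65 s at 30 s per chunk gives three chunks: 30 s, 30 s, and a 5 s remainder
```

The point is simply that a continuous stream becomes a sequence of bounded units that a transcription engine can pick up one at a time.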
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
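The FTS5 mechanism can be demonstrated with Python's built-in sqlite3 module. The table name and rows below are invented for the example; ScreenPipe's real schema differs, but the MATCH query works the same way against any FTS5 table.

```python
import sqlite3

# In-memory stand-in for ScreenPipe's database.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
con.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("let's move the budget review to Thursday", "2026-05-12T06:49:17"),
)
con.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("the NAS syncs at about 25 MB/s", "2026-05-12T12:17:23"),
)

# MATCH hits the full-text index, so the lookup stays fast
# even with months of accumulated transcripts.
rows = con.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?",
    ("budget",),
).fetchall()
```

This is why a phrase from weeks ago comes back instantly: the query walks an inverted index instead of scanning every row.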
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
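The buffer, queue, and finalize steps above can be sketched as a standard producer/consumer pipeline. Everything here (the fake_transcribe stub, the chunk names) is a stand-in for illustration, not ScreenPipe's code:

```python
import queue
import threading

audio_queue = queue.Queue()   # the WIP stage: chunks waiting for Whisper
transcripts = []              # stands in for rows committed to the SQLite DB

def fake_transcribe(chunk):
    # placeholder for the local Whisper call
    return f"transcript of {chunk}"

def worker():
    # drains the queue one chunk at a time, like the transcription engine
    while True:
        chunk = audio_queue.get()
        if chunk is None:     # sentinel: capture has stopped
            break
        transcripts.append(fake_transcribe(chunk))

t = threading.Thread(target=worker)
t.start()

# the capture layer can enqueue chunks faster than they are transcribed;
# anything still sitting in the queue is the backlog
for chunk in ["chunk-001.mp4", "chunk-002.mp4", "chunk-003.mp4"]:
    audio_queue.put(chunk)

audio_queue.put(None)
t.join()
```

During a fast multi-person conversation the queue simply grows, and the engine catches up later; nothing is lost, it is just not yet searchable.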
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
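One way to check what is "done" in that sense is to compare the media files on disk against the rows in the database. The table and column names below (audio_chunks, file_path) are assumptions made for illustration; inspect your own db.sqlite to find the real ones before relying on this.

```python
import os
import sqlite3

def find_unprocessed(data_dir, db_path):
    """List .mp4 files on disk that have no matching row in the database.

    Assumes a table `audio_chunks` with a `file_path` column; ScreenPipe's
    actual schema may differ -- check db.sqlite first.
    """
    con = sqlite3.connect(db_path)
    done = {row[0] for row in con.execute("SELECT file_path FROM audio_chunks")}
    on_disk = set()
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith(".mp4"):
                on_disk.add(os.path.join(root, name))
    return sorted(on_disk - done)
```

Files that appear on disk but not in the database are either still queued or were never transcribed, which mirrors the "master ledger" idea above.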
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
…
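The timestamp-plus-file-path scheme makes playback position simple arithmetic: subtract the recording's start time (encoded in the filename, as in the examples above) from the transcript's timestamp. A hypothetical sketch, with a made-up helper name:

```python
from datetime import datetime

def playback_offset(transcript_ts, file_start_ts):
    """Seconds into the recording at which a transcript line was spoken.

    Both arguments use the date format seen in the filenames,
    e.g. "2026-05-12_12-17-23".
    """
    fmt = "%Y-%m-%d_%H-%M-%S"
    delta = (datetime.strptime(transcript_ts, fmt)
             - datetime.strptime(file_start_ts, fmt))
    return delta.total_seconds()

# file: "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"
offset = playback_offset("2026-05-12_12-19-53", "2026-05-12_12-17-23")
# the phrase was spoken 150 seconds into that recording
```

A player can then seek to that offset in the referenced .mp4, which is why deleting the media files breaks playback even though search still works.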
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38987
|
1442
|
25
|
2026-05-14T06:32:22.891667+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740342891_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
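As a concrete sketch of what that FTS5 indexing buys you, here is a minimal example; the table and column names are made up for illustration and are not Screenpipe's actual schema:

```python
import sqlite3

# In-memory stand-in for db.sqlite; the real Screenpipe schema differs.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)")
con.executemany(
    "INSERT INTO transcripts (text, timestamp) VALUES (?, ?)",
    [
        ("we agreed to ship the RAID migration next sprint", "2026-05-12T07:41:00"),
        ("unrelated ambient audio from a YouTube video", "2026-05-12T08:00:00"),
    ],
)

# MATCH hits the full-text index, so this stays fast even over months of audio.
rows = con.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?",
    ("migration",),
).fetchall()
```

Only the row containing the keyword comes back, together with its timestamp, which is exactly the lookup the search UI needs.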
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunk files, or locked database journals (such as SQLite's -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
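Putting the folder-structure clues together, a small heuristic script can spot the WIP stage from the outside. The db.sqlite location, the data subfolder, and the two-minute freshness threshold are assumptions based on the defaults described above:

```python
import time
from pathlib import Path

def wip_status(screenpipe_dir: str, fresh_seconds: int = 120) -> dict:
    """Heuristic check: SQLite journal files next to db.sqlite, or media files
    modified in the last couple of minutes, suggest an active processing backlog."""
    root = Path(screenpipe_dir).expanduser()
    # db.sqlite-wal / db.sqlite-shm / db.sqlite-journal exist while writes are in flight.
    journals = sorted(root.glob("db.sqlite-*"))
    cutoff = time.time() - fresh_seconds
    # Media files touched very recently are likely still being written or transcribed.
    fresh = sorted(p for p in (root / "data").rglob("*.mp4")
                   if p.stat().st_mtime > cutoff)
    return {"journals": journals, "recently_written": fresh}
```

An empty result for both keys suggests the pipeline has caught up and everything on disk is "done."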
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
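As an illustration of that lookup: the recording's start time is embedded in the filename, so the start time plus the transcript's timestamp is enough to compute a playback offset. The filename pattern is taken from the examples in the question above; the helper itself is hypothetical, not part of ScreenPipe:

```python
from datetime import datetime
from pathlib import Path

def playback_offset(media_file: str, spoken_at: datetime) -> float:
    """Seconds into the file where a transcript hit should start playing.
    Assumes the '<device>_YYYY-MM-DD_HH-MM-SS.mp4' naming seen in ~/.screenpipe/data/."""
    stem = Path(media_file).stem
    # The last two underscore-separated fields are the recording's start date and time.
    date_part, time_part = stem.rsplit("_", 2)[-2:]
    started = datetime.strptime(f"{date_part}_{time_part}", "%Y-%m-%d_%H-%M-%S")
    return (spoken_at - started).total_seconds()
```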
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
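If you do prune files manually, a quick audit along these lines shows which transcripts have lost their backing media. The audio_transcriptions table name comes from the answer above, but the file_path column name is an assumption; check your actual schema first:

```python
import sqlite3
from pathlib import Path

def find_orphaned_transcripts(db_path: str) -> list:
    """Return media paths referenced by the database that no longer exist on disk.
    The audio_transcriptions table is mentioned for Screenpipe, but the
    file_path column name here is an assumption about the schema."""
    con = sqlite3.connect(db_path)
    try:
        paths = [row[0] for row in
                 con.execute("SELECT DISTINCT file_path FROM audio_transcriptions")]
    finally:
        con.close()
    # Soft foreign keys: nothing stops the file from disappearing underneath them.
    return [p for p in paths if not Path(p).exists()]
```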
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
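A minimal sketch of that cron job's core logic, assuming an archive mount point and a 30-day cutoff (both illustrative); test it on a copy of the data folder first:

```python
import shutil
import time
from pathlib import Path

def archive_old_media(data_dir: str, archive_dir: str, max_age_days: int = 30) -> list:
    """Move media older than max_age_days onto the archive volume and leave a
    symlink behind, so the paths stored in db.sqlite keep resolving."""
    src_root = Path(data_dir).expanduser()
    dst_root = Path(archive_dir)
    cutoff = time.time() - max_age_days * 86400
    moved = []
    for f in list(src_root.rglob("*.mp4")):       # snapshot: we mutate the tree below
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue                              # already archived, or still recent
        dst = dst_root / f.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(f), dst)
        f.symlink_to(dst)                         # database references still resolve
        moved.append(f)
    return moved
```

Pointing archive_dir at the NAS mount and running this nightly keeps db.sqlite's soft foreign keys valid while the heavy media lives on the RAID array.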
You said
Yes, I will do that. Is there a way to set up languages for transcription?
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":23,"bounds":{"left":0.11801862,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.12566489,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.14029256,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.16023937,"top":0.100159615,"width":0.15026596,"height":0.03830806},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.1009577,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"bounds":{"left":0.16023937,"top":0.10175578,"width":0.12849069,"height":0.035514764},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.17039107,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.17278531,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
8124591190541635378
|
8487585703635086293
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?

Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization", the technical term for speaker identification. It analyzes the audio to distinguish your voice from the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
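To make the FTS5 search step concrete, here is a minimal, self-contained sketch using an in-memory SQLite database. The table and column names are illustrative, not ScreenPipe's actual schema:

```python
import sqlite3

# Minimal sketch of FTS5 full-text search. The table name "transcripts"
# and its columns are assumptions for illustration, not ScreenPipe's schema.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(transcription, timestamp)"
)
con.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("let's move the retro to thursday", "2026-05-12T09:00:00"),
        ("the RAID array resync finished overnight", "2026-05-12T10:30:00"),
    ],
)

# MATCH uses the inverted index, so this stays fast even at millions of rows.
rows = con.execute(
    "SELECT timestamp, transcription FROM transcripts WHERE transcripts MATCH ?",
    ("retro",),
).fetchall()
print(rows)
```

The same `MATCH` query shape is what makes "find that phrase from three weeks ago" an index lookup rather than a full table scan.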
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
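The buffer-queue-commit flow above can be modeled as a tiny producer/consumer pipeline. This is a toy sketch, not ScreenPipe's actual implementation; the chunk strings stand in for audio buffers:

```python
import queue
import threading

# Toy model of the WIP stage: the capture side pushes fixed-length audio
# chunks onto a queue; a transcriber thread drains it and "commits" results.
chunks = queue.Queue()
committed = []

def transcriber():
    while True:
        chunk = chunks.get()
        if chunk is None:               # sentinel: capture stopped
            break
        committed.append(f"transcript of {chunk}")

t = threading.Thread(target=transcriber)
t.start()
for i in range(3):
    chunks.put(f"chunk-{i}")            # backlog builds up here under load
chunks.put(None)
t.join()
print(committed)
```

The queue is exactly the "work in progress" state: anything sitting in it has been heard but not yet transcribed and committed.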
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: this is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: if you see rapidly changing files, temporary chunks, or locked database journals (SQLite sidecar files such as -wal or -journal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
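One way to answer "what has been done and what hasn't" directly is to compare the media files on disk against the file paths the database already references. This is a hedged sketch: the `audio_transcriptions` table and `file_path` column are assumptions about the schema, and the demo runs against throwaway temp files rather than your real ~/.screenpipe directory:

```python
import os
import sqlite3
import tempfile

def split_processed(data_dir, db_path):
    """Return (processed, pending) .mp4 files, judged by whether the
    database references them. Assumes a `file_path` column in an
    `audio_transcriptions` table, which may differ from the real schema."""
    con = sqlite3.connect(db_path)
    referenced = {
        os.path.basename(p)
        for (p,) in con.execute("SELECT file_path FROM audio_transcriptions")
    }
    on_disk = {f for f in os.listdir(data_dir) if f.endswith(".mp4")}
    return sorted(on_disk & referenced), sorted(on_disk - referenced)

# Demo with a throwaway directory and database.
tmp = tempfile.mkdtemp()
for name in ("mic_done.mp4", "mic_pending.mp4"):
    open(os.path.join(tmp, name), "w").close()
db = os.path.join(tmp, "db.sqlite")
con = sqlite3.connect(db)
con.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")
con.execute("INSERT INTO audio_transcriptions VALUES (?)",
            (os.path.join(tmp, "mic_done.mp4"),))
con.commit()

processed, pending = split_processed(tmp, db)
print(processed, pending)
```

Files that exist on disk but have no database row are your WIP backlog; files with a row have been transcribed and committed.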
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?

Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
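Seeking to "the exact second the keyword was spoken" only needs two timestamps: the recording's start time, which these filenames encode, and the absolute timestamp stored next to the transcript row. A sketch of that arithmetic (the filename convention is taken from the files listed above; the database timestamp here is a hand-picked example):

```python
import re
from datetime import datetime

def seek_offset_seconds(media_filename, phrase_timestamp):
    """Derive the playback seek offset from the start time encoded in a
    ScreenPipe-style filename and the phrase's absolute timestamp."""
    m = re.search(r"_(\d{4}-\d{2}-\d{2})_(\d{2})-(\d{2})-(\d{2})\.mp4$",
                  media_filename)
    start = datetime.strptime(
        f"{m.group(1)} {m.group(2)}:{m.group(3)}:{m.group(4)}",
        "%Y-%m-%d %H:%M:%S",
    )
    return (phrase_timestamp - start).total_seconds()

offset = seek_offset_seconds(
    "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4",
    datetime(2026, 5, 12, 7, 42, 3),  # hypothetical transcript timestamp
)
print(offset)  # 75.0, i.e. seek 1m15s into the clip
```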
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob-storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or when the folder hits a specific GB limit).
Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
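The archive-and-symlink idea above can be sketched as a small script. This is a hedged illustration under stated assumptions: the age threshold, the archive destination (your NAS mount in practice), and the directory paths are all placeholders, and the demo runs against throwaway temp directories, never your real data:

```python
import os
import shutil
import tempfile
import time

def archive_old_media(data_dir, archive_dir, max_age_days=30):
    """Move .mp4 files older than the cutoff into archive_dir and leave a
    symlink behind, so database file-path references keep resolving."""
    cutoff = time.time() - max_age_days * 86400
    for name in os.listdir(data_dir):
        src = os.path.join(data_dir, name)
        if not (name.endswith(".mp4") and os.path.isfile(src)):
            continue
        if os.path.islink(src) or os.path.getmtime(src) > cutoff:
            continue                     # already archived, or still fresh
        dst = os.path.join(archive_dir, name)
        shutil.move(src, dst)
        os.symlink(dst, src)             # old path still works for playback

# Demo with throwaway directories standing in for data/ and the NAS mount.
data_dir, archive_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
old = os.path.join(data_dir, "System Audio (output)_2026-04-01_06-00-00.mp4")
open(old, "w").close()
os.utime(old, (0, 0))                    # pretend the file is ancient
archive_old_media(data_dir, archive_dir)
print(os.path.islink(old))
```

Because the symlink preserves the original path, playback lookups through db.sqlite keep working, just served from the NAS instead of the local SSD.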
You said:
Yes, I will do that. Is there a way to set up languages for transcription?
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
6321079244253590506
|
9212448450938133143
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
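The chunking step described above can be sketched in a few lines. The 16 kHz mono 16-bit capture format and the 30-second chunk length are assumptions for illustration, not ScreenPipe's documented settings:

```python
# Sketch of splitting a continuous PCM byte stream into fixed-length chunks,
# as a capture layer must do before handing audio to a transcriber.
# 16 kHz mono 16-bit and 30 s chunks are assumptions, not ScreenPipe's spec.
SAMPLE_RATE = 16_000
BYTES_PER_SAMPLE = 2

def chunk_stream(raw: bytes, seconds: int = 30):
    # One chunk = seconds worth of samples; the last chunk may be shorter.
    step = SAMPLE_RATE * BYTES_PER_SAMPLE * seconds
    return [raw[i:i + step] for i in range(0, len(raw), step)]

stream = b"\x00" * (SAMPLE_RATE * BYTES_PER_SAMPLE * 65)  # 65 s of silence
chunks = chunk_stream(stream)
print([len(c) // (SAMPLE_RATE * BYTES_PER_SAMPLE) for c in chunks])  # [30, 30, 5]
```

In a real recorder this loop would run continuously, emitting each chunk to the transcription queue as soon as it fills.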
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
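The FTS5 mechanism can be demonstrated with Python's built-in sqlite3 module (provided your SQLite build includes FTS5). The transcripts table and its columns here are invented for the demo and are not ScreenPipe's actual schema:

```python
import sqlite3

# Minimal sketch of FTS5-backed transcript search.
# Table/column names are hypothetical, not ScreenPipe's real schema.
def build_index(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
    )
    conn.executemany("INSERT INTO transcripts(text, timestamp) VALUES (?, ?)", rows)
    return conn

def search(conn, phrase):
    # MATCH uses the FTS5 index, so this stays fast over months of audio.
    cur = conn.execute(
        "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ? "
        "ORDER BY rank",
        (phrase,),
    )
    return cur.fetchall()

conn = build_index([
    ("let's ship the quarterly report on Friday", "2026-05-12T09:00:00"),
    ("lunch plans for tomorrow", "2026-05-12T12:00:00"),
])
print(search(conn, "quarterly report"))
```

The same MATCH query shape is what makes "find that phrase from three weeks ago" an index lookup instead of a scan over every stored transcript.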
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
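The buffer, queue, and commit steps above can be sketched as a tiny pipeline. Here transcribe() is a placeholder for the local Whisper call, and the audio_transcriptions table is a guess at the schema, used purely for illustration:

```python
import queue
import sqlite3

def transcribe(chunk: bytes) -> str:
    # Stand-in for the local Whisper call; the real step is CPU/GPU-bound.
    return f"[{len(chunk)} bytes transcribed]"

def run_pipeline(timed_chunks):
    """timed_chunks: iterable of (unix_ts, raw_audio_bytes) pairs."""
    backlog = queue.Queue()          # the WIP stage: chunks waiting for the engine
    for item in timed_chunks:
        backlog.put(item)            # capture is fast, so this queue can grow

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE audio_transcriptions (ts INTEGER, text TEXT)")
    while not backlog.empty():       # the engine drains the queue as resources allow
        ts, chunk = backlog.get()
        db.execute("INSERT INTO audio_transcriptions VALUES (?, ?)",
                   (ts, transcribe(chunk)))
    db.commit()
    return db

db = run_pipeline([(1778740340, b"\x00" * 16000), (1778740370, b"\x00" * 16000)])
print(db.execute("SELECT COUNT(*) FROM audio_transcriptions").fetchone()[0])  # 2
```

The key property is that capture and transcription are decoupled by the queue: during a rapid multi-person conversation the queue depth grows, and it shrinks again once the engine catches up.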
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media folders (such as data): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
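Those timestamps make seeking concrete: each filename encodes its recording start time (the _YYYY-MM-DD_HH-MM-SS suffix visible in this folder), so the playback offset for an event is just a subtraction. A sketch, assuming that filename convention holds:

```python
from datetime import datetime

# Compute how far into a recording to seek for an event, given the
# "_YYYY-MM-DD_HH-MM-SS.mp4" filename suffix seen in ~/.screenpipe/data/.
def seek_offset_seconds(filename: str, event_time: datetime) -> float:
    stem = filename.rsplit(".", 1)[0]
    date_part, time_part = stem.split("_")[-2:]   # last two "_"-separated fields
    started = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H-%M-%S")
    return (event_time - started).total_seconds()

offset = seek_offset_seconds(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    datetime(2026, 5, 11, 6, 20, 14),
)
print(offset)  # 180.0
```

A playback UI would hand this offset to the media player so the audio starts at the moment the searched keyword was spoken.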
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
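The soft-foreign-key pattern can be sketched with Python's built-in sqlite3 module. The column names used here (timestamp, transcription, file_path) are assumptions for illustration; the real Screenpipe schema may differ, so inspect your db.sqlite first.

```python
import sqlite3

# Minimal mock of an audio_transcriptions-style table. Column names are
# assumed, not Screenpipe's actual schema.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE audio_transcriptions (
        id INTEGER PRIMARY KEY,
        timestamp TEXT,
        transcription TEXT,
        file_path TEXT   -- soft FK: a plain path into ~/.screenpipe/data/
    )
""")
con.execute(
    "INSERT INTO audio_transcriptions (timestamp, transcription, file_path)"
    " VALUES (?, ?, ?)",
    ("2026-05-12T07:42:18", "let's move the backlog to the NAS",
     "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4"),
)

# A text search returns the media file to replay, not just the transcript.
row = con.execute(
    "SELECT file_path, timestamp FROM audio_transcriptions"
    " WHERE transcription LIKE ?",
    ("%NAS%",),
).fetchone()
print(row)
```

Nothing enforces the link: the database stores the path as plain text, which is exactly why deleting the files never breaks the database itself.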
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
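Because the links are soft, you can audit them yourself: compare the paths stored in the database against what is actually on disk. This sketch again assumes a file_path column with names relative to the data directory.

```python
import sqlite3
import tempfile
from pathlib import Path

def find_orphaned_rows(con: sqlite3.Connection, data_dir: Path) -> list:
    """Return file_path values whose media file no longer exists on disk.
    Assumes an audio_transcriptions table with a file_path column holding
    names relative to data_dir (assumed schema, for illustration)."""
    orphans = []
    for (file_path,) in con.execute("SELECT file_path FROM audio_transcriptions"):
        if not (data_dir / file_path).exists():
            orphans.append(file_path)
    return orphans

# Demo: one row, checked against an empty directory, so it shows up as orphaned.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")
con.execute("INSERT INTO audio_transcriptions VALUES "
            "('System Audio (output)_2026-05-11_06-17-14.mp4')")
with tempfile.TemporaryDirectory() as d:
    orphans = find_orphaned_rows(con, Path(d))
print(orphans)
```

Rows this reports will still show up in search results; only playback for them is broken.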
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
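The move-then-symlink idea could look roughly like the sketch below. The function, its name, and the age threshold are all hypothetical; in a real cron job you would point data_dir at ~/.screenpipe/data and nas_dir at your mounted NAS share (and symlinks only help if the NAS is mounted when you play back).

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

def archive_old_media(data_dir: Path, nas_dir: Path, max_age_days: int = 30) -> list:
    """Move .mp4 files older than max_age_days from data_dir to nas_dir,
    leaving a symlink behind so db.sqlite's stored paths keep resolving.
    Hypothetical helper, for illustration only."""
    cutoff = time.time() - max_age_days * 86400
    moved = []
    for f in list(data_dir.glob("*.mp4")):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue  # already archived, or still recent
        dest = nas_dir / f.name
        shutil.move(str(f), str(dest))
        os.symlink(dest, f)  # the DB's local path still works via the link
        moved.append(dest)
    return moved

# Demo with throwaway directories and a file backdated to 1970.
data_dir, nas_dir = Path(tempfile.mkdtemp()), Path(tempfile.mkdtemp())
clip = data_dir / "System Audio (output)_2026-05-11_06-17-14.mp4"
clip.write_bytes(b"fake media")
os.utime(clip, (0, 0))  # pretend the clip is very old
moved = archive_old_media(data_dir, nas_dir)
print(clip.is_symlink())  # True: the local path now points at the NAS copy
```

Skipping files that are already symlinks makes the job safe to re-run on a schedule.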
You said
Yes, I will do that. Is there a way to set up languages for transcription?
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":23,"bounds":{"left":0.11801862,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.12566489,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.14029256,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.16023937,"top":0.100159615,"width":0.15026596,"height":0.03830806},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.1009577,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
8965951776467902919
|
8487585703635086293
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
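The FTS5 lookup described above can be illustrated with a tiny self-contained sketch. The table and column names here are illustrative, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory database standing in for ScreenPipe's db.sqlite.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, file_path UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("let's move the deadline to next Friday",
         "System Audio (output)_2026-05-11_06-17-14.mp4"),
        ("the RAID array finished rebuilding overnight",
         "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"),
    ],
)

# Full-text search: FTS5 tokenizes the text, so a single keyword
# finds the whole sentence, plus the media file linked to it.
rows = conn.execute(
    "SELECT text, file_path FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",),
).fetchall()
print(rows[0][1])  # the media file to open for playback
```

This requires an SQLite build with the FTS5 extension enabled, which is the default in most Python distributions.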
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
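The buffer → queue → commit flow can be sketched in a few lines. The `transcribe` function below is a stand-in for the local Whisper call, not ScreenPipe's actual code:

```python
import queue

def transcribe(chunk: bytes) -> str:
    # Stand-in for the local Whisper call; assumes one text result per chunk.
    return f"transcript of {len(chunk)} bytes"

wip = queue.Queue()   # the "work in progress" stage
committed = []        # stands in for rows committed to db.sqlite

# Capture side: chunks can pile up faster than the CPU transcribes them.
for chunk in (b"chunk-a", b"chunk-bb", b"chunk-ccc"):
    wip.put(chunk)

# Processing side: drain the queue and commit the text, while the raw
# audio stays on disk for later playback.
while not wip.empty():
    committed.append(transcribe(wip.get()))

print(committed)
```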
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: if you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
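One way to spot the WIP stage yourself is to look for live SQLite journal files next to the database. This is a heuristic sketch: the `-wal`/`-journal` suffixes are standard SQLite conventions, not ScreenPipe-specific names, and the demo uses a throwaway directory instead of the real ~/.screenpipe:

```python
import tempfile
from pathlib import Path

def processing_backlog(data_dir: Path) -> bool:
    """Heuristic: a write-ahead log or rollback journal next to the
    database suggests the engine is mid-transaction (the WIP stage)."""
    return any(data_dir.glob("*.sqlite-wal")) or any(data_dir.glob("*.sqlite-journal"))

demo = Path(tempfile.mkdtemp())
(demo / "db.sqlite").touch()
print(processing_backlog(demo))   # no journal present yet
(demo / "db.sqlite-wal").touch()
print(processing_backlog(demo))   # active write detected
```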
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
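The seek math behind "playing it at the exact second" can be sketched from the filenames alone: each chunk's name encodes its start time, so the offset into the file is the keyword's timestamp minus that start. This is an assumption for illustration; ScreenPipe may instead store exact offsets in the database:

```python
from datetime import datetime

def seek_offset(filename: str, spoken_at: datetime) -> float:
    """Seconds into the media file where the keyword was spoken,
    parsing the chunk start time from a name like
    'soundcore AeroClip (input)_2026-05-12_07-40-48.mp4'."""
    stamp = filename.rsplit(".", 1)[0][-19:]   # "2026-05-12_07-40-48"
    start = datetime.strptime(stamp, "%Y-%m-%d_%H-%M-%S")
    return (spoken_at - start).total_seconds()

offset = seek_offset(
    "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4",
    datetime(2026, 5, 12, 7, 41, 30),
)
print(offset)  # 42.0 seconds into the chunk
```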
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
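Because those links are soft (plain text paths with no enforced constraint), you can audit them yourself. A sketch with an illustrative schema rather than ScreenPipe's actual table layout:

```python
import sqlite3
import tempfile
from pathlib import Path

# Throwaway directory standing in for ~/.screenpipe/data/.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4").touch()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_transcriptions (text TEXT, file_path TEXT)")
conn.executemany(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    [
        ("kept recording",
         "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4"),
        ("deleted recording",
         "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"),
    ],
)

# Rows whose media file has gone missing: playback would fail for these,
# because the "foreign key" is just a path string.
orphans = [
    path
    for (path,) in conn.execute("SELECT file_path FROM audio_transcriptions")
    if not (data_dir / path).exists()
]
print(orphans)
```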
What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
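The archive-and-symlink idea can be sketched like this. The directories here are placeholders (a real job would target the NAS mount point), and the 30-day cutoff is just an example:

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

def archive_old_media(local: Path, nas: Path, max_age_days: int = 30) -> None:
    """Move media older than max_age_days to the NAS and leave a symlink
    behind, so the paths stored in db.sqlite keep resolving."""
    cutoff = time.time() - max_age_days * 86400
    for f in local.glob("*.mp4"):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue                      # already archived, or still fresh
        target = nas / f.name
        shutil.move(str(f), str(target))  # works across filesystems
        f.symlink_to(target)              # database path still works

# Demo with throwaway directories standing in for the SSD and the RAID array.
local, nas = Path(tempfile.mkdtemp()), Path(tempfile.mkdtemp())
old = local / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.touch()
os.utime(old, (0, 0))                     # backdate the mtime so it qualifies
archive_old_media(local, nas)
print(old.is_symlink(), (nas / old.name).exists())
```

Note that this only makes sense if the NAS is mounted whenever you play back old audio; otherwise the symlinks dangle exactly like deleted files.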
You said
Yes, I will do that. Is there a way to set up languages for transcription?
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
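The FTS5 index is what makes that instant search possible: a MATCH query consults a token index instead of scanning every row. A minimal sketch of the idea (the table and column names here are illustrative, not ScreenPipe's actual schema):

```python
import sqlite3

# Illustrative FTS5 demo; "transcripts" is a made-up table, not ScreenPipe's schema.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)")
con.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("let's move the retro to Thursday", "2026-05-12T09:00:00Z"),
)
con.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("the RAID array finished resilvering", "2026-05-12T10:30:00Z"),
)
# MATCH uses the full-text index rather than a LIKE table scan.
rows = con.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?", ("retro",)
).fetchall()
print(rows)  # only the Thursday row matches
```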
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
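The three steps above can be sketched as a toy producer/consumer pipeline. Everything here is a stand-in: `fake_transcribe` plays the role of Whisper, and a list plays the role of the SQLite commit; none of this is ScreenPipe's real code.

```python
import queue

wip = queue.Queue()  # the "processing queue" (WIP stage)
committed = []       # stands in for rows committed to SQLite

def fake_transcribe(chunk: bytes) -> str:
    # Placeholder for the Whisper call, which is the slow, CPU/GPU-bound step.
    return f"transcript of {len(chunk)} bytes"

# Capture side: audio chunks land in the queue as fast as they are recorded.
for chunk in (b"\x00" * 1600, b"\x00" * 3200):
    wip.put(chunk)

# Worker side: drain the backlog, "transcribe", commit with a timestamp.
while not wip.empty():
    chunk = wip.get()
    committed.append(("2026-05-12T09:00:00Z", fake_transcribe(chunk)))

print(len(committed))  # 2
```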
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media/data folders (e.g., ~/.screenpipe/data/):
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
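A hypothetical way to check for that WIP signal from a script. It relies only on standard SQLite behavior: a database in WAL mode keeps a `<db>-wal` file (or `<db>-journal` in rollback mode) beside it while a writer holds it open. The demo runs on a throwaway directory rather than the live ~/.screenpipe.

```python
import tempfile
from pathlib import Path

def is_processing(root: Path) -> bool:
    # SQLite keeps "<db>-wal" (WAL mode) or "<db>-journal" (rollback mode)
    # next to the database while it is open for writing.
    return any(root.glob("db.sqlite-wal")) or any(root.glob("db.sqlite-journal"))

# Demo on a scratch directory standing in for ~/.screenpipe:
tmp = Path(tempfile.mkdtemp())
quiet = is_processing(tmp)            # False: no journal files present
(tmp / "db.sqlite-wal").touch()
busy = is_processing(tmp)             # True: a writer appears active
print(quiet, busy)
```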
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
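A sketch of that soft-foreign-key lookup. The table name audio_transcriptions comes from the answer above, but the column names (file_path, audio_chunk_id) are assumptions for illustration; inspect your own db.sqlite with `.schema` to confirm the real layout.

```python
import sqlite3

# Assumed schema for illustration only; not guaranteed to match ScreenPipe's.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE audio_chunks (id INTEGER PRIMARY KEY, file_path TEXT);
CREATE TABLE audio_transcriptions (
    audio_chunk_id INTEGER REFERENCES audio_chunks(id),
    transcription TEXT,
    timestamp TEXT
);
INSERT INTO audio_chunks VALUES
    (1, 'soundcore AeroClip (input)_2026-05-12_07-40-48.mp4');
INSERT INTO audio_transcriptions VALUES
    (1, 'status update on the NAS migration', '2026-05-12T07:41:02Z');
""")
# Resolve a transcript back to the media file that can replay it.
row = con.execute("""
    SELECT t.transcription, c.file_path
    FROM audio_transcriptions t
    JOIN audio_chunks c ON c.id = t.audio_chunk_id
""").fetchone()
print(row[1])  # the .mp4 filename the transcript points at
```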
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
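A hedged sketch of that move-and-symlink idea. The paths are throwaway examples (the demo uses temp directories, not the live ~/.screenpipe), and a real cron job would filter by file age before touching anything.

```python
import shutil
import tempfile
from pathlib import Path

def archive_with_symlink(local_file: Path, archive_dir: Path) -> Path:
    # Move the media file to the archive, then leave a symlink at the old
    # path so database references keep resolving.
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / local_file.name
    shutil.move(str(local_file), target)
    local_file.symlink_to(target)
    return target

# Demo with scratch directories standing in for ~/.screenpipe/data and a NAS mount:
root = Path(tempfile.mkdtemp())
local = root / "System Audio (output)_2026-05-11_06-17-14.mp4"
local.write_bytes(b"fake media")
archived = archive_with_symlink(local, root / "nas")
print(local.is_symlink())  # True: old path still resolves, via the link
```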
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
6787881506923938373
|
9138142493060291223
|
visual_change
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said:
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there a work-in-progress stage? And how can I tell from the folder structure what has and hasn't been processed?

Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
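The chunking step described above can be sketched in a few lines of Python. The 30-second chunk length and 16 kHz sample rate here are illustrative assumptions, not ScreenPipe's actual parameters:

```python
# Sketch of cutting a continuous sample stream into fixed-duration
# segments before transcription. Names and sizes are illustrative.
SAMPLE_RATE = 16_000           # samples per second (Whisper's native rate)
CHUNK_SECONDS = 30             # one hypothetical transcription unit
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples):
    """Yield fixed-size chunks from a continuous sequence of samples."""
    for start in range(0, len(samples), CHUNK_SAMPLES):
        yield samples[start:start + CHUNK_SAMPLES]

# Example: 65 seconds of silent "audio" becomes three chunks (30s, 30s, 5s).
fake_audio = [0.0] * (SAMPLE_RATE * 65)
chunks = list(chunk_stream(fake_audio))
print([len(c) // SAMPLE_RATE for c in chunks])  # → [30, 30, 5]
```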
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
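The FTS5 mechanism described above can be demonstrated with Python's built-in sqlite3 module (assuming your Python build ships with FTS5 enabled, which most do). The table and column names here are illustrative, not ScreenPipe's real schema:

```python
import sqlite3

# Minimal FTS5 demo: index two transcript rows, then MATCH a keyword.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
con.execute("INSERT INTO transcripts VALUES (?, ?)",
            ("let's move the deadline to Friday", "2026-05-12T10:00:00"))
con.execute("INSERT INTO transcripts VALUES (?, ?)",
            ("lunch order for the team", "2026-05-12T12:00:00"))

# MATCH uses the full-text index instead of scanning every row.
rows = con.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",)).fetchall()
print(rows)  # → [('2026-05-12T10:00:00', "let's move the deadline to Friday")]
```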
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
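The buffer-queue-finalize flow above can be modeled as a tiny producer/consumer sketch. The thread, queue, and stand-in "transcribe" step are illustrative, not ScreenPipe's implementation:

```python
import queue
import threading

# Toy WIP stage: capture enqueues chunks; a worker drains the backlog.
work = queue.Queue()
transcripts = []

def transcriber():
    while True:
        chunk = work.get()
        if chunk is None:                         # sentinel: no more audio
            break
        transcripts.append(f"text for {chunk}")   # stand-in for Whisper
        work.task_done()

t = threading.Thread(target=transcriber)
t.start()
for i in range(5):                                # capture side: raw chunks
    work.put(f"chunk-{i}")
work.put(None)
t.join()
print(len(transcripts))  # → 5
```

If capture outpaces the worker, the queue simply grows, which is exactly the backlog you would observe during a rapid multi-person conversation.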
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders (such as ~/.screenpipe/data/): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunk files, or locked database journals (such as a -wal or -journal file next to db.sqlite), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
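A minimal check for the "still processing" signal described above: SQLite leaves a -wal (write-ahead log) or -journal file beside a database it is actively writing. Treat this heuristic and the db.sqlite filename as assumptions about the layout, not an official status check:

```python
import tempfile
from pathlib import Path

# Heuristic: a SQLite journal next to the database means writes in flight.
def looks_busy(screenpipe_dir) -> bool:
    root = Path(screenpipe_dir)
    return any(root.glob("db.sqlite-wal")) or any(root.glob("db.sqlite-journal"))

# Demo against a throwaway directory instead of the real ~/.screenpipe:
with tempfile.TemporaryDirectory() as d:
    print(looks_busy(d))                      # → False (no journal yet)
    (Path(d) / "db.sqlite-wal").touch()
    print(looks_busy(d))                      # → True (journal present)
```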
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?

Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
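The "soft foreign key" idea can be sketched with an in-memory table: transcript rows store a path string plus an offset, and playback resolves the file at read time. The schema below is illustrative, not ScreenPipe's actual tables:

```python
import sqlite3

# Illustrative schema: transcript text + path-as-string + seek offset.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE audio_transcriptions (
                   transcription  TEXT,
                   file_path      TEXT,
                   offset_seconds REAL)""")
con.execute("INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
            ("see you tomorrow",
             "data/System Audio (output)_2026-05-11_06-17-14.mp4", 83.2))

# Find where a phrase lives so a player could seek to that exact second.
row = con.execute(
    "SELECT file_path, offset_seconds FROM audio_transcriptions "
    "WHERE transcription LIKE ?", ("%tomorrow%",)).fetchone()
print(row)
```

Because file_path is just a string, nothing in the database enforces that the file still exists, which is exactly why deleting the media breaks playback but not search.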
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
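A hedged sketch of that archive-and-symlink job: move .mp4 files older than a threshold to an archive location and leave symlinks behind so the paths stored in db.sqlite keep resolving. The directory names and 30-day threshold are assumptions for illustration:

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

def archive_old_media(data_dir: Path, archive_dir: Path, max_age_s: float) -> int:
    """Move stale .mp4 files to archive_dir and symlink them back."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = 0
    for f in data_dir.glob("*.mp4"):
        if f.is_symlink():                          # already archived
            continue
        if time.time() - f.stat().st_mtime > max_age_s:
            dest = archive_dir / f.name
            shutil.move(str(f), dest)               # relocate to the NAS mount
            f.symlink_to(dest)                      # keep the original path valid
            moved += 1
    return moved

# Demo with throwaway dirs standing in for ~/.screenpipe/data and the NAS:
with tempfile.TemporaryDirectory() as data, tempfile.TemporaryDirectory() as nas:
    old = Path(data) / "System Audio (output)_2026-05-11_06-17-14.mp4"
    old.write_bytes(b"fake media")
    os.utime(old, (time.time() - 90 * 86400,) * 2)  # pretend it's 90 days old
    print(archive_old_media(Path(data), Path(nas), 30 * 86400))  # → 1
    print(old.is_symlink())                          # → True
```

In a real cron job you would point data_dir at ~/.screenpipe/data/ and archive_dir at the NAS mount, and verify the mount is up before moving anything.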
You said:
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said:
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO 639-1 language codes: sk (Slovak), bg (Bulgarian), en (English).
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute 
tab","depth":5,"bounds":{"left":0.011469414,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.020113032,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
Conversation with Gemini

You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?

Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.

ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:

- System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: everything you say into your mic.

Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
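[Editor's note] The chunking step described above can be sketched in a few lines of Python. The 30-second chunk length and the 16 kHz mono 16-bit PCM format are illustrative assumptions, not ScreenPipe's actual recording parameters.

```python
# Illustrative only: split a continuous PCM stream into fixed-length chunks,
# the way a 24/7 recorder hands audio to a transcription stage.
# Chunk length and audio format below are assumptions, not ScreenPipe's real values.

CHUNK_SECONDS = 30
SAMPLE_RATE = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM

def chunk_stream(pcm: bytes, chunk_seconds: int = CHUNK_SECONDS):
    """Yield successive fixed-duration slices of a raw PCM byte stream."""
    chunk_bytes = chunk_seconds * SAMPLE_RATE * BYTES_PER_SAMPLE
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]

# Example: 65 seconds of audio becomes three chunks (30 s + 30 s + 5 s).
stream = bytes(65 * SAMPLE_RATE * BYTES_PER_SAMPLE)
chunks = list(chunk_stream(stream))
```

The last chunk is simply shorter than the rest; a real recorder would hand each slice to the transcription queue as it fills.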
2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

- The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
- Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
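[Editor's note] The FTS5 indexing described above can be sketched with Python's built-in sqlite3 module. The table name audio_transcriptions is mentioned later in this conversation, but the column layout here is an assumption for illustration, not ScreenPipe's verified schema.

```python
import sqlite3

# Minimal FTS5 sketch: index a transcript, then full-text search it.
# Schema is illustrative, not ScreenPipe's actual one.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE VIRTUAL TABLE audio_transcriptions USING fts5(transcript, file_path, ts)"
)
db.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    ("let's move the retro meeting to Thursday",
     "System Audio (output)_2026-05-11_06-17-14.mp4",
     "2026-05-11T06:17:14"),
)

# Full-text search: find which file (and when) a phrase was heard.
row = db.execute(
    "SELECT file_path FROM audio_transcriptions WHERE audio_transcriptions MATCH ?",
    ("retro meeting",),
).fetchone()
```

This is exactly the property the answer describes: the text lives in the database, and the hit carries a pointer back to the media file it came from.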
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

- Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.

4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

- The SQLite database: this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
- The media folders: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- Temp files: if you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
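[Editor's note] The "what has been done" check from the original question can be scripted: compare the media files on disk against the file paths the database has indexed. The table and column names (audio_transcriptions, file_path) are assumptions based on this conversation, not a verified ScreenPipe schema, and the demo uses throwaway temp paths rather than ~/.screenpipe.

```python
import os
import sqlite3
import tempfile

def split_done_pending(data_dir: str, db_path: str):
    """Return (done, pending): media files with vs. without a DB transcript row."""
    con = sqlite3.connect(db_path)
    indexed = {r[0] for r in con.execute("SELECT file_path FROM audio_transcriptions")}
    con.close()
    media = sorted(f for f in os.listdir(data_dir) if f.endswith(".mp4"))
    done = [f for f in media if f in indexed]
    pending = [f for f in media if f not in indexed]
    return done, pending

# Tiny demo with a throwaway directory and database.
tmp = tempfile.mkdtemp()
for name in ("a.mp4", "b.mp4"):
    open(os.path.join(tmp, name), "w").close()
db_path = os.path.join(tmp, "db.sqlite")
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")
con.execute("INSERT INTO audio_transcriptions VALUES ('a.mp4')")
con.commit()
con.close()

done, pending = split_done_pending(tmp, db_path)
```

Files listed in `pending` correspond to the WIP stage described above: captured on disk but not yet committed to the master ledger.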
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.

Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.

When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.

How the Database Views Them

If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.

What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

- Built-in garbage collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
- Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
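[Editor's note] The archive-and-symlink idea above can be sketched in Python: move a media file to "NAS" storage and leave a symlink at the original path so database references keep resolving. The demo uses throwaway temp directories; a real cron job would point local_dir at ~/.screenpipe/data and archive_dir at the NAS mount, and would select files by age first.

```python
import os
import shutil
import tempfile

def archive_with_symlink(src: str, archive_dir: str) -> str:
    """Move src into archive_dir and replace it with a symlink to the new location."""
    dest = os.path.join(archive_dir, os.path.basename(src))
    shutil.move(src, dest)
    os.symlink(dest, src)  # old path still resolves, so DB references stay valid
    return dest

# Demo with throwaway directories standing in for local SSD and NAS.
local_dir = tempfile.mkdtemp()
archive_dir = tempfile.mkdtemp()
src = os.path.join(local_dir, "System Audio (output)_2026-05-11_06-17-14.mp4")
with open(src, "w") as f:
    f.write("fake media bytes")

archive_with_symlink(src, archive_dir)
```

After the move, reading the original path follows the symlink to the archived copy, which is the property that keeps playback working.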
You said
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.

As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.

How the Transcription Engine Handles Language

By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.

- The advantage: if you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
- The drawback: auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.

How to Force a Specific Language

If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.

If you are using the Desktop App UI:
1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for
the","depth":29,"bounds":{"left":0.09142287,"top":0.8607342,"width":0.030585106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"bounds":{"left":0.12200798,"top":0.8607342,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.17037898,"top":0.8607342,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"bounds":{"left":0.17819148,"top":0.8607342,"width":0.061502658,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"bounds":{"left":0.23969415,"top":0.8607342,"width":0.027260639,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"bounds":{"left":0.09142287,"top":0.8902634,"width":0.20079787,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"bounds":{"left":0.0787899,"top":0.92378294,"width":0.116023935,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found 
in","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.15159574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.23238032,"top":0.94573027,"width":0.064328454,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.234375,"height":0.05546689},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.020777926,"height":-0.015562654},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.027925532,"height":-0.04509175},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.02244016,"height":-0.07462096},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
-8480123855798478377
|
8632611289480981461
|
visual_change
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does the audio get transferred and transcribed? Is there a work-in-progress stage? And how can I tell from the folder structure what has been processed and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
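To make the chunking idea concrete, here is a minimal sketch of splitting a continuous recording into fixed-length segments. The 30-second chunk size and the function name are illustrative assumptions, not ScreenPipe's actual internals:

```python
# Illustrative only: ScreenPipe's real chunk length and logic are internal.
CHUNK_SECONDS = 30

def chunk_boundaries(total_seconds: int, chunk_seconds: int = CHUNK_SECONDS):
    """Yield (start, end) second offsets covering the whole recording."""
    for start in range(0, total_seconds, chunk_seconds):
        yield start, min(start + chunk_seconds, total_seconds)

# A 75-second recording becomes three chunks, the last one shorter.
print(list(chunk_boundaries(75)))  # → [(0, 30), (30, 60), (60, 75)]
```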
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
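FTS5 is a standard SQLite extension, so the search mechanism can be demonstrated with a throwaway in-memory database. The table name and columns below are invented for the demo; ScreenPipe's real schema differs:

```python
import sqlite3

# Throwaway in-memory DB; table/column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("we should migrate the archive to the RAID array", "2026-05-12T07:40:48"),
)
# MATCH runs a full-text query against the indexed text column.
rows = conn.execute(
    "SELECT timestamp FROM transcripts WHERE transcripts MATCH 'archive'"
).fetchall()
print(rows)
```

This is the same query shape that makes "find that phrase from three weeks ago" effectively instant: FTS5 keeps an inverted index, so the lookup does not scan every transcript row.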
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
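The buffer, queue, and commit steps above can be modeled with a plain Python queue. Everything here (chunk names, the transcription stub) is illustrative, not ScreenPipe code:

```python
import queue

# Capture side enqueues raw chunks; the engine drains the backlog later.
backlog = queue.Queue()
for name in ["chunk_001.wav", "chunk_002.wav", "chunk_003.wav"]:
    backlog.put(name)  # WIP: chunks waiting for CPU/GPU time

transcribed = []
while not backlog.empty():
    chunk = backlog.get()
    # Stand-in for Whisper + the SQLite commit in the real pipeline.
    transcribed.append(chunk + " -> text committed")

print(len(transcribed))  # all queued chunks eventually get processed
```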
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
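As a rough heuristic, you could classify files in the directory by suffix. The temp-chunk naming here is my assumption; the .mp4/.jpg media names come from this conversation, and the -wal/-journal sidecar files are standard SQLite behavior:

```python
# Heuristic sketch; not an official ScreenPipe file-state API.
def classify(filename: str) -> str:
    if filename.endswith((".mp4", ".jpg")):
        return "done"         # committed media in the permanent archive
    if filename.endswith(("-wal", "-journal", ".tmp")):
        return "in-progress"  # journals/temp chunks imply active processing
    return "unknown"

files = ["System Audio (output)_2026-05-11_06-17-14.mp4", "db.sqlite-wal"]
print([classify(f) for f in files])  # ['done', 'in-progress']
```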
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
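A hypothetical two-table schema shows the lookup pattern: search the text, then join back to the media path and a seek offset. Column names like offset_secs are invented for illustration; the real db.sqlite schema may differ:

```python
import sqlite3

# Invented schema for the demo; real ScreenPipe tables differ in detail.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_chunks (id INTEGER PRIMARY KEY, file_path TEXT)")
conn.execute(
    "CREATE TABLE audio_transcriptions "
    "(chunk_id INTEGER, offset_secs REAL, transcript TEXT)"
)
conn.execute(
    "INSERT INTO audio_chunks VALUES (1, '/Users/lukas/.screenpipe/data/demo.mp4')"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (1, 42.5, 'discussing the RAID array')"
)

# Text search resolves to (media file, second to seek to) for playback.
row = conn.execute(
    "SELECT c.file_path, t.offset_secs FROM audio_transcriptions t "
    "JOIN audio_chunks c ON c.id = t.chunk_id "
    "WHERE t.transcript LIKE '%RAID%'"
).fetchone()
print(row)
```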
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
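If you ever want to audit this yourself, a small script can flag transcripts whose referenced media file no longer exists on disk. The helper below is a generic sketch, not part of ScreenPipe:

```python
import os
import tempfile

def find_orphans(referenced_paths):
    """Return db-referenced paths whose media file is missing on disk."""
    return [p for p in referenced_paths if not os.path.exists(p)]

# Demo with a throwaway directory standing in for ~/.screenpipe/data/.
with tempfile.TemporaryDirectory() as d:
    kept = os.path.join(d, "kept.mp4")
    open(kept, "w").close()                # this media file exists
    gone = os.path.join(d, "deleted.mp4")  # this one was rm'd
    orphans = find_orphans([kept, gone])
    print(orphans == [gone])  # True
```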
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job that regularly moves older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
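The move-and-symlink idea can be sketched in a few lines. Directory names here are stand-ins for your local data folder and the NAS mount; the key point is that the original path keeps resolving through the symlink, so the database's path references stay valid:

```python
import os
import shutil
import tempfile

def archive_with_symlink(local_path: str, nas_dir: str) -> str:
    """Move a media file to 'NAS' storage, leave a symlink at the old path."""
    target = os.path.join(nas_dir, os.path.basename(local_path))
    shutil.move(local_path, target)  # offload the heavy media file
    os.symlink(target, local_path)   # keep the original path resolvable
    return target

# Demo with temp dirs standing in for the local SSD and the NAS mount.
with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as nas:
    src = os.path.join(local, "meeting.mp4")
    with open(src, "w") as f:
        f.write("fake media")
    archive_with_symlink(src, nas)
    content = open(src).read()  # old path still opens via the symlink
print(content)  # fake media
```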
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
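As a sketch of the config edit, assuming a JSON key named "language" under an "audio" section (the actual pipe.json schema should be verified before relying on this):

```python
import json

# Stand-in for ~/.screenpipe/pipe.json; the key names are assumptions.
config = {"audio": {"language": "auto"}}

# Force Slovak transcription using its ISO 639-1 code.
config["audio"]["language"] = "sk"
serialized = json.dumps(config)
print(serialized)
```

After editing the real file you would restart the daemon so the new setting takes effect.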
Good response
Bad response...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38976
|
1442
|
19
|
2026-05-14T06:31:57.372632+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740317372_m2.jpg...
|
iTerm2
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
selectionViewTerminalWinaowmelt• screenpipe SSH: n selectionViewTerminalWinaowmelt• screenpipe SSH: nas — activity.pvmain.ov Mactivity.pyMXDv90onsumers› activity.py › ...det audio seqments(tor date: date None = None) → List dict str, Any :Start, end = date range(d)interactions.pyapp> DockernleWORKDIR /appQ copyrequirements.txt |RIN nin install --no-cache-dir -r requirements.txCOPY.14 Eypose дagq16 CMD ("uvicorn", "main:app",Dockerfile (Working Tree) Xscreenpipe_sync_files.shWORKDIR /apPcopy requirements.extCOPY.eypose gaddCMD ["uvicorn",Dally - Platorm • In 14m100% 152Inu 14 May 9.31:0/•OKvRefactor Sync Script+ .*•.tt-toolbar < display:flex; gap:10px; align-itens:center: flex-wrap:wrap; }solid var(--border); border-radius:8px; padding:12px 14px; displayThoughtsSELECTudio chunk id = c.idORDER BY t.timestampendfetchallreturn dictr for r in rowsi7) .map(s BThoughts}) -map(a => {8 db solite@ screenpipe.dbDebua Console(abash-aoo +v @ M ã .1^ xAll requested chanaes comnlete• Addedi.ttcaudio-sen.active CSS class with white inset bordeModified <› ttRenderAudioSeaments/) to hiahliaht audio seaments within 5 seconds of current nosition• Audio timeline now shows visual hiahliaht iust like video timelineAudio auto-olav OfF ov derault•Changed _ttAutoPlayAudio from true to false•Updated button text to "*. Auto-Play Audio: OFF"• Modified ‹/› ttShowAudioTranscription() to only play audio when auto-play is enabledAudin romaine dicabled on page refresh and navigationCollansible time rance filter.•Added collapsible " 7 Time Range Filter" section with From/To time inputs•Added ‹› ttToggleTimeFiltero.‹› ttApplyTimeFiltero,‹› ttCleartimeFilter functions• Updated ‹› ttRenderSeqments and‹› ttRenderAudioSeqments to filter segments by time range•Filter shows only segments within the specified time period (e.g.. 
9:45 to 10:30)srrenp ce-tep /6 .03 apt-get update s& apt-get install -y -no-instal l-recornands tfmoeo fonte-deiavincore ce e ere renneene nenenaAsk anvthindScreen Reader Optimized Ln 1, Col 1 Spaces: 4 UTF-8 LF Python 3.11.2 64-bit Teams Windsurf - Settings...
|
NULL
|
-5281057676389898944
|
NULL
|
click
|
ocr
|
NULL
|
selectionViewTerminalWinaowmelt• screenpipe SSH: n selectionViewTerminalWinaowmelt• screenpipe SSH: nas — activity.pvmain.ov Mactivity.pyMXDv90onsumers› activity.py › ...det audio seqments(tor date: date None = None) → List dict str, Any :Start, end = date range(d)interactions.pyapp> DockernleWORKDIR /appQ copyrequirements.txt |RIN nin install --no-cache-dir -r requirements.txCOPY.14 Eypose дagq16 CMD ("uvicorn", "main:app",Dockerfile (Working Tree) Xscreenpipe_sync_files.shWORKDIR /apPcopy requirements.extCOPY.eypose gaddCMD ["uvicorn",Dally - Platorm • In 14m100% 152Inu 14 May 9.31:0/•OKvRefactor Sync Script+ .*•.tt-toolbar < display:flex; gap:10px; align-itens:center: flex-wrap:wrap; }solid var(--border); border-radius:8px; padding:12px 14px; displayThoughtsSELECTudio chunk id = c.idORDER BY t.timestampendfetchallreturn dictr for r in rowsi7) .map(s BThoughts}) -map(a => {8 db solite@ screenpipe.dbDebua Console(abash-aoo +v @ M ã .1^ xAll requested chanaes comnlete• Addedi.ttcaudio-sen.active CSS class with white inset bordeModified <› ttRenderAudioSeaments/) to hiahliaht audio seaments within 5 seconds of current nosition• Audio timeline now shows visual hiahliaht iust like video timelineAudio auto-olav OfF ov derault•Changed _ttAutoPlayAudio from true to false•Updated button text to "*. Auto-Play Audio: OFF"• Modified ‹/› ttShowAudioTranscription() to only play audio when auto-play is enabledAudin romaine dicabled on page refresh and navigationCollansible time rance filter.•Added collapsible " 7 Time Range Filter" section with From/To time inputs•Added ‹› ttToggleTimeFiltero.‹› ttApplyTimeFiltero,‹› ttCleartimeFilter functions• Updated ‹› ttRenderSeqments and‹› ttRenderAudioSeqments to filter segments by time range•Filter shows only segments within the specified time period (e.g.. 
9:45 to 10:30)srrenp ce-tep /6 .03 apt-get update s& apt-get install -y -no-instal l-recornands tfmoeo fonte-deiavincore ce e ere renneene nenenaAsk anvthindScreen Reader Optimized Ln 1, Col 1 Spaces: 4 UTF-8 LF Python 3.11.2 64-bit Teams Windsurf - Settings...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38974
|
1442
|
18
|
2026-05-14T06:31:54.743082+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740314743_m2.jpg...
|
iTerm2
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
selectionViewTerminalWinaowmelt• screenpipe SSH: n selectionViewTerminalWinaowmelt• screenpipe SSH: nas — activity.pvmain.ov Mactivity.pyMX• screenpipe [SSH: nas)- #recycleV tr appDv"0onsumers› activity.py › ...det audio seqments(tor date: date None = None) → List dict str, Any :Start, end = date range(d)interactions.pyapp> DockernleWORKDIR /appQ copyrequirements.txt |RIN nin install --no-cache-dir -r requirements.txCOPY.14 Eypose дagq16 CMD ("uvicorn", "main:app",Dockerfile (Working Tree) Xscreenpipe_sync_files.shWORKDIR /apPcopy requirements.extCOPY.eypose gaddCMD ["uvicorn",Dally - Platorm • In 14m100% 152Inu 14 May 9:31:04•OKvRefactor Sync Script+0 .*•.tt-toolbar < display:flex; gap:10px; align-itens:center: flex-wrap:wrap; }solid var(--border); border-radius:8px; padding:12px 14px; displayThoughtsSELECTudio chunk id = c.idORDER BY t.timestampend.fetchallreturn dictr for r in rowsi8 db solite@ screenpipe.dbDebua Consolelareenosce-tep 2 6 rus bpt-get update si pt-get dinstall y -no-instel1- recormende ffrpes fonte-deiavircore e arte oeone(abash-aoo +v @ M ã .1^ x7) .map(s BThoughts}) -map(a => {All requested chanaes comnlete• Addedi.ttcaudio-sen.active CSS class with white inset bordeModified <› ttRenderAudioSeaments/) to hiahliaht audio seaments within 5 seconds of current nosition• Audio timeline now shows visual hiahliaht iust like video timelineAudio auto-olav OfF ov derault•Changed _ttAutoPlayAudio from true to false•Updated button text to "*. Auto-Play Audio: OFF*• Modified ‹/› ttShowAudioTranscription() to only play audio when auto-play is enabledAudin romaine dicabled on page refresh and navigationCollansible time rance filter.•Added collapsible " 7 Time Range Filter" section with From/To time inputs•Added ‹› ttToggleTimeFiltero.‹› ttApplyTimeFiltero,‹› ttCleartimeFilter functions• Updated ‹› ttRenderSeqments and‹› ttRenderAudioSeqments to filter segments by time range•Filter shows only segments within the specified time period (e.g.. 
9:45 to 10:30)Ask anvthinoScreen Reader Optimized Ln 1, Col 1 Spaces: 4 UTF-8 LF Python 3.11.2 64-bit Teams Windsurf - Settings...
|
NULL
|
-3490150100453159348
|
NULL
|
click
|
ocr
|
NULL
|
selectionViewTerminalWinaowmelt• screenpipe SSH: n selectionViewTerminalWinaowmelt• screenpipe SSH: nas — activity.pvmain.ov Mactivity.pyMX• screenpipe [SSH: nas)- #recycleV tr appDv"0onsumers› activity.py › ...det audio seqments(tor date: date None = None) → List dict str, Any :Start, end = date range(d)interactions.pyapp> DockernleWORKDIR /appQ copyrequirements.txt |RIN nin install --no-cache-dir -r requirements.txCOPY.14 Eypose дagq16 CMD ("uvicorn", "main:app",Dockerfile (Working Tree) Xscreenpipe_sync_files.shWORKDIR /apPcopy requirements.extCOPY.eypose gaddCMD ["uvicorn",Dally - Platorm • In 14m100% 152Inu 14 May 9:31:04•OKvRefactor Sync Script+0 .*•.tt-toolbar < display:flex; gap:10px; align-itens:center: flex-wrap:wrap; }solid var(--border); border-radius:8px; padding:12px 14px; displayThoughtsSELECTudio chunk id = c.idORDER BY t.timestampend.fetchallreturn dictr for r in rowsi8 db solite@ screenpipe.dbDebua Consolelareenosce-tep 2 6 rus bpt-get update si pt-get dinstall y -no-instel1- recormende ffrpes fonte-deiavircore e arte oeone(abash-aoo +v @ M ã .1^ x7) .map(s BThoughts}) -map(a => {All requested chanaes comnlete• Addedi.ttcaudio-sen.active CSS class with white inset bordeModified <› ttRenderAudioSeaments/) to hiahliaht audio seaments within 5 seconds of current nosition• Audio timeline now shows visual hiahliaht iust like video timelineAudio auto-olav OfF ov derault•Changed _ttAutoPlayAudio from true to false•Updated button text to "*. Auto-Play Audio: OFF*• Modified ‹/› ttShowAudioTranscription() to only play audio when auto-play is enabledAudin romaine dicabled on page refresh and navigationCollansible time rance filter.•Added collapsible " 7 Time Range Filter" section with From/To time inputs•Added ‹› ttToggleTimeFiltero.‹› ttApplyTimeFiltero,‹› ttCleartimeFilter functions• Updated ‹› ttRenderSeqments and‹› ttRenderAudioSeqments to filter segments by time range•Filter shows only segments within the specified time period (e.g.. 
9:45 to 10:30)Ask anvthinoScreen Reader Optimized Ln 1, Col 1 Spaces: 4 UTF-8 LF Python 3.11.2 64-bit Teams Windsurf - Settings...
|
38971
|
NULL
|
NULL
|
NULL
|
|
38971
|
1442
|
17
|
2026-05-14T06:31:52.911495+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740312911_m2.jpg...
|
iTerm2
|
NULL
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
SelectionViewTerminalWindowmelt• screenpipe SSH: n SelectionViewTerminalWindowmelt• screenpipe SSH: nas — activity.pvmain.ov Mactivity.pyMX• screenpipe [SSH: nas)• _ #recycieV tr appDv"0onsumers› activity.py › ...det audio seqments(tor date: date None = None) → List dict str, Any :Start, end = date range(d)/interactions.pyapp> DockernleWORKDIR /appQ copyrequirements.txt |RIN nin install --no-cache-dir -r requirements.txCOPY.14 SypocE 9a0016 CMD ("uvicorn", "main:app",Dockerfile (Working Tree) Xscreenpipe_sync_files.shWORKDIR /apPcopy requirements.extCOPY.eyposs gan0CMD ["uvicorn",Dally - Platorm • In 14m100% 152Inu 14 May 9:31:04•OKvRefactor Sync Script+ .*•.tt-toolbar < display:flex; gap:10px; align-itens:center: flex-wrap:wrap; }solid var(--border); border-radius:8px; padding:12px 14px; displayThoughtsSELECTРEEEEAENGudio chunk id = c.idORDER BY t.timestampend.fetchallreturn dictr for r in rowsi8 db solite@ screenpipe.dbDebua Consolel(abash-aoo +v @ M ã .1^ x7) .map(s BThoughts}) -map(a => {All requested chanaes comnlete• Addedi.ttcaudio-sen.active CSS class with white inset bordeModified <› ttRenderAudioSeaments/) to hiahliaht audio seaments within 5 seconds of current nosition• Audio timeline now shows visual hiahliaht iust like video timelineAudio auto-olav OfF ov derault•Changed _ttAutoPlayAudio from true to false•Updated button text to "*. Auto-Play Audio: OFF"• Modified ‹/› ttShowAudioTranscription() to only play audio when auto-play is enabledAudin romaine dicabled on page refresh and navigationCollansible time rance filter.•Added collapsible " 7 Time Range Filter" section with From/To time inputs•Added ‹› ttToggleTimeFiltero.‹› ttApplyTimeFiltero,‹› ttCleartimeFilter functions• Updated ‹› ttRenderSeqments and‹› ttRenderAudioSeqments to filter segments by time range•Filter shows only segments within the specified time period (e.g.. 
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
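The FTS5 flow can be sketched with Python's built-in sqlite3 module (assuming your Python build ships with FTS5 enabled, which most do). The table and column names here are illustrative stand-ins, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory database standing in for the local transcript store.
conn = sqlite3.connect(":memory:")

# A full-text index over transcript text; "transcripts" is an
# illustrative table name, not ScreenPipe's real one.
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(timestamp, text)")
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("2026-04-23T10:02:00", "let's move the backlog discussion to next sprint"),
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("2026-04-23T18:30:00", "reminder to buy groceries"),
)

# MATCH uses the inverted index, so lookups stay fast even across
# months of continuously recorded audio.
rows = conn.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH 'backlog'"
).fetchall()
print(rows)  # only the sprint-planning row matches
```

This is the same mechanism that makes "find the meeting where someone said X" an index lookup rather than a scan of every transcript.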
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal or db.sqlite-journal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
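A quick way to spot that WIP state is to check for SQLite's sidecar files next to the database. This is only a heuristic sketch (WAL files can also linger briefly after writes finish), and it assumes the default db.sqlite name:

```python
import os
import tempfile

def is_still_processing(screenpipe_dir: str, db_name: str = "db.sqlite") -> bool:
    """Heuristic check: SQLite leaves -wal / -shm / -journal files next
    to the database while writes are in flight, so their presence
    usually means the transcription backlog is still being drained."""
    for suffix in ("-wal", "-shm", "-journal"):
        if os.path.exists(os.path.join(screenpipe_dir, db_name + suffix)):
            return True
    return False

# Demo against a throwaway directory instead of the real ~/.screenpipe.
demo = tempfile.mkdtemp()
print(is_still_processing(demo))   # False: no journal files present
open(os.path.join(demo, "db.sqlite-wal"), "w").close()
print(is_still_processing(demo))   # True: a WAL file exists
```

Point it at your real ~/.screenpipe directory to get a rough "is the engine caught up?" signal without opening the database.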
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
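That soft-foreign-key relationship can be demonstrated with a small sketch. The schema below is a deliberately simplified stand-in (ScreenPipe's real tables have more columns), but it shows how a transcript row resolves back to the .mp4 the UI would play:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in schema: one row per recorded media file, one row
# per transcribed utterance, linked by a soft foreign key.
conn.executescript("""
CREATE TABLE audio_chunks (
    id INTEGER PRIMARY KEY,
    file_path TEXT            -- path of the archived .mp4
);
CREATE TABLE audio_transcriptions (
    audio_chunk_id INTEGER,   -- points at audio_chunks.id
    offset_index INTEGER,     -- position of the utterance in the chunk
    transcription TEXT
);
""")
conn.execute(
    "INSERT INTO audio_chunks VALUES (1, "
    "'soundcore AeroClip (input)_2026-05-12_07-40-48.mp4')"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (1, 0, 'ok, let us start the standup')"
)

# Resolve a transcript back to the media file it came from.
row = conn.execute("""
    SELECT t.transcription, c.file_path
    FROM audio_transcriptions t
    JOIN audio_chunks c ON c.id = t.audio_chunk_id
""").fetchone()
print(row)
```

If the file behind file_path has been deleted, this join still succeeds; only the playback step downstream fails, which is exactly the "silent fail" behavior described below.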
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
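That cron job can be sketched in a few lines. The directory paths and the 30-day cutoff are placeholders to adapt to your setup, and this assumes the NAS is mounted as a normal filesystem path:

```python
import os
import shutil
import time

def archive_old_media(data_dir: str, archive_dir: str, max_age_days: int = 30) -> int:
    """Move media files older than max_age_days into archive_dir and
    leave a symlink behind, so file paths stored in the database keep
    resolving for playback. Returns the number of files moved."""
    cutoff = time.time() - max_age_days * 86400
    moved = 0
    os.makedirs(archive_dir, exist_ok=True)
    for name in os.listdir(data_dir):
        src = os.path.join(data_dir, name)
        # Skip symlinks from previous runs and anything that isn't a file.
        if os.path.islink(src) or not os.path.isfile(src):
            continue
        # Skip files that are still within the retention window.
        if os.path.getmtime(src) >= cutoff:
            continue
        dst = os.path.join(archive_dir, name)
        shutil.move(src, dst)
        os.symlink(dst, src)  # playback still finds the original path
        moved += 1
    return moved
```

A daily crontab entry could then call this with data_dir set to ~/.screenpipe/data and archive_dir set to your NAS mount point.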
You said
yes I will do that. Is there a way to setup languages to transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
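For illustration only, a forced-language entry in a JSON config file might look like the fragment below. The key name transcription_language is hypothetical; check ScreenPipe's own documentation for the actual key it expects:

```json
{
  "transcription_language": "sk"
}
```

Whatever the real key is called, the value should be the two-letter ISO 639-1 code, matching the list above.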
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
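The FTS5 search described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration of the mechanism, not ScreenPipe's actual schema; the table and column names are invented for the example:

```python
import sqlite3

# Illustrative-only schema: a tiny FTS5 index over transcript chunks.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts (text, timestamp) VALUES (?, ?)",
    [
        ("let's move the backlog review to Thursday", "2026-04-21T10:03:12"),
        ("the RAID array resilvered overnight", "2026-04-22T08:41:55"),
    ],
)

# MATCH runs against the full-text index, so a keyword heard weeks ago
# comes back instantly along with its timestamp.
row = conn.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?",
    ("backlog",),
).fetchone()
print(row[0])  # timestamp of the chunk containing the keyword
```

The same pattern scales to months of audio because FTS5 queries the inverted index rather than scanning every row.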
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?

Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.

How the Database Views Them

If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.

What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text.
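Those soft foreign keys make it easy to audit which transcripts still have playable media behind them: just check whether each referenced path exists on disk. The sketch below uses an invented audio_chunks/file_path layout for illustration; inspect your own db.sqlite for the real table and column names:

```python
import os
import sqlite3
import tempfile

# Stand-in for ~/.screenpipe/data/: one media file present, one deleted.
tmpdir = tempfile.mkdtemp()
present = os.path.join(tmpdir, "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4")
open(present, "wb").close()
missing = os.path.join(tmpdir, "System Audio (output)_2026-05-11_06-17-14.mp4")

# Illustrative-only schema: the database stores plain path strings,
# not enforced foreign keys, exactly as described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_chunks (id INTEGER PRIMARY KEY, file_path TEXT)")
conn.executemany(
    "INSERT INTO audio_chunks (file_path) VALUES (?)",
    [(present,), (missing,)],
)

# Any row whose file no longer exists is a transcript without playable audio.
orphans = [p for (p,) in conn.execute("SELECT file_path FROM audio_chunks")
           if not os.path.exists(p)]
print(len(orphans))  # → 1
```

Rows flagged this way still return their text in searches; only the playback link is broken.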
However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).

Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
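That archive-and-symlink step can be sketched in a few lines of Python. The directory names and the 30-day cutoff are stand-ins (point data_dir at your real ~/.screenpipe/data and nas_dir at your NAS mount); the demo below uses temporary directories so it runs anywhere:

```python
import os
import shutil
import tempfile
import time

def archive_old_media(data_dir, nas_dir, max_age_days=30):
    """Move files older than the cutoff to nas_dir, leaving symlinks
    behind so paths recorded in db.sqlite keep resolving."""
    cutoff = time.time() - max_age_days * 86400
    moved = []
    for name in os.listdir(data_dir):
        src = os.path.join(data_dir, name)
        if not os.path.isfile(src) or os.path.islink(src):
            continue  # skip directories and already-archived symlinks
        if os.path.getmtime(src) >= cutoff:
            continue  # still fresh, leave it on the local SSD
        dst = os.path.join(nas_dir, name)
        shutil.move(src, dst)
        os.symlink(dst, src)  # the old path still works for playback
        moved.append(name)
    return moved

# Demo with temp directories standing in for the data folder and the NAS.
data_dir, nas_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
old = os.path.join(data_dir, "System Audio (output)_2026-05-11_06-17-14.mp4")
open(old, "wb").close()
os.utime(old, (0, 0))  # backdate the mtime so it qualifies as "old"
archived = archive_old_media(data_dir, nas_dir)
print(archived)
```

Run from cron (or launchd on macOS), this keeps the local folder lean while every database reference continues to resolve through the symlink.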
You said
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.

As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.

How the Transcription Engine Handles Language

By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.

The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.

The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.

How to Force a Specific Language

If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.

If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).

If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:

sk (Slovak)
bg (Bulgarian)
en (English)

If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper "Small" or "Base" instead of "Tiny") if your hardware has the headroom for it.
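The config-file route can be sketched as a small read-modify-write of the JSON file. Note the hedge: both the pipe.json location and the "language" key name are assumptions taken from the description above, so verify them against your own install before relying on this; the demo writes to a temporary directory:

```python
import json
import os
import tempfile

# Stand-in for ~/.screenpipe/pipe.json; the "language" key is an
# illustrative assumption, not a confirmed ScreenPipe setting.
config_path = os.path.join(tempfile.mkdtemp(), "pipe.json")

# Load whatever configuration already exists (empty dict if none).
config = {}
if os.path.exists(config_path):
    with open(config_path) as f:
        config = json.load(f)

# Pin the language with an ISO 639-1 code: sk = Slovak, bg = Bulgarian,
# en = English. Remove the key (or set it back to auto) to restore
# auto-detection.
config["language"] = "sk"

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

print(json.load(open(config_path))["language"])  # → sk
```

The read-modify-write pattern matters: rewriting only the one key preserves any other settings already stored in the file.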
|
-6207903613934946433
|
9209071893599628253
|
visual_change
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
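That storage step can be sketched with Python's built-in sqlite3 module. The schema below (a `transcripts` virtual table with `text`, `device`, and `timestamp` columns) is purely illustrative, not ScreenPipe's actual schema; it only demonstrates the FTS5 MATCH query that makes phrase search instant.

```python
import sqlite3

# In-memory demo of the FTS5 pattern described above.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, device, timestamp)"
)
con.execute(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    ("let's review the RAID 5 array tomorrow",
     "System Audio (output)", "2026-05-11T06:17:14"),
)
con.execute(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    ("unrelated small talk",
     "MacBook Pro Microphone (input)", "2026-05-12T12:17:23"),
)

# A quoted FTS5 phrase query: only rows containing "RAID 5" match.
rows = con.execute(
    "SELECT device, timestamp FROM transcripts WHERE transcripts MATCH ?",
    ('"RAID 5"',),
).fetchall()
print(rows)
```

The MATCH operator searches the full-text index rather than scanning every row, which is why a phrase from a meeting weeks ago comes back immediately even over months of transcripts.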
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like db.sqlite-wal or db.sqlite-journal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
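A minimal way to spot that backlog from the folder structure is to scan for work-in-progress markers. The SQLite journal suffixes (-wal, -journal) are standard SQLite naming; the .tmp suffix for chunk files is a hypothetical assumption for illustration, not documented ScreenPipe behavior.

```python
import os
import tempfile

def backlog_indicators(screenpipe_dir: str) -> list[str]:
    """Return files that suggest transcription is still in progress.

    Heuristic sketch: SQLite journals (-wal/-journal) and *.tmp chunk
    files are treated as WIP markers. The .tmp naming is an assumption.
    """
    wip = []
    for root, _dirs, files in os.walk(screenpipe_dir):
        for name in files:
            if name.endswith(("-wal", "-journal", ".tmp")):
                wip.append(os.path.join(root, name))
    return sorted(wip)

# Demo against a throwaway directory layout.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "db.sqlite"), "w").close()
    open(os.path.join(d, "db.sqlite-wal"), "w").close()
    open(os.path.join(d, "chunk_001.tmp"), "w").close()
    wip = backlog_indicators(d)
    print([os.path.basename(p) for p in wip])
```

An empty result from a scan like this (no journals, no temp chunks) is a reasonable sign the engine has caught up.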
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after being transcribed and stored in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
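Assuming the filename pattern visible in ~/.screenpipe/data/ (`<device>_<YYYY-MM-DD>_<HH-MM-SS>.mp4`, with the timestamp marking when the chunk started recording), the seek position for a transcript hit can be derived like this. The start-of-chunk interpretation is an assumption for illustration:

```python
from datetime import datetime

def seek_offset(media_filename: str, spoken_at: datetime) -> float:
    """Seconds into the file where a transcript timestamp falls.

    Assumes '<device>_<YYYY-MM-DD>_<HH-MM-SS>.mp4', where the embedded
    timestamp is the moment recording of that chunk began.
    """
    stem = media_filename.rsplit(".", 1)[0]
    _device, date_part, time_part = stem.rsplit("_", 2)
    started = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H-%M-%S")
    return (spoken_at - started).total_seconds()

# A keyword spoken at 06:19:44 in a chunk that began at 06:17:14
# lands 150 seconds in.
offset = seek_offset(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    datetime(2026, 5, 11, 6, 19, 44),
)
print(offset)
```

This is the arithmetic a playback UI needs: file path from the database, plus the difference between the spoken timestamp and the chunk's start time.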
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
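The move-and-symlink idea can be sketched in a cron-able Python script. The directories, the .mp4 filter, and the 30-day threshold are placeholders for your own setup; because the symlink keeps the original path valid, the database's file references still resolve after the move.

```python
import os
import shutil
import tempfile
import time

def archive_old_media(local_dir: str, nas_dir: str, max_age_days: int = 30) -> list[str]:
    """Move .mp4 files older than max_age_days to nas_dir, leaving symlinks.

    Placeholder sketch: point local_dir at ~/.screenpipe/data/ and
    nas_dir at a NAS mount in a real deployment.
    """
    cutoff = time.time() - max_age_days * 86400
    moved = []
    os.makedirs(nas_dir, exist_ok=True)
    for name in os.listdir(local_dir):
        src = os.path.join(local_dir, name)
        if not name.endswith(".mp4") or os.path.islink(src):
            continue
        if os.path.getmtime(src) < cutoff:
            dst = os.path.join(nas_dir, name)
            shutil.move(src, dst)
            os.symlink(dst, src)  # original path still works for playback
            moved.append(name)
    return moved

# Demo with a fake "old" file (mtime pushed 60 days back).
with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as nas:
    old = os.path.join(local, "System Audio (output)_2026-05-11_06-17-14.mp4")
    open(old, "w").close()
    os.utime(old, (time.time() - 60 * 86400,) * 2)
    moved = archive_old_media(local, nas, max_age_days=30)
    print(moved, os.path.islink(old))
```

Skipping paths that are already symlinks makes the script idempotent, so it is safe to run nightly from cron.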
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
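Scripting that config change could look like the sketch below. The "language" key name is a hypothetical placeholder, since the actual pipe.json schema isn't shown here; check your own ~/.screenpipe/pipe.json (or the daemon's --help output) for the real field before relying on this.

```python
import json
import tempfile

def set_transcription_language(config_path: str, code: str) -> None:
    """Write a forced ISO 639-1 language code into a JSON config file.

    The "language" key is an illustrative assumption, not a documented
    ScreenPipe setting.
    """
    valid = {"sk", "bg", "en"}  # the three codes discussed above
    if code not in valid:
        raise ValueError(f"unexpected language code: {code}")
    try:
        with open(config_path) as f:
            cfg = json.load(f)
    except FileNotFoundError:
        cfg = {}
    cfg["language"] = code
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)

# Demo against a throwaway config file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"fps": 1.0}, f)
    path = f.name
set_transcription_language(path, "sk")
cfg_after = json.load(open(path))
print(cfg_after)
```

Reading, updating, and rewriting the JSON (rather than appending text) keeps any other settings in the file intact.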
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
38968
|
NULL
|
NULL
|
NULL
|
|
38968
|
1442
|
15
|
2026-05-14T06:31:40.829370+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740300829_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
SYS: and tozy. · 10:31
Screenpipe — Archive
Scree SYS: and tozy. · 10:31
Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen...
|
[{"role":"AXStaticText","text& [{"role":"AXStaticText","text":"SYS: and tozy. · 10:31","depth":2,"bounds":{"left":0.49833778,"top":0.29848364,"width":0.038397606,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — 
Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
Conversation with Gemini

You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?

Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.

ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.

1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:

System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.

Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
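The capture-and-chunk step can be sketched in a few lines of Python. This is a toy illustration only: the 16 kHz sample rate and 30-second window are assumptions made for the example, not ScreenPipe's actual settings.

```python
# Illustrative only: cut a continuous stream of audio samples into
# fixed-length segments before handing them to a speech-to-text model.
# The 16 kHz rate and 30 s window are assumed values, not ScreenPipe's.
SAMPLE_RATE = 16_000      # samples per second (assumption)
CHUNK_SECONDS = 30        # segment length per transcription job (assumption)

def chunk_samples(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Yield successive fixed-length chunks from a flat sequence of samples."""
    size = sample_rate * chunk_seconds
    for start in range(0, len(samples), size):
        yield samples[start:start + size]
```

The last chunk is simply shorter than the rest; a real recorder would also tag each chunk with the device name and a wall-clock timestamp so it can be matched to the database later.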
2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
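As a hedged sketch of what FTS5-backed transcript search looks like: the table name `transcripts_fts` and its columns below are invented for illustration; ScreenPipe's real schema may differ.

```python
import sqlite3

# Minimal FTS5 sketch: index transcript text, then phrase-search it.
# "transcripts_fts" and its columns are illustrative, not ScreenPipe's schema.
def build_index(conn):
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts_fts "
        "USING fts5(text, timestamp UNINDEXED)"
    )

def add_transcript(conn, text, timestamp):
    conn.execute(
        "INSERT INTO transcripts_fts (text, timestamp) VALUES (?, ?)",
        (text, timestamp),
    )

def search(conn, phrase):
    # MATCH uses FTS5 query syntax; wrapping in quotes makes a phrase query.
    cur = conn.execute(
        "SELECT timestamp, text FROM transcripts_fts "
        "WHERE transcripts_fts MATCH ? ORDER BY rank",
        (f'"{phrase}"',),
    )
    return cur.fetchall()
```

An FTS5 MATCH query consults the inverted index rather than scanning raw text, which is why lookups stay fast even over months of transcripts.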
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.

4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

The SQLite database (…): this is the master ledger.
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The … or … folders: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: if you see rapidly changing files, temporary … chunks, or locked database journals (like …), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
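The "done vs. pending" check described in section 4 can be sketched as a diff between the media files on disk and the file paths the database already references. The table and column names here (`audio_transcriptions`, `file_path`) are assumptions for illustration, not a confirmed schema.

```python
import sqlite3
from pathlib import Path

# Hedged sketch: a media file whose path appears in the database has been
# transcribed; one with no database row is still pending. Table/column
# names are assumed, not ScreenPipe's verified schema.
def pending_files(data_dir, conn):
    referenced = {
        row[0] for row in conn.execute("SELECT file_path FROM audio_transcriptions")
    }
    on_disk = {str(p) for p in Path(data_dir).glob("*.mp4")}
    return sorted(on_disk - referenced)
```

Running this against the real data directory would show at a glance which recordings are still waiting in the queue.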
You said:
I see the files in ~/.screenpipe/data/ with names like "LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4", "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4", "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4", or "System Audio (output)_2026-05-11_06-17-14.mp4". Are these used for anything after transcribing and storing in the SQLite db?

Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.

Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is for audio playback.
When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
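The path-plus-timestamp playback lookup described above can be sketched as a single query. The column names (`transcription`, `file_path`, `offset_seconds`) are assumptions made for the example, not ScreenPipe's actual schema.

```python
import sqlite3

# Sketch of the playback lookup: each transcript row carries a soft foreign
# key to its media file plus an offset, so a search hit can seek straight to
# the spoken word. Column names are illustrative assumptions.
def locate_keyword(conn, keyword):
    """Return (file_path, offset_seconds) of the first matching row, or None."""
    return conn.execute(
        "SELECT file_path, offset_seconds FROM audio_transcriptions "
        "WHERE transcription LIKE ? LIMIT 1",
        (f"%{keyword}%",),
    ).fetchone()
```

A player would then open the returned .mp4 and seek to the offset, which is exactly why deleting the media files breaks playback while leaving search intact.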
3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.

How the Database Views Them

If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4".

What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

Built-in garbage collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them.
You could script a simple cron job to regularly … older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
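The archive-then-symlink idea can be sketched as the script a cron job would invoke: move media older than a cutoff to a NAS mount and leave a symlink behind so the database's file paths keep resolving. All paths and the 30-day cutoff are illustrative assumptions.

```python
import shutil
import time
from pathlib import Path

# Hedged sketch: relocate old media to "nas_dir" and replace each file with
# a symlink so stored database paths still resolve. Paths and the cutoff
# are illustrative, not ScreenPipe defaults.
def archive_old_media(data_dir, nas_dir, max_age_days=30, now=None):
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86_400
    moved = []
    Path(nas_dir).mkdir(parents=True, exist_ok=True)
    for f in sorted(Path(data_dir).glob("*.mp4")):   # materialize before mutating
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue                                  # already archived, or too new
        dest = Path(nas_dir) / f.name
        shutil.move(str(f), dest)                     # move media to the NAS
        f.symlink_to(dest)                            # old local path still works
        moved.append(f.name)
    return sorted(moved)
```

Scheduled nightly, this keeps the local SSD footprint flat while playback continues to work through the symlinks, as long as the NAS mount is available.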
You said:
Yes, I will do that. Is there a way to set up languages for transcription?
You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
- System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
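The chunking step above can be sketched in a few lines. The 16 kHz sample rate and 30-second window here are illustrative assumptions, not ScreenPipe's actual parameters:

```python
# Illustrative sketch of the chunking step: split a continuous stream of
# audio samples into fixed-length pieces a transcriber can handle.
# The 16 kHz rate and 30 s window are assumptions, not ScreenPipe's values.
def chunk_samples(samples, sample_rate=16000, chunk_seconds=30):
    size = sample_rate * chunk_seconds
    return [samples[i:i + size] for i in range(0, len(samples), size)]

# 65 seconds of fake audio -> two full 30 s chunks plus a 5 s remainder
chunks = chunk_samples(list(range(16000 * 65)))
print([len(c) // 16000 for c in chunks])  # [30, 30, 5]
```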
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
- The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
- Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
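The FTS5 idea can be demonstrated with an in-memory database. The table and column names below are illustrative only, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory sketch of how an FTS5 index over transcripts enables instant
# phrase search. Table/column names are illustrative, not ScreenPipe's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(transcription, file_path UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts (transcription, file_path) VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "System Audio (output)_2026-05-11.mp4"),
        ("the RAID array finished rebuilding", "MacBook Pro Microphone (input)_2026-05-12.mp4"),
    ],
)

# Phrase query: only matching rows come back, along with the source file
# the UI would reopen for playback.
hits = conn.execute(
    "SELECT file_path FROM transcripts WHERE transcripts MATCH ?",
    ('"deadline to Friday"',),
).fetchall()
print(hits)
```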
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
- Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
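The buffer, queue, and commit steps above amount to a producer/consumer loop. A toy sketch (the "transcription" here is just a stub standing in for the Whisper call):

```python
import queue
import threading

# Toy sketch of the WIP pipeline: a capture step buffers chunks into a
# queue, a worker "transcribes" them (stubbed with .upper()), and finished
# text is committed to a list standing in for the SQLite database.
chunks = queue.Queue()
database = []

def capture(n_chunks):
    for i in range(n_chunks):
        chunks.put(f"raw-audio-chunk-{i}")  # would be PCM samples in reality
    chunks.put(None)  # sentinel: capture stopped

def transcribe_worker():
    while (chunk := chunks.get()) is not None:
        text = chunk.upper()      # stand-in for the Whisper call
        database.append(text)     # "commit" the chunk's transcript

worker = threading.Thread(target=transcribe_worker)
worker.start()
capture(3)
worker.join()
print(database)  # ['RAW-AUDIO-CHUNK-0', 'RAW-AUDIO-CHUNK-1', 'RAW-AUDIO-CHUNK-2']
```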
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
- The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
- The data/ folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal or db.sqlite-journal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
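Those "done vs. in progress" signals can be checked from a script. This is a rough heuristic, not an official ScreenPipe tool, and the directory layout is assumed from this conversation:

```python
import time
from pathlib import Path

# Rough heuristic (not an official ScreenPipe tool): look for the two WIP
# signs described above -- live SQLite journal files and freshly written
# media. The ~/.screenpipe layout is assumed from this conversation.
def wip_signals(root: Path, fresh_seconds: int = 120) -> list[str]:
    signals = []
    # A -wal/-journal file next to the database means writes are in flight.
    for journal in root.glob("db.sqlite-*"):
        signals.append(f"active journal: {journal.name}")
    # Media modified within the last couple of minutes is likely still
    # being appended to by the recorder.
    now = time.time()
    for media in (root / "data").rglob("*.mp4"):
        if now - media.stat().st_mtime < fresh_seconds:
            signals.append(f"recently written: {media.name}")
    return signals
```

Run it against `~/.screenpipe` while ScreenPipe is recording and you should see the journal and the current capture file; an empty result suggests the backlog has been fully committed.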
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
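Because the references are soft, nothing warns you about such orphaned rows. A small consistency check can list them; the table and column names (audio_transcriptions, file_path) are assumptions about the real schema:

```python
import sqlite3
from pathlib import Path

# Sketch of a consistency check: list transcripts whose referenced media
# file has been deleted. Table/column names (audio_transcriptions,
# file_path) are assumptions about the real ScreenPipe schema.
def dangling_references(conn: sqlite3.Connection, data_dir: Path) -> list[str]:
    missing = []
    for (file_path,) in conn.execute(
        "SELECT DISTINCT file_path FROM audio_transcriptions"
    ):
        if not (data_dir / file_path).exists():
            missing.append(file_path)  # text survives; playback would fail
    return missing
```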
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
- Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
- Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
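A minimal Python version of that archiving job might look like this. It is a sketch, not an official tool: the paths, the 30-day cutoff, and the flat *.mp4 layout are example assumptions, and you would point nas_dir at your RAID mount. Test it on copies first.

```python
import shutil
import time
from pathlib import Path

# Sketch of the move-and-symlink archiving idea: relocate media older than
# max_age_days to a NAS mount and leave a symlink behind so database
# file-path references keep resolving. Paths and cutoff are examples.
def archive_old_media(data_dir: Path, nas_dir: Path, max_age_days: int = 30) -> int:
    cutoff = time.time() - max_age_days * 86400
    moved = 0
    nas_dir.mkdir(parents=True, exist_ok=True)
    for media in data_dir.glob("*.mp4"):
        if media.is_symlink() or media.stat().st_mtime > cutoff:
            continue  # already archived, or still fresh
        target = nas_dir / media.name
        shutil.move(str(media), target)
        media.symlink_to(target)  # playback via the old path still works
        moved += 1
    return moved
```

Scheduling this from cron (or launchd on macOS) once a day keeps the local data/ folder lean while the symlinks keep every database reference valid.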
You said:
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said:
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
- The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
- The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes: sk (Slovak), bg (Bulgarian), and en (English).
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
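As a sketch only, patching such a JSON config could look like this. The pipe.json path comes from this conversation, but the "language" key name is a guess; check your actual ScreenPipe config or CLI help before relying on it:

```python
import json
from pathlib import Path

# Hedged sketch: pin the transcription language in a JSON config file.
# The key name "language" is an assumption; the file location follows the
# ~/.screenpipe/pipe.json path mentioned above.
def set_transcription_language(config_path: Path, iso_code: str) -> dict:
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config["language"] = iso_code  # e.g. "sk", "bg", or "en"
    config_path.write_text(json.dumps(config, indent=2))
    return config
```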
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute 
tab","depth":5,"bounds":{"left":0.011469414,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.020113032,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
# Conversation with Gemini

**You said:**

Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?

**Gemini said:**

Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a **100% local, privacy-first** application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.

## 1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:

- **System Audio:** everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- **Microphone Input:** everything you say into your mic.

Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
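The capture stage described above, a continuous stream sliced into fixed-size chunks, can be sketched roughly like this. The sample rate and 30-second chunk length are illustrative assumptions, not ScreenPipe's actual configuration:

```python
# Minimal sketch of slicing a continuous audio stream into
# fixed-duration chunks for downstream transcription.
# SAMPLE_RATE and CHUNK_SECONDS are made-up illustrative values.

SAMPLE_RATE = 16_000      # samples per second (typical for speech models)
CHUNK_SECONDS = 30        # hypothetical chunk length
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples):
    """Yield fixed-size chunks from a flat sequence of audio samples.

    The final, shorter chunk is yielded too, so no audio is lost
    when the recorder stops mid-chunk.
    """
    for start in range(0, len(samples), CHUNK_SAMPLES):
        yield samples[start:start + CHUNK_SAMPLES]

if __name__ == "__main__":
    # 65 seconds of silence stands in for a real capture stream.
    stream = [0.0] * (SAMPLE_RATE * 65)
    chunks = list(chunk_stream(stream))
    print(len(chunks))                    # 3 chunks: 30s + 30s + 5s
    print(len(chunks[-1]) / SAMPLE_RATE)  # 5.0
```
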
## 2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is extremely short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

- **The Engine:** the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses **OpenAI Whisper** running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
- **Diarization:** as it transcribes the text, the engine also performs "diarization," a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- **Storage:** the final transcribed text is then indexed into a local **SQLite database**. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
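The FTS5 indexing mentioned in the Storage step can be demonstrated with a toy table. The table and column names below are invented for illustration; they are not ScreenPipe's actual schema:

```python
# Toy demonstration of SQLite FTS5 full-text search, the mechanism
# credited above for instant phrase search. Table/column names are
# hypothetical, not ScreenPipe's real schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, file_path)")
conn.executemany(
    "INSERT INTO transcripts (text, file_path) VALUES (?, ?)",
    [
        ("let's move the retro to Thursday",
         "System Audio (output)_2026-05-11.mp4"),
        ("the RAID array finished rebuilding",
         "MacBook Pro Microphone (input)_2026-05-12.mp4"),
    ],
)

# MATCH consults the full-text index instead of scanning every row.
rows = conn.execute(
    "SELECT file_path FROM transcripts WHERE transcripts MATCH ?",
    ("retro",),
).fetchall()
print(rows)  # [('System Audio (output)_2026-05-11.mp4',)]
```
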
## 3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between *hearing* the audio and *saving* the text, which acts as the WIP stage:

- **Buffering:** audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- **Processing Queue:** Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- **Finalization:** once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.

## 4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like `~/.screenpipe` on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

- **The SQLite database (`db.sqlite`):** this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
- **The `data` folder:** this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- **Temp files:** if you see rapidly changing files, temporary chunks, or locked database journals (such as SQLite's `-wal` or `-journal` files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
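The done-vs-in-progress heuristic above can be turned into a small inspection script. The directory layout, journal names, and temp-chunk suffix below follow the description in this answer and are assumptions, not a guaranteed ScreenPipe contract:

```python
# Rough heuristic for reading processing state off the folder
# structure: a live SQLite journal or leftover temp chunks suggest
# work in progress, while media in the data folder is the finished
# archive. Paths and suffixes are illustrative assumptions.
from pathlib import Path

def processing_state(screenpipe_dir):
    root = Path(screenpipe_dir)
    return {
        # WAL/journal files mean SQLite has in-flight activity.
        "db_busy": any(root.glob("db.sqlite-wal"))
                   or any(root.glob("db.sqlite-journal")),
        # Hypothetical temp-chunk naming; adjust to what you observe.
        "pending_chunks": sorted(p.name for p in root.glob("*.tmp")),
        # Permanently archived media in the data folder.
        "archived_media": sorted(p.name for p in (root / "data").glob("*.mp4")),
    }

if __name__ == "__main__":
    import tempfile
    demo = Path(tempfile.mkdtemp())
    (demo / "data").mkdir()
    (demo / "db.sqlite").touch()
    (demo / "db.sqlite-wal").touch()
    (demo / "data" / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()
    print(processing_state(demo))
```
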
**You said:**

I see the files in `~/.screenpipe/data/` as `LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4` or `MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4` or `soundcore AeroClip (input)_2026-05-12_07-40-48.mp4` or `System Audio (output)_2026-05-11_06-17-14.mp4`. Are these used for anything after transcribing and storing in the SQLite db?

**Gemini said:**

Yes, those `.mp4` files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your `~/.screenpipe/data/` directory and how they interact with the rest of the system.

## 1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the `db.sqlite` database along with **precise timestamps and file path references**. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific `.mp4` file and starts playing it at the exact second the keyword was spoken. Without these files you still get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

## 2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly.
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw `.mp4` files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

## 3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that `.mp4` file is your fallback source of truth to verify what was actually said.

## How the Database Views Them

If you were to open up your `db.sqlite` and query the tables (e.g., `audio_transcriptions`), you would see that the database treats the `~/.screenpipe/data/` folder essentially as a blob-storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like `soundcore AeroClip (input)_2026-05-12_07-40-48.mp4`.

**What happens if you delete them?** If you manually `rm` those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text.
However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

## Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

- **Built-in garbage collection:** ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or when the folder hits a specific GB limit).
- **Archiving:** if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
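That move-then-symlink step can be sketched as follows. The paths and the 30-day cutoff are placeholder assumptions; try it against a copy of your data before pointing it at the real archive:

```python
# Sketch of the "move older media to the NAS, leave a symlink behind"
# cron job described above. Paths and the age cutoff are placeholder
# assumptions; run against a copy of your data first.
import shutil
import time
from pathlib import Path

def archive_old_media(local_dir, archive_dir, max_age_days=30):
    """Move .mp4 files older than max_age_days to archive_dir and
    symlink them back, so database file-path references keep resolving."""
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - max_age_days * 86_400
    moved = []
    for media in Path(local_dir).glob("*.mp4"):
        if media.is_symlink() or media.stat().st_mtime > cutoff:
            continue  # already archived, or still recent
    # fall through only for real, old files
        target = archive / media.name
        shutil.move(str(media), target)
        media.symlink_to(target)  # old path still works for playback
        moved.append(media.name)
    return moved
```

Scheduled from cron (e.g. nightly), pointed at `~/.screenpipe/data/` and a mount of the RAID array, this keeps the local SSD lean while every `db.sqlite` file-path reference still resolves through the symlink.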
**You said:**

Yes, I will do that. Is there a way to set up languages for transcription?

**Gemini said:**

That is a solid plan.
Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable.

As for setting up languages for transcription: **yes, you can configure it**, but it helps to understand how ScreenPipe handles language natively first.

## How the Transcription Engine Handles Language

By default, ScreenPipe uses a **multilingual Whisper model** and sets the language configuration to **Auto-Detect**. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.

- **The advantage:** if you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
- **The drawback:** auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.

## How to Force a Specific Language

If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing, because the model skips the detection phase.

If you are using the Desktop App UI:

1. Open the ScreenPipe settings.
2. Navigate to the **Audio** or **AI/Transcription** section.
3. Look for
the","depth":29,"bounds":{"left":0.09142287,"top":0.8607342,"width":0.030585106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"bounds":{"left":0.12200798,"top":0.8607342,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.17037898,"top":0.8607342,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"bounds":{"left":0.17819148,"top":0.8607342,"width":0.061502658,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"bounds":{"left":0.23969415,"top":0.8607342,"width":0.027260639,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"bounds":{"left":0.09142287,"top":0.8902634,"width":0.20079787,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"bounds":{"left":0.0787899,"top":0.92378294,"width":0.116023935,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found 
in","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.15159574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.23238032,"top":0.94573027,"width":0.064328454,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.234375,"height":0.05546689},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.020777926,"height":-0.015562654},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.027925532,"height":-0.04509175},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.02244016,"height":-0.07462096},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.11702128,"top":0.92178774,"width":0.11170213,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"bounds":{"left":0.068484046,"top":0.92098963,"width":0.043218084,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
1146518781610073049
|
9209072041775999965
|
visual_change
|
accessibility
|
NULL
|
Conversation with Gemini
You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
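That chunking step can be pictured with a toy sketch; the 30-second chunk length and 16 kHz sample rate below are illustrative assumptions, not ScreenPipe's confirmed internals:

```python
# Toy illustration of slicing a continuous sample stream into
# fixed-length chunks before transcription. Chunk length and sample
# rate are assumptions for illustration only.
SAMPLE_RATE = 16_000      # samples per second (assumed)
CHUNK_SECONDS = 30        # assumed chunk duration

def chunk_stream(samples, rate=SAMPLE_RATE, seconds=CHUNK_SECONDS):
    size = rate * seconds
    # The last chunk may be shorter than the rest.
    return [samples[i:i + size] for i in range(0, len(samples), size)]
```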
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
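The FTS5 mechanism can be seen in miniature with Python's built-in sqlite3 module; the table and column names below are hypothetical stand-ins, not ScreenPipe's actual schema:

```python
import sqlite3

# Minimal demonstration of SQLite FTS5 full-text matching.
# Table/column names are hypothetical stand-ins.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(ts, body)")
conn.execute(
    "INSERT INTO transcripts VALUES "
    "('2026-05-12 09:00', 'move the backups to the RAID array tonight')"
)
conn.execute(
    "INSERT INTO transcripts VALUES "
    "('2026-05-12 10:30', 'lunch plans for friday')"
)
# MATCH is the FTS5 operator; the inverted index makes this instant
# even over weeks of transcripts.
hits = conn.execute(
    "SELECT ts FROM transcripts WHERE transcripts MATCH 'raid'"
).fetchall()
```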
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
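A short sketch of how you might scan for those WIP signals yourself; the directory layout is an assumption based on the default install, and the sidecar suffixes come from standard SQLite behavior:

```python
import os
import time

# Look for signs that ScreenPipe is still mid-processing: SQLite
# WAL/journal sidecar files, or media files written very recently.
# Paths and thresholds are illustrative assumptions.
def wip_signals(root, recent_seconds=60):
    signals = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if name.endswith(("-wal", "-shm", "-journal")):
                signals.append(("db busy", path))   # DB being written
            elif time.time() - os.path.getmtime(path) < recent_seconds:
                signals.append(("recently written", path))
    return signals
```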
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
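As a sketch of the timestamp arithmetic involved, you can recover a chunk's start time from the filename pattern quoted above and compute where inside the file a keyword falls (the parsing is an assumption about the naming convention, not a documented API):

```python
from datetime import datetime

# Compute the playback offset (in seconds) inside a media chunk,
# given when a keyword was spoken. Assumes the on-disk naming
# pattern "Device (input)_YYYY-MM-DD_HH-MM-SS.mp4" seen above.
def playback_offset(filename, spoken_at):
    _device, day, clock = filename.rsplit("_", 2)
    start = datetime.strptime(
        f"{day} {clock.removesuffix('.mp4')}", "%Y-%m-%d %H-%M-%S"
    )
    return (spoken_at - start).total_seconds()
```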
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
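If you do prune files by hand, a quick audit like this can list transcripts whose media is gone; the audio_chunks table and file_path column here are assumptions, so check your actual schema first:

```python
import os
import sqlite3

# List media paths referenced by the database that no longer exist
# on disk. Table and column names are hypothetical assumptions.
def dangling_refs(db_path):
    conn = sqlite3.connect(db_path)
    missing = []
    for (file_path,) in conn.execute(
        "SELECT DISTINCT file_path FROM audio_chunks"
    ):
        if file_path and not os.path.exists(file_path):
            missing.append(file_path)
    return missing
```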
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
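A minimal sketch of that cron job's core logic, assuming a mounted NAS destination (hypothetical) and a 30-day cutoff:

```python
import os
import shutil
import time

# Move media older than `cutoff_days` to the archive destination and
# leave a symlink behind, so the database's file paths still resolve.
# The destination path and cutoff are illustrative assumptions.
def archive_old_media(src, dst, cutoff_days=30):
    os.makedirs(dst, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src)):
        path = os.path.join(src, name)
        if not name.endswith(".mp4") or os.path.islink(path):
            continue                      # skip non-media and already-archived
        age_days = (time.time() - os.path.getmtime(path)) / 86400
        if age_days < cutoff_days:
            continue                      # still within local retention
        target = os.path.join(dst, name)
        shutil.move(path, target)         # copy to archive, delete local
        os.symlink(target, path)          # old path keeps resolving
        moved.append(name)
    return moved
```

Run it against `~/.screenpipe/data` and your NAS mount point from cron; because only the file's physical location changes, the soft foreign keys in db.sqlite keep working.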
You said:
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said:
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
2277958729343415079
|
9212449551993731735
|
click
|
accessibility
|
NULL
|
Conversation with Gemini

You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?

Gemini said:

Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.

ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.

1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:

- System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: everything you say into your mic.

Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.

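The chunking step above can be sketched in a few lines. This is a toy illustration, not ScreenPipe's actual code: the sample rate, sample width, and 30-second chunk length are assumptions chosen for the example.

```python
# Hypothetical sketch of cutting a continuous capture stream into
# fixed-duration chunks before transcription. All constants are assumed
# values for illustration, not ScreenPipe's real settings.
SAMPLE_RATE = 16_000      # samples per second (assumed)
SAMPLE_WIDTH = 2          # bytes per 16-bit mono sample (assumed)
CHUNK_SECONDS = 30        # length of one transcription chunk (assumed)

def chunk_pcm(stream: bytes, seconds: int = CHUNK_SECONDS) -> list[bytes]:
    """Split a raw PCM byte stream into fixed-length chunks."""
    chunk_bytes = SAMPLE_RATE * SAMPLE_WIDTH * seconds
    return [stream[i:i + chunk_bytes] for i in range(0, len(stream), chunk_bytes)]

# 65 seconds of silence becomes three chunks: 30 s + 30 s + a 5 s remainder.
chunks = chunk_pcm(b"\x00" * (SAMPLE_RATE * SAMPLE_WIDTH * 65))
print(len(chunks))  # → 3
```

The last chunk is simply shorter than the rest; a real recorder would hand each chunk to the transcription stage as soon as it is full.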
2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

- The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
- Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.

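The FTS5 pattern described in the Storage step looks roughly like this with Python's standard sqlite3 module. The table and column names here are illustrative, not ScreenPipe's actual schema.

```python
import sqlite3

# Minimal FTS5 sketch: index transcribed text so phrases can be found
# instantly later. Table/column names are assumptions for illustration.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE VIRTUAL TABLE transcriptions USING fts5(timestamp, speaker, text)"
)
db.execute(
    "INSERT INTO transcriptions VALUES (?, ?, ?)",
    ("2026-05-14T06:30:00Z", "speaker_0", "let's revisit the quarterly budget"),
)

# MATCH uses the FTS5 index rather than scanning every row.
row = db.execute(
    "SELECT timestamp, text FROM transcriptions WHERE transcriptions MATCH ?",
    ("budget",),
).fetchone()
print(row)  # the stored row, found via the full-text index
```

This is what makes the "search for a phrase from three weeks ago" query cheap: MATCH consults the inverted index instead of reading every transcript.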
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

- Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.

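The capture-to-transcription hand-off above is a classic producer/consumer queue. A minimal sketch, with a stand-in `fake_transcribe` function in place of a real local Whisper call (an assumption, not ScreenPipe's API):

```python
import queue
import threading

# Capture thread enqueues chunks; a worker drains the queue and "transcribes".
work: queue.Queue = queue.Queue()
done: list[str] = []

def fake_transcribe(chunk: bytes) -> str:
    # Stand-in for a local speech-to-text call (assumption).
    return f"transcript of {len(chunk)} bytes"

def worker() -> None:
    while True:
        chunk = work.get()
        if chunk is None:          # sentinel: capture has stopped
            break
        done.append(fake_transcribe(chunk))
        work.task_done()

t = threading.Thread(target=worker)
t.start()
for chunk in (b"aa", b"bbbb", b"cccccc"):
    work.put(chunk)                # chunks pile up here during a busy conversation
work.put(None)
t.join()
print(done)
# → ['transcript of 2 bytes', 'transcript of 4 bytes', 'transcript of 6 bytes']
```

If chunks arrive faster than the worker drains them, the queue grows: that backlog is exactly the WIP stage the answer describes.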
4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

- The SQLite database: this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
- The media folders: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- Temp files: if you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

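The "done vs. not done" check can be automated by comparing media files on disk against the paths the database knows about. This is a hypothetical helper: the `audio_chunks` table name, `file_path` column, and flat `.mp4` layout are assumptions, not ScreenPipe's confirmed schema.

```python
import sqlite3
import tempfile
from pathlib import Path

def untranscribed(media_dir: Path, db: sqlite3.Connection) -> list[str]:
    """Media files present on disk but absent from the database index (assumed schema)."""
    indexed = {row[0] for row in db.execute("SELECT file_path FROM audio_chunks")}
    return sorted(str(p) for p in media_dir.glob("*.mp4") if str(p) not in indexed)

# Demo with a throwaway directory and an in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_chunks (file_path TEXT)")
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    for name in ("a.mp4", "b.mp4"):
        (root / name).touch()
    # Only a.mp4 has been "transcribed" and indexed.
    db.execute("INSERT INTO audio_chunks VALUES (?)", (str(root / "a.mp4"),))
    pending = untranscribed(root, db)
print(pending)  # only b.mp4 is still pending
```

Before trying this against a real archive, check the actual table names with `sqlite3 <db file> .schema`, since the ones above are guesses.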
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38964
|
1442
|
12
|
2026-05-14T06:31:34.521389+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740294521_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders (e.g. ~/.screenpipe/data/): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunk files, or locked database journals (like db.sqlite-wal or db.sqlite-journal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
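The journal-file check above is easy to script. This sketch simulates the directory with temp files so it is self-contained; on a real system you would point `root` at your ScreenPipe directory. The specific suffixes are SQLite's standard journal/WAL names, not ScreenPipe-specific.

```python
import os
import tempfile

# Simulated ScreenPipe directory: a database plus live SQLite journal files.
root = tempfile.mkdtemp()
for name in ("db.sqlite", "db.sqlite-wal", "db.sqlite-shm"):
    open(os.path.join(root, name), "w").close()

# A present -wal/-shm/-journal file means writes may still be in flight.
journals = [f for f in os.listdir(root)
            if f.endswith(("-wal", "-shm", "-journal"))]
print("processing backlog likely in progress" if journals else "idle")
```

Note this is only a heuristic: WAL files can linger briefly after a commit, so their presence suggests, rather than proves, active processing.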
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript but lose the ability to actually listen to the context, tone, or nuance of the original conversation.
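The path-plus-timestamp lookup can be sketched as follows. The schema here (an `audio_transcriptions` table with `file_path` and `offset_seconds` columns) is hypothetical, chosen only to illustrate how a transcript row can seek into its source .mp4; `ffplay -ss` is a real FFmpeg option that starts playback at a given offset.

```python
import sqlite3

# Hypothetical schema illustrating "time machine" playback.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE audio_transcriptions
              (transcription TEXT, file_path TEXT, offset_seconds REAL)""")
db.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    ("ship the release on Friday",
     "/Users/me/.screenpipe/data/System Audio (output)_2026-05-11_06-17-14.mp4",
     734.5))

# Find the transcript row, then seek straight to the spoken phrase.
text, path, offset = db.execute(
    "SELECT * FROM audio_transcriptions WHERE transcription LIKE '%Friday%'"
).fetchone()
print(f'ffplay -ss {offset} "{path}"')  # -ss jumps to the offset in seconds
```

The command is printed rather than executed; a real UI would do the equivalent seek in its own media player.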
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like...
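The "soft foreign key" idea can be demonstrated with a small sketch: transcript rows store plain file-path strings, with no database-enforced relationship to the files themselves. Column names are hypothetical.

```python
import sqlite3

# Transcript rows point at media files by path string only; SQLite never
# verifies the files exist. That is what makes the keys "soft".
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_transcriptions (transcription TEXT, file_path TEXT)")
rows = [
    ("morning standup", "data/MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"),
    ("demo audio",      "data/System Audio (output)_2026-05-11_06-17-14.mp4"),
    ("standup part 2",  "data/MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"),
]
db.executemany("INSERT INTO audio_transcriptions VALUES (?, ?)", rows)

# Each distinct path is one media file the database references but never owns.
paths = [p for (p,) in db.execute(
    "SELECT DISTINCT file_path FROM audio_transcriptions ORDER BY file_path")]
print(len(paths))  # 2 distinct media files backing 3 transcript rows
```

A practical consequence: if you delete an .mp4 from data/ by hand, the transcript rows pointing at it remain valid text but lose their playback target.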
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-2514742890250730547
|
8635993258403199957
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
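The chunking step can be sketched in a few lines. The 16 kHz sample rate and 30-second window below are illustrative assumptions, not ScreenPipe's documented settings:

```python
# Minimal sketch of splitting a continuous sample stream into fixed-length
# chunks. The sample rate and chunk length are assumed values for
# illustration, not ScreenPipe's actual configuration.
SAMPLE_RATE = 16_000          # samples per second (assumed)
CHUNK_SECONDS = 30            # chunk window (assumed)
CHUNK_SIZE = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples):
    """Yield successive fixed-size chunks; the final partial chunk is kept."""
    for start in range(0, len(samples), CHUNK_SIZE):
        yield samples[start:start + CHUNK_SIZE]

stream = [0.0] * (SAMPLE_RATE * 65)     # 65 seconds of silence
chunks = list(chunk_stream(stream))
print(len(chunks))                      # 3 chunks: 30s + 30s + 5s
print(len(chunks[-1]) / SAMPLE_RATE)    # 5.0
```

The key property for the pipeline is that each chunk is a bounded unit of work, so a slow transcription pass never blocks capture of the next chunk.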
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
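An FTS5-backed search looks like this in miniature. The table and column names here are illustrative assumptions, not ScreenPipe's exact schema:

```python
import sqlite3

# Illustrative sketch of FTS5 full-text search over transcripts. The table
# and column names are assumed for this example, not ScreenPipe's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text)")
conn.executemany(
    "INSERT INTO transcripts VALUES (?)",
    [("let's move the standup to Thursday",),
     ("the quarterly budget looks tight",)],
)
# MATCH uses the FTS5 index, so this stays fast even over months of audio.
hits = conn.execute(
    "SELECT text FROM transcripts WHERE transcripts MATCH 'budget'"
).fetchall()
print(hits)  # [('the quarterly budget looks tight',)]
```

Because the index is token-based rather than a `LIKE` scan, query time grows with the number of matches rather than the size of the archive.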
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
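The three steps above amount to a producer/consumer pipeline. A minimal sketch, with a stub standing in for the Whisper call:

```python
import queue
import sqlite3

# Producer/consumer sketch of the WIP stage. transcribe() is a stub for the
# Whisper call; real chunks would be audio buffers, not integers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE transcriptions (chunk_id INTEGER, text TEXT)")

pending = queue.Queue()               # the processing queue (the WIP stage)

def transcribe(chunk_id):             # stub for the actual Whisper inference
    return f"transcript of chunk {chunk_id}"

# Producer: buffered chunks arrive faster than they can be processed.
for chunk_id in (1, 2, 3):
    pending.put(chunk_id)

# Consumer: drain the queue, committing each result (finalization).
while not pending.empty():
    chunk_id = pending.get()
    db.execute("INSERT INTO transcriptions VALUES (?, ?)",
               (chunk_id, transcribe(chunk_id)))
db.commit()

print(db.execute("SELECT COUNT(*) FROM transcriptions").fetchone()[0])  # 3
```

The point of the queue is backpressure: capture never waits on transcription, it just lets the backlog grow until the engine catches up.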
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (such as SQLite's -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
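Put concretely, a directory listing can be sorted into those three buckets by suffix. The -wal/-journal suffixes are standard SQLite artifacts; any other temp-file naming here is a guess, not ScreenPipe's documented behavior:

```python
# Rough classifier for files in a ~/.screenpipe listing. The "-wal" and
# "-journal" suffixes are standard SQLite side files; everything else is an
# assumption for illustration.
def classify(filename: str) -> str:
    if filename.endswith((".mp4", ".jpg")):
        return "archive"   # permanent media, referenced by the database
    if filename.endswith((".sqlite-wal", ".sqlite-journal",
                          ".db-wal", ".db-journal")):
        return "wip"       # the database is mid-write: processing in flight
    if filename.endswith((".sqlite", ".db")):
        return "ledger"    # the master index itself
    return "unknown"

listing = [
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    "db.sqlite",
    "db.sqlite-wal",
]
print([classify(f) for f in listing])  # ['archive', 'ledger', 'wip']
```

Note the suffix checks must run in this order, since "db.sqlite-wal" also ends in characters that would match a looser ".sqlite" test placed first.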
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
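A sketch of how a UI could turn a stored filename plus a spoken-word timestamp into a seek offset. The filename format follows the files you listed; the offset arithmetic is an assumption about how playback seeking would work, not ScreenPipe's actual code:

```python
import re
from datetime import datetime

# Parse device, direction, and start time from a ScreenPipe-style filename,
# then compute the in-file seek offset for a keyword spoken at a known
# absolute time. Illustrative only; the real UI's logic may differ.
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)"
    r"_(?P<ts>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)

def seek_offset(filename: str, spoken_at: datetime) -> float:
    """Seconds into the file at which to start playback."""
    m = PATTERN.match(filename)
    start = datetime.strptime(m.group("ts"), "%Y-%m-%d_%H-%M-%S")
    return (spoken_at - start).total_seconds()

name = "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"
offset = seek_offset(name, datetime(2026, 5, 12, 12, 19, 23))
print(offset)  # 120.0 -> seek two minutes into the recording
```

Because the file's start time is encoded in its name, the database only needs to store an absolute timestamp per transcript row; the offset falls out of simple subtraction.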
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model (or if a radically better speaker diarization algorithm is released), having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like...
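That "soft foreign key" arrangement can be demonstrated with two toy tables. Table names, columns, and values are assumptions for illustration, not ScreenPipe's actual schema:

```python
import sqlite3

# Toy model of the "soft foreign key" layout: a transcription row references
# an audio-chunk row, which stores a plain file path into the data folder.
# All names here are illustrative, not ScreenPipe's real schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_chunks (id INTEGER PRIMARY KEY, file_path TEXT)")
db.execute("""CREATE TABLE audio_transcriptions
              (audio_chunk_id INTEGER, offset_seconds REAL, transcription TEXT)""")
db.execute("INSERT INTO audio_chunks VALUES "
           "(1, 'System Audio (output)_2026-05-11_06-17-14.mp4')")
db.execute("INSERT INTO audio_transcriptions VALUES (1, 42.5, 'see you tomorrow')")

# A search result carries both the text and exactly which file (and where
# in it) to replay -- the link is just an integer, enforced by convention.
row = db.execute("""
    SELECT c.file_path, t.offset_seconds, t.transcription
    FROM audio_transcriptions t
    JOIN audio_chunks c ON c.id = t.audio_chunk_id
    WHERE t.transcription LIKE '%tomorrow%'
""").fetchone()
print(row)
```

"Soft" here means the database never declares a SQL FOREIGN KEY to the filesystem: if you delete an .mp4 by hand, the transcript row survives but its playback link silently dangles.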
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-7474204610702402041
|
8635993256255712215
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
- System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
- The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
- Diarization: As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
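Conceptually, the FTS5 indexing step can be sketched with Python's built-in sqlite3 module. Note that the table and column names below are illustrative only, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory database standing in for the local SQLite store.
conn = sqlite3.connect(":memory:")

# An FTS5 virtual table indexes transcript text for full-text search.
conn.execute(
    "CREATE VIRTUAL TABLE transcripts_fts USING fts5(transcription, timestamp)"
)
conn.executemany(
    "INSERT INTO transcripts_fts VALUES (?, ?)",
    [
        ("let us move the deadline to Friday", "2026-05-12T09:15:00"),
        ("the quarterly budget looks fine", "2026-05-12T10:02:00"),
    ],
)

# MATCH performs the full-text lookup; this is what makes searching
# weeks of transcripts feel instant.
rows = conn.execute(
    "SELECT timestamp, transcription FROM transcripts_fts "
    "WHERE transcripts_fts MATCH 'deadline'"
).fetchall()
print(rows)  # one hit: the row where "deadline" was spoken
```

The same MATCH query pattern works for phrases and boolean operators, which is why an FTS index beats a LIKE scan once the archive grows.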
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
- Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
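That buffer → queue → finalize flow is a classic producer/consumer pipeline, which can be sketched like this (the fake_transcribe function is a stand-in for the real Whisper call, and the in-memory list stands in for the SQLite commit):

```python
import queue
import threading

audio_chunks = queue.Queue()  # the WIP stage: chunks waiting for transcription
database = []                 # stand-in for the SQLite commit step

def fake_transcribe(chunk):
    # Placeholder for the (slow) Whisper call on a raw audio chunk.
    return f"text for {chunk}"

def worker():
    # Drain the queue: each finished chunk is "finalized" into the database.
    while True:
        chunk = audio_chunks.get()
        if chunk is None:  # sentinel: the recorder stopped
            break
        database.append((chunk["timestamp"], fake_transcribe(chunk["audio"])))
        audio_chunks.task_done()

t = threading.Thread(target=worker)
t.start()

# The capture layer can produce chunks faster than transcription keeps up;
# they simply line up in the queue until the worker catches up.
for i in range(3):
    audio_chunks.put({"timestamp": f"2026-05-12T06:49:{i:02d}", "audio": f"chunk-{i}"})

audio_chunks.put(None)
t.join()
print(len(database))  # 3 — every chunk eventually committed
```

The key property is that nothing is lost during a backlog: chunks wait in the queue (RAM or temp files) until the worker finalizes them.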
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
- The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
- The … or … folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- Temp Files: If you see rapidly changing files, temporary … chunks, or locked database journals (like …), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
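You could apply this done-vs-in-progress check yourself with a short standard-library script. The temp-file suffixes below are assumptions based on common SQLite journal naming, not a documented ScreenPipe contract:

```python
import pathlib

def classify(data_dir):
    """Split a ScreenPipe-style data directory into finished media vs. WIP files."""
    done, in_progress = [], []
    for path in pathlib.Path(data_dir).rglob("*"):
        if not path.is_file():
            continue
        # SQLite journals/WAL files and partial chunks suggest active processing.
        if path.name.endswith(("-wal", "-shm", "-journal", ".tmp")):
            in_progress.append(path.name)
        else:
            # Permanent archive: .mp4 audio, .jpg frames, the database itself.
            done.append(path.name)
    return done, in_progress
```

Running this while ScreenPipe is catching up on a backlog would show journal and temp entries in the second list; once the engine is idle, that list should be empty or nearly so.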
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
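The search-to-playback hop can be illustrated with a simplified two-table schema. The table and column names here are invented for illustration; ScreenPipe's real schema differs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audio_chunks (id INTEGER PRIMARY KEY, file_path TEXT);
    CREATE TABLE transcriptions (
        chunk_id INTEGER REFERENCES audio_chunks(id),
        offset_seconds REAL,      -- where in the .mp4 the phrase was spoken
        transcription TEXT
    );
""")
conn.execute(
    "INSERT INTO audio_chunks VALUES "
    "(1, 'System Audio (output)_2026-05-11_06-17-14.mp4')"
)
conn.execute("INSERT INTO transcriptions VALUES (1, 83.5, 'we shipped the release')")

# A text hit resolves to (file, offset): exactly what a UI needs to seek
# the player to the moment the keyword was spoken.
row = conn.execute("""
    SELECT c.file_path, t.offset_seconds
    FROM transcriptions t JOIN AUDIO_CHUNKS c ON c.id = t.chunk_id
    WHERE t.transcription LIKE '%release%'
""").fetchone()
print(row)  # ('System Audio (output)_2026-05-11_06-17-14.mp4', 83.5)
```

The point of the sketch: the transcript row carries enough metadata to jump straight back into the original media, which is why deleting the .mp4 files breaks playback but not search.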
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
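A re-processing pass over the archive could look roughly like this. Here transcribe_fn stands in for whatever newer model you load (for example, a larger Whisper checkpoint), since the actual invocation depends on your setup, and the transcripts table is a simplified stand-in for the real schema:

```python
import pathlib
import sqlite3

def reprocess(data_dir, db_path, transcribe_fn):
    """Re-run a (presumably better) model over archived audio, replacing old text."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts "
        "(file_path TEXT PRIMARY KEY, transcription TEXT)"
    )
    count = 0
    for mp4 in sorted(pathlib.Path(data_dir).glob("*.mp4")):
        text = transcribe_fn(str(mp4))  # the expensive model call
        # INSERT OR REPLACE overwrites the old transcript for that file.
        conn.execute(
            "INSERT OR REPLACE INTO transcripts VALUES (?, ?)", (str(mp4), text)
        )
        count += 1
    conn.commit()
    return count
```

Because the audio filenames encode device and timestamp, a pass like this can also be filtered to a date range or a single input device before spending GPU time.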
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions…
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38960
|
1442
|
10
|
2026-05-14T06:31:30.019404+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740290019_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said:
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
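The chunking idea can be sketched in a few lines. The sample rate and chunk length below are illustrative assumptions (16 kHz is typical Whisper input, but the answer above does not state ScreenPipe's actual chunk duration):

```python
SAMPLE_RATE = 16_000   # samples per second; a common rate for speech models
CHUNK_SECONDS = 30     # illustrative chunk length, not ScreenPipe's actual value

def chunk_stream(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Split a continuous run of samples into fixed-duration chunks."""
    size = sample_rate * chunk_seconds
    return [samples[i:i + size] for i in range(0, len(samples), size)]

# 70 seconds of audio becomes three chunks: 30 s, 30 s, and a 10 s remainder.
chunks = chunk_stream([0] * (SAMPLE_RATE * 70))
print([len(c) // SAMPLE_RATE for c in chunks])  # [30, 30, 10]
```

Fixed-size chunks like this are what let a continuous 24/7 stream be handed to the transcription engine piece by piece.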
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
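A minimal FTS5 sketch shows why that search is instant. The virtual-table name and columns here are made up for illustration; only the FTS5 mechanism itself is the point:

```python
import sqlite3

# FTS5 builds an inverted index over the text column, so MATCH queries
# do not scan every row the way LIKE '%phrase%' would.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
conn.execute("INSERT INTO transcripts VALUES "
             "('we agreed to ship the beta on friday', '2026-04-23')")
conn.execute("INSERT INTO transcripts VALUES "
             "('lunch plans for tomorrow', '2026-04-24')")

# Multiple terms are ANDed by default: both words must appear in the row.
hits = conn.execute(
    "SELECT timestamp FROM transcripts WHERE transcripts MATCH 'ship beta'"
).fetchall()
print(hits)
```

This is the same mechanism that lets you pull up one sentence from weeks of recorded conversation in milliseconds.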
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunk files, or locked database journals (such as -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
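A small script can turn that rule of thumb into a quick done-vs-WIP check. The suffix lists below are assumptions for illustration (.jpg appears in the archive paths, .mp4 is mentioned earlier, and -wal/-journal are standard SQLite journal suffixes); verify what your own install actually writes before relying on them:

```python
from pathlib import Path

DONE = {".mp4", ".mp3", ".jpg"}            # finalized media (assumed suffixes)
WIP_HINTS = (".tmp", "-wal", "-journal")   # temp chunks / SQLite journal files

def classify(paths):
    """Sort files into finalized media vs. likely work-in-progress."""
    done, wip = [], []
    for p in paths:
        if any(p.name.endswith(h) for h in WIP_HINTS):
            wip.append(p.name)
        elif p.suffix in DONE:
            done.append(p.name)
    return done, wip

# In practice you would walk Path.home() / ".screenpipe" here.
sample = [Path("1778740362914_m2.jpg"), Path("db.sqlite-wal"), Path("chunk.tmp")]
done, wip = classify(sample)
print(done, wip)
```

A non-empty WIP list while ScreenPipe is running is normal; one that never shrinks suggests the transcription engine is falling behind.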
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
-6122458525774068188
|
9212447352970427287
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
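The FTS5 pattern described above can be demonstrated with Python's built-in sqlite3 module. The table and column names here are invented for illustration and do not reflect ScreenPipe's real schema:

```python
import sqlite3

# Illustrative FTS5 usage: index transcript rows, then search a phrase.
# Table name "transcripts" and its columns are made-up examples.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text, ts)")
con.execute(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    ("speaker_1", "let's move the quarterly review to Thursday",
     "2026-05-14T06:30:00"),
)
# MATCH performs a full-text search over all indexed columns:
rows = con.execute(
    "SELECT speaker, ts FROM transcripts WHERE transcripts MATCH ?",
    ("quarterly review",),
).fetchall()
# rows -> [('speaker_1', '2026-05-14T06:30:00')]
```

The point is that the search runs against a tokenized index rather than scanning raw text, which is why lookups stay fast even across weeks of transcribed audio.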
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
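The buffer, queue, and finalization steps above form a classic producer/consumer pipeline, which can be modeled in a few lines. The "transcriber" here is a trivial stand-in for Whisper, not anything ScreenPipe ships:

```python
import queue
import threading

# Toy model of the WIP stage: captured chunks queue up, a worker drains
# the queue, "transcribes" each chunk (here: uppercases it), and
# appends the result, standing in for the database commit.
work: "queue.Queue[str | None]" = queue.Queue()
done: list = []

def worker():
    while True:
        chunk = work.get()
        if chunk is None:           # sentinel: capture has stopped
            break
        done.append(chunk.upper())  # stand-in for transcription + commit

t = threading.Thread(target=worker)
t.start()
for chunk in ["chunk-a", "chunk-b", "chunk-c"]:
    work.put(chunk)                 # producer: the capture layer
work.put(None)
t.join()
# done -> ['CHUNK-A', 'CHUNK-B', 'CHUNK-C']
```

If chunks arrive faster than the worker can process them, the queue grows; that growing backlog is exactly the "WIP" state the answer describes.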
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (such as -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
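Assuming ScreenPipe's database behaves like any SQLite database, the journal-file heuristic above can be checked with a small hypothetical helper. The function name and file patterns are illustrative, based on standard SQLite behavior rather than ScreenPipe documentation:

```python
from pathlib import Path
import tempfile

def looks_busy(root: Path) -> bool:
    """Heuristic: a SQLite -wal or -journal file next to the database
    suggests writes are still in flight (the WIP stage)."""
    return any(root.glob("*-wal")) or any(root.glob("*-journal"))

# Demonstrate against a throwaway directory mimicking the layout:
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "db.sqlite").touch()
    assert not looks_busy(root)        # no journal: engine looks idle
    (root / "db.sqlite-wal").touch()
    assert looks_busy(root)            # WAL present: backlog likely
```

Note this is only a heuristic: a -wal file can also linger after a crash, so an empty queue of temp chunks plus a quiet journal is a stronger "caught up" signal than either alone.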
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
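That chunking step can be sketched in a few lines. This is purely illustrative, not ScreenPipe's actual code; the sample rate and chunk length are assumed values:

```python
# Illustrative only: split a continuous stream of audio samples into
# fixed-duration chunks, the way a 24/7 recorder hands work downstream.
SAMPLE_RATE = 16_000   # samples per second (assumed)
CHUNK_SECONDS = 30     # chunk length in seconds (assumed)

def chunk_samples(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Yield successive fixed-size slices of the sample buffer."""
    size = sample_rate * chunk_seconds
    for start in range(0, len(samples), size):
        yield samples[start:start + size]

# 65 seconds of audio becomes three chunks: 30 s, 30 s, and a 5 s remainder.
chunks = list(chunk_samples([0] * (SAMPLE_RATE * 65)))
```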
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization": a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
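A minimal sketch of what FTS5 search looks like with Python's built-in sqlite3 module. The table and column names here are made up for illustration and are not ScreenPipe's real schema:

```python
import sqlite3

# Toy FTS5 index: insert two "transcripts", then search by keyword.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "2026-05-12T12:17:23"),
        ("unrelated small talk", "2026-05-12T12:20:00"),
    ],
)
# MATCH uses the full-text index, so this stays fast even at scale.
rows = conn.execute(
    "SELECT timestamp FROM transcripts WHERE transcripts MATCH ?",
    ('"deadline"',),
).fetchall()
# rows -> [('2026-05-12T12:17:23',)]
```

This is the mechanism that makes "search for a phrase from three weeks ago" instant: the index is consulted instead of scanning every row of text.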
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
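The buffer-to-queue-to-commit flow above can be modeled with a standard producer/consumer queue. This is a toy model, not ScreenPipe's implementation, and all names are illustrative:

```python
import queue
import threading

# Captured chunks wait in a queue (the WIP stage) until the slower
# transcription worker drains them.
work = queue.Queue()
transcripts = []

def transcriber():
    while True:
        chunk = work.get()
        if chunk is None:                        # sentinel: recording stopped
            break
        transcripts.append(f"text for {chunk}")  # stand-in for Whisper output

worker = threading.Thread(target=transcriber)
worker.start()
for i in range(3):                               # chunks arrive as recorded
    work.put(f"chunk-{i}")
work.put(None)
worker.join()                                    # backlog cleared
```

The key property is the same as in ScreenPipe's pipeline: capture never blocks on transcription; it just grows the queue, and the worker catches up when resources allow.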
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
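One hypothetical way to check "done vs. not done" yourself is to diff the media files on disk against rows in the database. The table and column names below are assumptions for illustration only; inspect the real schema first (e.g. with .tables in the sqlite3 shell) and substitute the actual names:

```python
import sqlite3
from pathlib import Path

def untranscribed(data_dir: str, db_path: str) -> list[str]:
    """Return media files that have no matching row in the database yet.

    Assumes a table `transcripts` with a `file_path` column; these names
    are hypothetical, so adapt them to your actual database schema.
    """
    conn = sqlite3.connect(db_path)
    indexed = {row[0] for row in conn.execute("SELECT file_path FROM transcripts")}
    on_disk = {str(p) for p in Path(data_dir).glob("*.mp4")}
    return sorted(on_disk - indexed)  # on disk but not yet indexed
```

Anything this returns would be sitting in the WIP stage: captured to disk, but not yet committed to the text index.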
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-5254147595515345823
|
9212447352970443671
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
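The chunking step described above can be sketched roughly like this. This is a minimal illustration, not ScreenPipe's actual code; the 30-second chunk length and the 16 kHz sample rate are assumptions:

```python
# Minimal sketch: slicing a continuous audio stream into fixed-length
# chunks, as the capture layer does before handing audio to the
# transcription engine. Chunk length and sample rate are assumptions.
CHUNK_SECONDS = 30
SAMPLE_RATE = 16_000  # samples per second

def chunk_stream(samples):
    """Yield lists of samples, each covering up to CHUNK_SECONDS of audio."""
    size = CHUNK_SECONDS * SAMPLE_RATE
    for start in range(0, len(samples), size):
        yield samples[start:start + size]

# e.g. 65 seconds of audio -> three chunks (30 s, 30 s, 5 s)
stream = [0.0] * (65 * SAMPLE_RATE)
chunks = list(chunk_stream(stream))
```

In a real capture loop the samples would arrive incrementally from the OS audio API rather than as one list, but the fixed-size slicing idea is the same.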
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
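An FTS5 lookup of this kind can be demonstrated with Python's built-in sqlite3 module, assuming your SQLite build includes FTS5 (most modern ones do). The table and column names here are made up for the sketch; ScreenPipe's actual schema may differ:

```python
import sqlite3

# Build a tiny in-memory FTS5 index and search it, mirroring how the
# transcript database supports instant phrase search. Table/column
# names are illustrative, not ScreenPipe's real schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text)")
db.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("speaker_0", "let's move the deadline to next Friday"),
        ("speaker_1", "agreed, I'll update the project plan"),
    ],
)
# MATCH runs a full-text query against the FTS5 index
rows = db.execute(
    "SELECT speaker, text FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",),
).fetchall()
```

Because FTS5 indexes every token, the query stays fast even with months of transcripts, which is the point of indexing the text rather than grepping raw files.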
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
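The buffer-queue-finalize flow above can be sketched as a simple producer/consumer loop. This is purely illustrative: the chunk names and the stand-in transcribe function are assumptions, not ScreenPipe internals:

```python
import queue

# Illustrative producer/consumer pipeline: the capture layer puts raw
# chunks on a queue; a worker drains it, "transcribes" each chunk, and
# commits the result (standing in for an SQLite insert).
wip = queue.Queue()   # the WIP stage: chunks waiting for Whisper
committed = []        # stands in for rows written to the database

def transcribe(chunk):
    # placeholder for the local Whisper call
    return f"text-of-{chunk}"

# producer: capture layer enqueues chunks as they are recorded
for chunk in ["chunk-001", "chunk-002", "chunk-003"]:
    wip.put(chunk)

# consumer: the transcription worker drains the backlog
while not wip.empty():
    committed.append(transcribe(wip.get()))

# once the queue is empty, everything is "done" and searchable
```

The backlog you may notice on disk corresponds to the state where the producer has run ahead of the consumer, i.e. the queue is non-empty.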
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders (e.g. ~/.screenpipe/data/): this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: if you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
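One rough way to apply the "master ledger" idea yourself is to list the media files on disk and check which of them the database already references. Everything schema-related here (the table `audio_chunks` and its `file_path` column) is an assumption for illustration; inspect your own database's schema first:

```python
import os
import sqlite3
import tempfile

# Sketch: cross-check media files on disk against the transcript DB to
# see which chunks are "done". The schema (table `audio_chunks`, column
# `file_path`) is a guess for illustration; check your real database.
def pending_files(data_dir, db):
    indexed = {row[0] for row in db.execute("SELECT file_path FROM audio_chunks")}
    on_disk = {
        os.path.join(data_dir, name)
        for name in os.listdir(data_dir)
        if name.endswith(".mp4")
    }
    return sorted(on_disk - indexed)  # files not yet in the ledger

# demo with a throwaway directory and an in-memory database
data_dir = tempfile.mkdtemp()
done = os.path.join(data_dir, "System Audio (output)_2026-05-11_06-17-14.mp4")
todo = os.path.join(data_dir, "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4")
for path in (done, todo):
    open(path, "w").close()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_chunks (file_path TEXT)")
db.execute("INSERT INTO audio_chunks VALUES (?)", (done,))

backlog = pending_files(data_dir, db)  # files still awaiting transcription
```

Any file that shows up in the backlog set is still in the WIP stage; once its transcript lands in the database, it drops out of the result.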
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?
Gemini said
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How does the audio get transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has and hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
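The capture-and-chunk step described above can be sketched in Python. This is an illustrative sketch, not ScreenPipe's actual code; the 16 kHz sample rate and the 30-second chunk length are assumptions.

```python
# Sketch: splitting a continuous audio stream into fixed-duration chunks.
# The sample rate and chunk length are illustrative assumptions,
# not ScreenPipe's real parameters.
SAMPLE_RATE = 16_000        # samples per second (typical for speech models)
CHUNK_SECONDS = 30          # assumed chunk duration
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples):
    """Yield successive fixed-size chunks from a flat list of samples."""
    for start in range(0, len(samples), CHUNK_SAMPLES):
        yield samples[start:start + CHUNK_SAMPLES]

# Simulate 65 seconds of captured audio (zeros stand in for PCM data).
stream = [0] * (SAMPLE_RATE * 65)
chunks = list(chunk_stream(stream))
print(len(chunks))                    # 3 chunks: 30 s + 30 s + 5 s
print(len(chunks[-1]) / SAMPLE_RATE)  # 5.0 seconds in the final partial chunk
```

The trailing partial chunk shows why chunking a live stream always leaves a short "in flight" tail waiting for the next stage.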
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure a cloud provider such as Deepgram for faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
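The FTS5 idea can be demonstrated with Python's built-in sqlite3 module, assuming your SQLite build includes the FTS5 extension. The table and column names below are hypothetical, not ScreenPipe's real schema.

```python
import sqlite3

# Sketch of the FTS5 indexing/search idea. "transcripts" and its columns
# are hypothetical names; ScreenPipe's actual schema may differ.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
con.executemany(
    "INSERT INTO transcripts (text, timestamp) VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "2026-04-21T10:03:00"),
        ("the quarterly budget review is next week", "2026-04-22T14:17:00"),
    ],
)
# Full-text search: find a phrase from weeks ago instantly.
row = con.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",),
).fetchone()
print(row[0])  # timestamp of the matching utterance
```

The `UNINDEXED` marker keeps the timestamp stored alongside the text without bloating the full-text index.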
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
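The buffer-queue-finalize flow above can be sketched as a simple in-order drain. The `transcribe()` function here is a placeholder standing in for the Whisper call; nothing in this sketch is ScreenPipe's actual implementation.

```python
from collections import deque

pending = deque()   # the "work in progress" queue of raw chunks
database = []       # stands in for the SQLite commit

def transcribe(chunk):
    # Placeholder for the Whisper call (an assumption, not a real API).
    return f"transcript of {chunk}"

# Buffering: capture can produce chunks faster than we transcribe them.
for chunk in ("chunk-001", "chunk-002", "chunk-003"):
    pending.append(chunk)

# Processing queue + finalization: drain the backlog in order,
# committing each finished transcript with its source chunk.
while pending:
    chunk = pending.popleft()
    database.append((chunk, transcribe(chunk)))

print(len(pending))    # 0: backlog cleared, no WIP left
print(database[0][1])  # transcript of chunk-001
```

While `pending` is non-empty, the system is in exactly the WIP state the folder-structure section below describes.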
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (such as SQLite -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
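One way to probe this yourself is a small script that classifies files by name. The suffix patterns used here (`.tmp` chunks, a `-wal` journal) are illustrative assumptions; check your own ~/.screenpipe contents for the real naming.

```python
import tempfile
from pathlib import Path

# Sketch: telling "done" from "in progress" by filename, as described above.
# The suffix conventions are assumptions, not ScreenPipe's documented layout.
def classify(data_dir: Path) -> dict:
    status = {"archived": [], "in_progress": []}
    for f in sorted(data_dir.iterdir()):
        if f.suffix == ".tmp" or f.name.endswith("-wal"):
            status["in_progress"].append(f.name)
        else:
            status["archived"].append(f.name)
    return status

# Build a fake data directory to demonstrate the classification.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    for name in ("System Audio (output)_2026-05-11_06-17-14.mp4",
                 "db.sqlite", "db.sqlite-wal", "chunk_0042.tmp"):
        (root / name).touch()
    status = classify(root)
    print(status["in_progress"])  # the WIP files
    print(status["archived"])     # the finished archive
```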
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, and System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file-path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files, you keep the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
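The lookup described here can be sketched as a text-to-offset query. The `audio_transcriptions` table, its columns, and the offset value are hypothetical, not ScreenPipe's actual schema.

```python
import sqlite3

# Sketch of the "time machine" lookup: a transcript row carries the media
# file path and an offset, so a text hit can seek into the .mp4.
# Table and column names are hypothetical assumptions.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE audio_transcriptions "
    "(text TEXT, file_path TEXT, offset_seconds REAL)"
)
con.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    ("let's finalize the budget",
     "data/System Audio (output)_2026-05-11_06-17-14.mp4",
     754.2),
)
path, offset = con.execute(
    "SELECT file_path, offset_seconds FROM audio_transcriptions "
    "WHERE text LIKE ?",
    ("%budget%",),
).fetchone()
print(path)    # the .mp4 to open
print(offset)  # seek here before playing
```

A player would open `path` and seek to `offset` seconds, which is why deleting the .mp4 files breaks playback but not search.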
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
3573577467480521573
|
8635993256255712215
|
click
|
accessibility
|
NULL
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
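The chunking step described above can be sketched in a few lines. The 30-second chunk duration and 16 kHz sample rate below are illustrative assumptions for speech audio, not ScreenPipe's actual parameters:

```python
# Sketch: splitting a continuous capture stream into fixed-length chunks
# before transcription. Chunk size and sample rate are assumptions.

SAMPLE_RATE = 16_000        # 16 kHz mono, a common speech-to-text input rate
CHUNK_SECONDS = 30          # assumed chunk duration
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def split_into_chunks(samples: list) -> list:
    """Break a continuous sample buffer into fixed-size chunks;
    the final chunk may be shorter than the rest."""
    return [samples[i:i + CHUNK_SAMPLES]
            for i in range(0, len(samples), CHUNK_SAMPLES)]

# 65 seconds of silence -> three chunks: 30 s, 30 s, and a 5 s remainder
stream = [0.0] * (SAMPLE_RATE * 65)
chunks = split_into_chunks(stream)
print([len(c) // SAMPLE_RATE for c in chunks])  # [30, 30, 5]
```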
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
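The FTS5 search described above can be demonstrated with Python's built-in sqlite3 module (this assumes your bundled SQLite was compiled with FTS5, the default in most Python distributions). The table name and columns here are illustrative, not ScreenPipe's actual schema:

```python
# Minimal sketch of full-text search over transcripts using SQLite FTS5.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
con.execute("INSERT INTO transcripts VALUES (?, ?)",
            ("let's move the deadline to Friday", "2026-05-12T09:15:00"))
con.execute("INSERT INTO transcripts VALUES (?, ?)",
            ("the build is green again", "2026-05-12T10:02:00"))

# MATCH gives an indexed full-text lookup over the transcript text
rows = con.execute(
    "SELECT timestamp FROM transcripts WHERE transcripts MATCH 'deadline'"
).fetchall()
print(rows)  # [('2026-05-12T09:15:00',)]
```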
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
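The buffer-queue-finalize flow above is a classic producer/consumer pattern. A minimal sketch, with a stub standing in for the Whisper model:

```python
# Sketch of the WIP pipeline: a producer drops audio chunks into a queue
# and a single worker "transcribes" them in order, then commits results.
import queue
import threading

work = queue.Queue()
database = []  # stand-in for the SQLite commit step

def transcribe(chunk: bytes) -> str:
    return f"{len(chunk)} bytes of speech"  # stub for the real model

def worker():
    while True:
        item = work.get()
        if item is None:          # sentinel: shut down the worker
            break
        ts, chunk = item
        database.append((ts, transcribe(chunk)))  # "finalization"
        work.task_done()

t = threading.Thread(target=worker)
t.start()
for ts in ("06:49:17", "06:49:47"):   # chunks can queue up faster than
    work.put((ts, b"\x00" * 1024))    # the worker drains them
work.put(None)
t.join()
print(database)
```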
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: if you see rapidly changing files, temporary chunks, or locked database journals (such as a db.sqlite-wal file), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
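One way to answer "what has and hasn't been done" is to diff the media files on disk against the file paths the database already references. A sketch, assuming a table named `audio_chunks` with a `file_path` column (the real ScreenPipe schema may differ):

```python
# Sketch: find media files the database never references (i.e. still pending).
import sqlite3
import tempfile
from pathlib import Path

def unprocessed_files(data_dir: Path, db_path: str) -> list:
    """Return names of .mp4 files on disk that the database never mentions."""
    con = sqlite3.connect(db_path)
    try:
        done = {row[0] for row in
                con.execute("SELECT file_path FROM audio_chunks")}
    finally:
        con.close()
    return sorted(p.name for p in data_dir.glob("*.mp4") if str(p) not in done)

# Tiny demo with a throwaway directory and database:
root = Path(tempfile.mkdtemp())
for name in ("done.mp4", "pending.mp4"):
    (root / name).touch()
db = str(root / "db.sqlite")
con = sqlite3.connect(db)
con.execute("CREATE TABLE audio_chunks (file_path TEXT)")
con.execute("INSERT INTO audio_chunks VALUES (?)", (str(root / "done.mp4"),))
con.commit()
con.close()
print(unprocessed_files(root, db))  # ['pending.mp4']
```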
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
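The filenames quoted above follow a consistent pattern: device name, capture direction, and start time. A sketch of parsing them (the regex is inferred from these examples, not taken from ScreenPipe's source):

```python
# Sketch: parse "Device (input|output)_YYYY-MM-DD_HH-MM-SS.mp4" filenames.
import re
from datetime import datetime

PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)

def parse(name: str):
    """Return (device, direction, start_time) or None for non-matching names."""
    m = PATTERN.match(name)
    if m is None:
        return None
    start = datetime.strptime(m["stamp"], "%Y-%m-%d_%H-%M-%S")
    return m["device"], m["direction"], start

print(parse("System Audio (output)_2026-05-11_06-17-14.mp4"))
# ('System Audio', 'output', datetime.datetime(2026, 5, 11, 6, 17, 14))
```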
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
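The playback mechanics above can be sketched with plain sqlite3. The schema here (audio_chunks, audio_transcriptions, an offset_seconds column) is illustrative, not necessarily screenpipe's real one; the point is that a transcript row carries both a file path and a position within that file.

```python
import sqlite3

# Illustrative schema: each transcript row points at its source media
# file and the second within that file where the phrase was spoken.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE audio_chunks (id INTEGER PRIMARY KEY, file_path TEXT);
CREATE TABLE audio_transcriptions (
    audio_chunk_id INTEGER REFERENCES audio_chunks(id),
    offset_seconds REAL,
    transcription  TEXT
);
INSERT INTO audio_chunks VALUES
    (1, 'System Audio (output)_2026-05-11_06-17-14.mp4');
INSERT INTO audio_transcriptions VALUES
    (1, 754.2, 'let us revisit the budget next sprint');
""")

def locate(keyword: str) -> list[tuple[str, float]]:
    """Return (file_path, offset) pairs so a player can seek straight there."""
    return conn.execute("""
        SELECT c.file_path, t.offset_seconds
        FROM audio_transcriptions t
        JOIN audio_chunks c ON c.id = t.audio_chunk_id
        WHERE t.transcription LIKE '%' || ? || '%'
    """, (keyword,)).fetchall()

print(locate("budget"))
# [('System Audio (output)_2026-05-11_06-17-14.mp4', 754.2)]
```

Given a hit, the UI only needs to open that path and seek to the returned offset.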
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the audio into manageable chunks to prepare them for the next stage.
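The chunking step itself is just arithmetic on the sample stream. A minimal sketch, assuming 16 kHz mono 16-bit PCM and 30-second chunks (typical Whisper-friendly values, not confirmed screenpipe settings):

```python
SAMPLE_RATE = 16_000   # samples per second (assumed)
SAMPLE_WIDTH = 2       # bytes per 16-bit mono sample
CHUNK_SECONDS = 30     # assumed chunk length

def split_into_chunks(pcm: bytes) -> list[bytes]:
    """Slice a continuous PCM byte stream into fixed-length chunks;
    the final chunk may be shorter than CHUNK_SECONDS."""
    step = SAMPLE_RATE * SAMPLE_WIDTH * CHUNK_SECONDS
    return [pcm[i:i + step] for i in range(0, len(pcm), step)]

# 65 seconds of silence -> two full 30 s chunks plus a 5 s remainder
audio = bytes(SAMPLE_RATE * SAMPLE_WIDTH * 65)
chunks = split_into_chunks(audio)
print([len(c) // (SAMPLE_RATE * SAMPLE_WIDTH) for c in chunks])  # [30, 30, 5]
```

Each chunk is then an independent unit the transcription engine can pick up whenever resources free up.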
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed raw audio files and JPEG screenshots are stored permanently; think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (SQLite's -wal or -journal sidecar files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
|
6170238253643292814
|
9133637656415718295
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
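The chunking step above can be sketched in pure Python. This is an illustrative model only, not ScreenPipe's actual code; the 30-second chunk length and 16 kHz sample rate are assumptions chosen for the example.

```python
# Illustrative sketch of splitting a continuous audio stream into
# fixed-duration chunks (NOT ScreenPipe's real implementation; the
# 30 s chunk length and 16 kHz sample rate are assumed values).
SAMPLE_RATE = 16_000           # samples per second
CHUNK_SECONDS = 30             # duration handed to the speech-to-text engine
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples):
    """Yield successive fixed-size chunks from a flat sequence of samples."""
    for start in range(0, len(samples), CHUNK_SAMPLES):
        yield samples[start:start + CHUNK_SAMPLES]

# Example: 65 seconds of audio becomes chunks of 30 s, 30 s, and a 5 s tail.
stream = [0.0] * (SAMPLE_RATE * 65)
lengths = [len(c) for c in chunk_stream(stream)]
```

The point of chunking is that the transcription engine can start working on the first 30 seconds while the next 30 seconds are still being recorded, rather than waiting for a recording to "end".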
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
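A minimal sketch of what FTS5 search over transcripts looks like, using Python's built-in sqlite3 module. The table name and columns here are illustrative, not ScreenPipe's real schema.

```python
import sqlite3

# Illustrative FTS5 search over transcribed text (table/columns are
# made-up for the example, not ScreenPipe's actual schema).
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text)")
con.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("me", "let's move the quarterly review to Thursday"),
        ("colleague", "the NAS backup finished overnight"),
    ],
)
# MATCH uses the FTS5 inverted index, so this stays fast even over
# months of accumulated audio transcripts.
rows = con.execute(
    "SELECT speaker, text FROM transcripts WHERE transcripts MATCH 'quarterly'"
).fetchall()
```

This is why "search for a phrase from three weeks ago" is instant: the query hits an index rather than scanning every stored transcript.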
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
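The queue-then-finalize pattern above can be sketched with Python's standard library. The worker and the `fake_transcribe` stand-in are hypothetical placeholders for the real Whisper step, not ScreenPipe's code.

```python
import queue
import threading

# Sketch of the WIP stage: captured chunks wait in a queue until the
# transcription worker catches up. fake_transcribe is a stand-in for
# the (much slower) real Whisper call.
backlog = queue.Queue()
results = []

def fake_transcribe(chunk_id):
    return f"transcript of chunk {chunk_id}"

def worker():
    while True:
        chunk_id = backlog.get()
        if chunk_id is None:        # sentinel: capture has stopped
            break
        results.append(fake_transcribe(chunk_id))

t = threading.Thread(target=worker)
t.start()
for chunk_id in range(3):           # the capture side enqueues chunks
    backlog.put(chunk_id)           # faster than the worker may drain them
backlog.put(None)
t.join()
```

The backlog is exactly what you observe as the WIP stage: when transcription is slower than capture, unprocessed chunks pile up in the queue (or as temp files on disk) until the engine catches up.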
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder (and similar media folders): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary .mp4 chunks, or locked database journals (like db.sqlite-journal or db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
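The done/not-done check above can be automated by comparing media files on disk against transcripts in the database. The table and column names below (`audio_files`, `path`, `transcript`) are assumed purely for illustration; inspect your own db.sqlite schema (e.g., with `.schema` in the sqlite3 CLI) before adapting this.

```python
import sqlite3

# Sketch: which recordings are "done"? Compare files on disk against
# rows that have transcript text. The schema here is ASSUMED for the
# example — ScreenPipe's real table/column names may differ.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE audio_files (path TEXT, transcript TEXT)")
con.executemany(
    "INSERT INTO audio_files VALUES (?, ?)",
    [
        ("System Audio (output)_2026-05-11_06-17-14.mp4", "hello world"),
        ("MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4", None),
    ],
)

# In practice this set would come from os.listdir() on the data folder.
on_disk = {
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4",
}
done = {p for (p,) in con.execute(
    "SELECT path FROM audio_files WHERE transcript IS NOT NULL")}
pending = sorted(on_disk - done)    # recorded but not yet transcribed
```

Anything in `pending` has been captured to disk but has no transcript row yet, i.e., it is still in the WIP stage.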
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
-5036377224840580351
|
9212447352970476439
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.

ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:

System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).

Microphone Input: Everything you say into your mic.

Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
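The chunking step described above can be sketched as a simple generator. The 30-second window and 16 kHz sample rate below are illustrative assumptions, not ScreenPipe's actual configuration:

```python
# Illustrative sketch: split a continuous stream of audio samples into
# fixed-length chunks ready for transcription. The chunk length and
# sample rate are assumed values for illustration only.

SAMPLE_RATE = 16_000   # samples per second (assumed)
CHUNK_SECONDS = 30     # chunk length handed to the transcriber (assumed)
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_stream(samples):
    """Yield successive fixed-size chunks from a flat sequence of samples.

    The final, shorter chunk is yielded too, so no audio is dropped.
    """
    for start in range(0, len(samples), CHUNK_SAMPLES):
        yield samples[start:start + CHUNK_SAMPLES]

# Example: 65 seconds of silence becomes two full chunks plus a 5-second tail.
stream = [0] * (SAMPLE_RATE * 65)
chunks = list(chunk_stream(stream))
```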
2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization" (a technical term for speaker identification). It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.

Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
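The FTS5 search described here can be tried directly with Python's built-in sqlite3 module. The table and column names below are invented for illustration; ScreenPipe's actual schema may differ:

```python
import sqlite3

# Minimal FTS5 demo with an invented schema; ScreenPipe's real table and
# column names may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(content, timestamp UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts (content, timestamp) VALUES (?, ?)",
    [
        ("let's review the quarterly budget", "2026-04-23T10:02:11"),
        ("the deploy pipeline is broken again", "2026-04-23T10:05:40"),
        ("budget approval moved to next week", "2026-04-24T09:15:02"),
    ],
)

# MATCH uses the full-text index, so this stays fast even over months of audio.
rows = conn.execute(
    "SELECT content, timestamp FROM transcripts WHERE transcripts MATCH ? "
    "ORDER BY timestamp",
    ("budget",),
).fetchall()
```

Running the query returns only the two rows containing "budget", each with its timestamp, which is exactly what lets the UI jump back to the moment a phrase was spoken.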
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.

Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.

Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."

The data folders: This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.

Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
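The done-versus-WIP markers above suggest a quick heuristic you could run over a directory listing. The suffix lists here are assumptions for illustration, not an authoritative catalogue of ScreenPipe's file types:

```python
# Heuristic sketch: classify files in a ScreenPipe-style data directory as
# settled archive vs. in-flight WIP. The suffix lists are illustrative
# assumptions, not ScreenPipe's actual file-type conventions.

WIP_SUFFIXES = ("-wal", "-journal", ".tmp", ".part")   # assumed WIP markers
ARCHIVE_SUFFIXES = (".mp4", ".jpg", ".sqlite", ".db")  # assumed archive types

def classify(filenames):
    """Split filenames into (archive, wip, unknown) buckets."""
    archive, wip, unknown = [], [], []
    for name in filenames:
        if name.endswith(WIP_SUFFIXES):
            wip.append(name)
        elif name.endswith(ARCHIVE_SUFFIXES):
            archive.append(name)
        else:
            unknown.append(name)
    return archive, wip, unknown

listing = [
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    "db.sqlite",
    "db.sqlite-wal",
    "chunk_0193.tmp",
]
archive, wip, unknown = classify(listing)
```

A non-empty WIP bucket would mean the engine is still catching up; once it finishes, only the archive files and the database should remain.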
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?

Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.

When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
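That jump-to-the-exact-second behaviour reduces to timestamp arithmetic: the recording's start time is encoded in the filename, so the playback offset is the difference between the transcript's timestamp and that start. The parsing below assumes the "Device (kind)_YYYY-MM-DD_HH-MM-SS.mp4" pattern visible in the filenames quoted above:

```python
from datetime import datetime

def file_start_time(filename):
    """Parse the recording start time from a
    'Device (kind)_YYYY-MM-DD_HH-MM-SS.mp4' filename (pattern assumed
    from the filenames in the question)."""
    stem = filename.rsplit(".", 1)[0]
    # The last two underscore-separated fields are the date and the time.
    date_part, time_part = stem.rsplit("_", 2)[1:]
    return datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H-%M-%S")

def playback_offset_seconds(filename, transcript_timestamp):
    """Seconds into the file at which a transcribed phrase was spoken."""
    spoken_at = datetime.fromisoformat(transcript_timestamp)
    return (spoken_at - file_start_time(filename)).total_seconds()

# A phrase transcribed at 06:19:44 sits 150 seconds into this recording.
offset = playback_offset_seconds(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    "2026-05-11T06:19:44",
)
```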
2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them

If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
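The "soft foreign key" arrangement can be illustrated with an in-memory database: the table stores plain file paths as text, so nothing in SQLite enforces that the referenced media still exists on disk. The audio_transcriptions table name comes from the answer above, but its columns here are invented for illustration:

```python
import os
import sqlite3
import tempfile

# Invented columns illustrating the soft-foreign-key pattern: file_path is
# just text, so the database cannot enforce that the file exists on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_transcriptions (id INTEGER PRIMARY KEY, "
    "file_path TEXT, transcript TEXT)"
)

with tempfile.TemporaryDirectory() as data_dir:
    present = os.path.join(
        data_dir, "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4"
    )
    open(present, "wb").close()  # create an empty stand-in media file
    missing = os.path.join(data_dir, "deleted_recording.mp4")  # never created

    conn.executemany(
        "INSERT INTO audio_transcriptions (file_path, transcript) VALUES (?, ?)",
        [(present, "hello"), (missing, "world")],
    )

    # Rows whose media file no longer exists on disk: transcript survives,
    # but playback is gone.
    orphans = [
        path
        for (path,) in conn.execute("SELECT file_path FROM audio_transcriptions")
        if not os.path.exists(path)
    ]
```

An orphan check like this is also a reasonable way to audit the archive before manually deleting anything.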
What happens if you delete them?
If you manually...
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
9116262651930194198
|
8635993256255712213
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?

Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.

1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
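The FTS5 indexing described above can be sketched with Python's stdlib sqlite3 module. The schema here is purely illustrative (table and column names are made up, not ScreenPipe's actual layout); the point is how a MATCH query hits the full-text index instead of scanning rows:

```python
import sqlite3

# In-memory database standing in for a local transcript store.
conn = sqlite3.connect(":memory:")

# FTS5 virtual table: `text` is indexed, `timestamp` is stored but not indexed.
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts (text, timestamp) VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "2026-05-12T09:00:00"),
        ("the quarterly numbers look solid", "2026-05-12T09:05:00"),
    ],
)

# MATCH runs a full-text query against the inverted index.
rows = conn.execute(
    "SELECT text, timestamp FROM transcripts WHERE transcripts MATCH 'deadline'"
).fetchall()
print(rows)
```

FTS5 is compiled into the SQLite bundled with most modern Python builds, which is why a phrase from weeks ago comes back instantly even over a large transcript history.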
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
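The buffer → queue → finalize flow above can be sketched as a tiny producer/consumer pipeline. Everything here is a stand-in: `fake_transcribe` stubs the Whisper engine, and the table name and schema are invented for illustration, not taken from ScreenPipe:

```python
import queue
import sqlite3
import threading

db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE transcriptions (ts TEXT, text TEXT)")

work = queue.Queue()  # the WIP stage: raw chunks waiting for the engine


def fake_transcribe(chunk: bytes) -> str:
    # Stand-in for a local speech-to-text engine; a real one decodes audio.
    return f"transcript of {len(chunk)} bytes"


def worker():
    while True:
        item = work.get()
        if item is None:  # sentinel: no more chunks
            break
        ts, chunk = item
        text = fake_transcribe(chunk)
        # Finalization: commit text plus timestamp to the database.
        db.execute("INSERT INTO transcriptions VALUES (?, ?)", (ts, text))
        db.commit()


t = threading.Thread(target=worker)
t.start()

# Buffering: the capture side drops fixed-size chunks into the queue.
for i, ts in enumerate(["06:49:17", "06:49:47"]):
    work.put((ts, b"\x00" * 1024 * (i + 1)))
work.put(None)
t.join()

print(db.execute("SELECT ts, text FROM transcriptions ORDER BY ts").fetchall())
```

During a busy conversation the queue simply grows; the capture thread never blocks on the (slower) transcription thread, which is the backlog behavior described above.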
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
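Under those assumptions, "done versus pending" is just a set difference between media files on disk and paths the database already references. The directory layout, table name, and column name below are hypothetical, used only to show the check:

```python
import pathlib
import sqlite3
import tempfile

# Build a toy ~/.screenpipe-style layout in a temp dir for demonstration.
root = pathlib.Path(tempfile.mkdtemp())
(root / "data").mkdir()
for name in ("done.mp4", "pending.mp4"):
    (root / "data" / name).touch()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_transcriptions (file_path TEXT, text TEXT)")
# Only done.mp4 has been transcribed and indexed.
db.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    (str(root / "data" / "done.mp4"), "hello world"),
)

on_disk = {str(p) for p in (root / "data").glob("*.mp4")}
indexed = {
    row[0] for row in db.execute("SELECT file_path FROM audio_transcriptions")
}

pending = on_disk - indexed  # recorded, but not yet transcribed
print(sorted(pending))
```

Anything in `pending` corresponds to the WIP stage: audio that exists in the archive but has no transcript row yet.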
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?

Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
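That lookup amounts to a query that returns both the matched text and the media reference to seek into. As before, the table and column names are assumptions for illustration, not ScreenPipe's verified schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_transcriptions "
    "(text TEXT, file_path TEXT, offset_seconds REAL)"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    (
        "let's circle back on the budget",
        "System Audio (output)_2026-05-11_06-17-14.mp4",
        142.5,
    ),
)

# Find which file to open, and where to seek, for a remembered phrase.
row = conn.execute(
    "SELECT file_path, offset_seconds FROM audio_transcriptions "
    "WHERE text LIKE ?",
    ("%budget%",),
).fetchone()
print(row)
```

A player then opens the returned .mp4 and seeks to the returned offset, which is the "plays it at the exact second" behavior described above.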
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
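Because these are soft references, nothing stops a file from being deleted out from under the database. A quick audit for orphaned rows, i.e. transcripts whose audio is gone, can be sketched like this (table and column names assumed, as above):

```python
import os
import sqlite3
import tempfile

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")

# One referenced file exists on disk; one was "deleted" (never created here).
fd, existing = tempfile.mkstemp(suffix=".mp4")
os.close(fd)
missing = existing + ".deleted.mp4"

db.executemany(
    "INSERT INTO audio_transcriptions VALUES (?)",
    [(existing,), (missing,)],
)

orphans = [
    path
    for (path,) in db.execute("SELECT file_path FROM audio_transcriptions")
    if not os.path.exists(path)
]
print(orphans)  # transcript rows whose audio can no longer be played back
```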
What happens if you delete them?
If you manually...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38942
|
1442
|
2
|
2026-05-14T06:31:21.694814+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740281694_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
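A minimal sketch of what an FTS5-backed search like this looks like. The table and column names below are illustrative placeholders, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory stand-in for the transcript database. "audio_transcriptions",
# "transcription", and "timestamp" are hypothetical names for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE audio_transcriptions "
    "USING fts5(transcription, timestamp)"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES "
    "('let''s ship the quarterly report on Friday', '2026-05-12T10:03:00')"
)

# FTS5's MATCH operator does the tokenized full-text lookup that makes
# phrase search over weeks of transcribed audio effectively instant.
rows = conn.execute(
    "SELECT timestamp, transcription FROM audio_transcriptions "
    "WHERE audio_transcriptions MATCH 'quarterly report'"
).fetchall()
print(rows[0][0])  # timestamp of the matching chunk
```

Because FTS5 maintains an inverted index at insert time, the query cost depends on the matched terms rather than the total amount of stored transcript text.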
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
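The buffer/queue/finalize flow above can be modeled as a simple producer-consumer pipeline. This is an illustrative sketch of the pattern, not ScreenPipe's actual code:

```python
import queue
import threading

# Hypothetical model of the WIP stage: captured chunks wait in a queue
# until the (slower) transcription worker catches up.
chunk_queue: "queue.Queue" = queue.Queue()
finalized = []  # stands in for rows committed to the SQLite database

def transcriber():
    # Consumer: pulls raw chunks, "transcribes" them, commits the text.
    while True:
        chunk = chunk_queue.get()
        if chunk is None:  # sentinel: capture has stopped
            break
        finalized.append(f"transcript of {len(chunk)} bytes")
        chunk_queue.task_done()

worker = threading.Thread(target=transcriber)
worker.start()

# Producer: the capture layer emits fixed-size raw audio chunks.
for _ in range(3):
    chunk_queue.put(b"\x00" * 16000)

chunk_queue.put(None)  # signal end of capture
worker.join()
print(len(finalized))  # 3
```

The queue is exactly where a backlog shows up: if the producer outpaces the consumer, unprocessed chunks accumulate, which is the "WIP" state the answer describes.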
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunk files, or locked database journals (SQLite's -journal or -wal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
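One way to eyeball the done-vs-WIP state of such a folder, sketched against a throwaway directory. The file name mirrors the pattern ScreenPipe writes; "example.db-wal" is a hypothetical journal name, and treating a -wal/-journal file as "busy" is a heuristic about SQLite journaling, not a documented ScreenPipe contract:

```python
import tempfile
from pathlib import Path

def classify(data_dir: Path) -> dict:
    # Heuristic: a -wal/-journal file next to the database only proves
    # SQLite has in-flight writes; it *may* indicate active processing.
    return {
        "media": sorted(p.name for p in data_dir.glob("*.mp4")),
        "db_busy": any(data_dir.glob("*.db-wal"))
                   or any(data_dir.glob("*.db-journal")),
    }

# Demo against a temporary directory shaped like ~/.screenpipe/data/.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()
    (d / "example.db-wal").touch()  # hypothetical journal file
    result = classify(d)

print(result["db_busy"])   # True while the journal file exists
print(result["media"][0])
```

Pointing `classify` at the real data directory would list the archived media files and flag whether the database looks mid-write.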
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
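The chunking step described above can be sketched in a few lines. This is an illustration only: the 30-second chunk length and 16 kHz sample rate are assumptions for the example, not ScreenPipe's actual settings.

```python
# Sketch of the capture layer's chunking step: a continuous PCM stream is
# split into fixed-length chunks before transcription.
SAMPLE_RATE = 16_000     # samples per second (assumed)
CHUNK_SECONDS = 30       # chunk length fed to the STT model (assumed)
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def split_into_chunks(samples: list) -> list:
    """Break a continuous stream of audio samples into transcription chunks."""
    return [
        samples[i:i + CHUNK_SAMPLES]
        for i in range(0, len(samples), CHUNK_SAMPLES)
    ]

# 65 seconds of silence -> three chunks: 30 s, 30 s, and a 5 s remainder.
stream = [0.0] * (SAMPLE_RATE * 65)
chunks = split_into_chunks(stream)
print([len(c) // SAMPLE_RATE for c in chunks])  # -> [30, 30, 5]
```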
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: as it transcribes the text, the engine also performs "diarization", the technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
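The storage step can be illustrated with SQLite's FTS5 extension directly from Python's standard library. The table and column names below are assumptions for the sketch, not ScreenPipe's real schema.

```python
import sqlite3

# Illustrative sketch: transcribed text indexed in a SQLite FTS5 virtual
# table so old phrases can be searched instantly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcriptions USING fts5(timestamp, speaker, text)"
)
conn.executemany(
    "INSERT INTO transcriptions VALUES (?, ?, ?)",
    [
        ("2026-05-12T06:49:17", "speaker_0", "let's move the archive to the NAS"),
        ("2026-05-12T12:17:23", "speaker_1", "the quarterly budget looks fine"),
    ],
)

# Full-text search: find every chunk where "archive" was said.
rows = conn.execute(
    "SELECT timestamp, speaker FROM transcriptions WHERE transcriptions MATCH ?",
    ("archive",),
).fetchall()
print(rows)  # -> [('2026-05-12T06:49:17', 'speaker_0')]
```

FTS5 tokenizes the text column at insert time, which is why a `MATCH` query stays fast even over months of transcripts.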
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
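The queue-then-finalize pattern above can be sketched with a producer/consumer pair. The transcription function and chunk names here are placeholders standing in for Whisper and the real chunk files, not ScreenPipe code.

```python
import queue
import threading

# Sketch of the WIP stage: captured chunks wait in a processing queue until
# a transcription worker catches up, then results are "committed".
work_queue = queue.Queue()
database = []  # stand-in for the SQLite table

def fake_transcribe(chunk):
    # Placeholder for the local Whisper model.
    return f"transcript of {chunk}"

def worker():
    while True:
        chunk = work_queue.get()
        if chunk is None:        # sentinel: no more audio coming
            break
        database.append(fake_transcribe(chunk))  # finalization step

t = threading.Thread(target=worker)
t.start()
for c in ["chunk-001", "chunk-002", "chunk-003"]:
    work_queue.put(c)            # capture layer producing chunks
work_queue.put(None)
t.join()
print(database)
```

Because a single worker drains a FIFO queue, transcripts land in the database in capture order even when the backlog grows.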
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database: this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: this is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: if you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
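The "done vs. not done" check described above amounts to comparing the media files on disk against rows in the database. The `audio_chunks` table and `file_path` column below are assumptions about the schema, used only to illustrate the idea.

```python
import sqlite3

def find_pending(files_on_disk, db):
    """Return media files that have no transcription row in the database yet."""
    done = {row[0] for row in db.execute("SELECT file_path FROM audio_chunks")}
    return [f for f in files_on_disk if f not in done]

# Toy database with one transcribed file recorded.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_chunks (file_path TEXT)")
db.execute(
    "INSERT INTO audio_chunks VALUES "
    "('System Audio (output)_2026-05-11_06-17-14.mp4')"
)

disk = [
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4",
]
pending = find_pending(disk, db)
print(pending)  # the microphone file is still waiting for transcription
```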
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-6561738457798499810
|
9207950348802247319
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
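The chunking idea described above can be sketched in plain Python. This is an illustrative stand-in, not ScreenPipe's actual code: the 16 kHz mono 16-bit PCM format and the 30-second chunk length are assumptions made here for the demo.

```python
import io

SAMPLE_RATE = 16_000        # assumed capture rate (Hz)
BYTES_PER_SAMPLE = 2        # assumed 16-bit mono PCM
CHUNK_SECONDS = 30          # assumed chunk length handed to the transcriber

def chunk_stream(stream, chunk_seconds=CHUNK_SECONDS):
    """Yield fixed-duration chunks from a continuous PCM byte stream."""
    chunk_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_seconds
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break
        yield chunk

# Simulate 65 seconds of silence and split it into 30-second chunks.
fake_audio = io.BytesIO(b"\x00" * (SAMPLE_RATE * BYTES_PER_SAMPLE * 65))
chunks = list(chunk_stream(fake_audio))
print([len(c) // (SAMPLE_RATE * BYTES_PER_SAMPLE) for c in chunks])  # → [30, 30, 5]
```

The trailing 5-second chunk shows why a continuous recorder always has a partially filled chunk in flight.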
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
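The storage step can be demonstrated with Python's built-in sqlite3 module, assuming it was compiled with the FTS5 extension (most modern builds are). The table and column names below are invented for illustration; they are not ScreenPipe's real schema.

```python
import sqlite3

# In-memory stand-in for the local transcript database.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE VIRTUAL TABLE transcriptions USING fts5(
        text, speaker, timestamp
    )
""")
db.execute(
    "INSERT INTO transcriptions VALUES (?, ?, ?)",
    ("let's move the deadline to Friday", "speaker_1", "2026-05-12T06:49:17"),
)
db.execute(
    "INSERT INTO transcriptions VALUES (?, ?, ?)",
    ("the 2 TB archive syncs at 25 MB/s", "speaker_0", "2026-05-12T07:40:48"),
)

# Full-text search: find the conversation where a deadline was mentioned.
rows = db.execute(
    "SELECT speaker, timestamp FROM transcriptions WHERE transcriptions MATCH ?",
    ("deadline",),
).fetchall()
print(rows)
```

This is exactly the kind of instant lookup FTS5 enables: the query matches tokens, not substrings, so it stays fast even over weeks of transcripts.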
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
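The buffer, queue, and finalization steps above form a classic producer/consumer pattern, which can be sketched as follows. Everything here (the chunk ids, the sleep standing in for Whisper's work) is illustrative, not ScreenPipe's implementation.

```python
import queue
import threading
import time

audio_queue: "queue.Queue[str]" = queue.Queue()  # the WIP backlog of chunks
transcripts = []                                  # stand-in for the database

def transcriber():
    """Consumer: drains the backlog one chunk at a time, as Whisper would."""
    while True:
        chunk_id = audio_queue.get()
        if chunk_id is None:                      # sentinel: recording stopped
            break
        time.sleep(0.01)                          # pretend transcription is slow
        transcripts.append((chunk_id, f"text for {chunk_id}"))
        audio_queue.task_done()

worker = threading.Thread(target=transcriber)
worker.start()

# Producer: the capture layer enqueues chunks faster than they are
# transcribed, so the queue itself is the "work in progress" stage.
for i in range(5):
    audio_queue.put(f"chunk_{i}")

audio_queue.join()          # block until the backlog is fully processed
audio_queue.put(None)
worker.join()
print(len(transcripts))     # all 5 chunks finalized
```

`audio_queue.join()` is the "engine catches up" moment: once it returns, nothing is left in the WIP stage.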
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The audio and screenshot folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
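As a rough illustration of reading WIP state from the folder structure, the heuristic below treats SQLite write-ahead journals (`-wal`/`-shm` files) as a sign of active writes and media files as the finished archive. The file names and extensions checked are assumptions for the demo, not ScreenPipe's documented layout.

```python
import tempfile
from pathlib import Path

def processing_state(data_dir: Path) -> dict:
    """Heuristically classify a data directory into done vs. in-flight."""
    media, journals = [], []
    for p in sorted(data_dir.iterdir()):
        if p.name.endswith(("-wal", "-shm")):
            journals.append(p.name)   # SQLite journals: writes in progress
        elif p.suffix in {".mp4", ".jpg"}:
            media.append(p.name)      # finished, archived recordings
    return {"archived_media": media, "active_journals": journals}

# Demo with a throwaway directory mimicking ~/.screenpipe/data/.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    for name in ["System Audio (output)_2026-05-11_06-17-14.mp4",
                 "db.sqlite", "db.sqlite-wal"]:
        (d / name).touch()
    state = processing_state(d)
    print(state["active_journals"])   # a -wal file: backlog still in flight
```

An empty `active_journals` list after the recorder idles for a while is a reasonable (if informal) signal that the database has caught up.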
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
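The filenames in the question follow a visible convention: device name, stream direction, and a start timestamp. A small parser for that pattern is sketched below; the regex is derived only from the four example filenames, so treat it as an assumption rather than ScreenPipe's guaranteed naming scheme.

```python
import re
from datetime import datetime

# Pattern inferred from names like
#   "System Audio (output)_2026-05-11_06-17-14.mp4"
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)

def parse_recording(name: str):
    """Extract device, direction, and start time from a recording filename."""
    m = PATTERN.match(name)
    if not m:
        return None
    return {
        "device": m["device"],
        "direction": m["direction"],
        "started": datetime.strptime(m["stamp"], "%Y-%m-%d_%H-%M-%S"),
    }

info = parse_recording("System Audio (output)_2026-05-11_06-17-14.mp4")
print(info["device"], info["direction"], info["started"].isoformat())
```

The embedded timestamp is what lets a playback UI line a file up against the transcript rows stored in the database.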
1. The "Time Machine" Playback...
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
-4464506435644478620
|
9133637658563201943
|
click
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
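The chunking step above amounts to cutting a continuous stream into fixed-duration pieces. A minimal sketch in Python, where all of the numbers (sample rate, chunk length) are illustrative assumptions rather than ScreenPipe's actual settings:

```python
# Illustrative chunking of a raw PCM byte stream into fixed-duration pieces.
# The constants are assumptions for the sketch, not ScreenPipe's real config.
SAMPLE_RATE = 16_000       # samples per second, common for speech models
CHUNK_SECONDS = 30         # hypothetical chunk length
BYTES_PER_SAMPLE = 2       # 16-bit PCM

def chunk_stream(stream: bytes) -> list[bytes]:
    """Split a continuous byte stream into chunks of CHUNK_SECONDS each."""
    step = SAMPLE_RATE * CHUNK_SECONDS * BYTES_PER_SAMPLE  # bytes per chunk
    return [stream[i:i + step] for i in range(0, len(stream), step)]

# ~62.5 seconds of fake silence yields two full chunks plus a short tail.
chunks = chunk_stream(bytes(2_000_000))
```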
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
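The kind of FTS5 lookup described here can be sketched with Python's built-in sqlite3 module. The table and column names below are illustrative assumptions for the sketch, not ScreenPipe's verified schema:

```python
import sqlite3

# In-memory stand-in for the local database; the table name and columns
# (audio_transcriptions, transcription, device, timestamp) are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE audio_transcriptions "
    "USING fts5(transcription, device, timestamp)"
)
conn.executemany(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    [
        ("let's move the RAID array to the new NAS",
         "System Audio (output)", "2026-05-11T06:17:14"),
        ("standup is moved to ten tomorrow",
         "MacBook Pro Microphone (input)", "2026-05-12T12:17:23"),
    ],
)

def search(phrase: str):
    """Full-text search, ranked by relevance, like a history lookup."""
    return conn.execute(
        "SELECT device, timestamp, transcription "
        "FROM audio_transcriptions WHERE audio_transcriptions MATCH ? "
        "ORDER BY rank",
        (phrase,),
    ).fetchall()

hits = search("RAID NAS")  # both terms must appear (FTS5 implicit AND)
```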
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
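The three steps above can be sketched as a tiny producer/consumer pipeline. Everything here is a stand-in: transcribe() is a stub in place of Whisper, and the list stands in for the SQLite commit:

```python
import queue
import threading

chunks = queue.Queue()   # the processing backlog (the WIP stage)
committed = []           # stands in for rows committed to SQLite

def transcribe(chunk: bytes) -> str:
    """Placeholder for the real speech-to-text model."""
    return f"text-for-{len(chunk)}-bytes"

def worker():
    """Drain the queue one chunk at a time, like the transcription engine."""
    while True:
        chunk = chunks.get()
        if chunk is None:                    # sentinel: shut down cleanly
            break
        committed.append(transcribe(chunk))  # the finalization step
        chunks.task_done()

t = threading.Thread(target=worker)
t.start()
for raw in (b"aa", b"bbbb", b"cccccc"):      # the capture layer producing chunks
    chunks.put(raw)
chunks.put(None)
t.join()
```

A single worker drains the queue in order, which is why a rapid conversation builds up a backlog: capture can outpace transcription.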
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: If you see rapidly changing files, temporary chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
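One hedged way to check for this in-flight state from a script is to look for SQLite journal files next to the database. The paths here are illustrative (the demo runs against a throwaway directory, not the real ~/.screenpipe):

```python
import tempfile
from pathlib import Path

def is_processing(screenpipe_dir: Path) -> bool:
    """Heuristic: SQLite leaves -wal/-shm/-journal files next to the
    database while writes are in flight, so their presence suggests
    the pipeline is actively committing transcripts."""
    db = screenpipe_dir / "db.sqlite"
    return any(
        db.with_name(db.name + suffix).exists()
        for suffix in ("-wal", "-shm", "-journal")
    )

# Demonstrate against a throwaway directory instead of the real one.
root = Path(tempfile.mkdtemp())
(root / "db.sqlite").touch()
assert not is_processing(root)        # no journals: engine idle / caught up
(root / "db.sqlite-wal").touch()      # simulate an in-flight write
```

Note the caveat: a -wal file can also linger after a crash, so this is a hint about activity, not proof.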
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
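Assuming the timestamp-in-filename convention visible in the files above, the seek math behind "jump to the exact second" can be sketched as:

```python
from datetime import datetime

def seek_offset_seconds(filename: str, spoken_at: datetime) -> float:
    """Compute how far into the recording to seek, assuming the filename
    ends with a start timestamp formatted like the examples above
    (…_YYYY-MM-DD_HH-MM-SS.mp4). This convention is inferred from the
    listed files, not confirmed ScreenPipe behavior."""
    stem = filename.rsplit(".", 1)[0]
    start = datetime.strptime(stem[-19:], "%Y-%m-%d_%H-%M-%S")
    return (spoken_at - start).total_seconds()

# A phrase logged at 06:19:44 in a recording that started at 06:17:14
# lands 150 seconds into the file.
off = seek_offset_seconds(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    datetime(2026, 5, 11, 6, 19, 44),
)
```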
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
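Because the references are soft, nothing stops them from dangling. A sketch of an integrity check that lists transcripts whose referenced media is gone; the table and column names (audio_transcriptions, path) are assumptions based on the description above, not a verified schema:

```python
import sqlite3
import tempfile
from pathlib import Path

def orphaned(conn: sqlite3.Connection, data_dir: Path):
    """Return (id, path) pairs whose media file no longer exists on disk."""
    rows = conn.execute("SELECT id, path FROM audio_transcriptions").fetchall()
    return [(rid, p) for rid, p in rows if not (data_dir / p).exists()]

# Demo with a throwaway data directory: one file present, one deleted.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_transcriptions (id INTEGER, path TEXT)")
conn.executemany(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    [
        (1, "System Audio (output)_2026-05-11_06-17-14.mp4"),           # on disk
        (2, "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"),  # deleted
    ],
)
missing = orphaned(conn, data_dir)
```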
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
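That archive-and-symlink idea can be sketched in Python. The retention window is an illustrative assumption, and the demo runs against throwaway directories rather than the real ~/.screenpipe or a NAS mount:

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

KEEP_DAYS = 30  # illustrative retention window, not a ScreenPipe setting

def archive_old_media(local: Path, nas: Path) -> int:
    """Move media older than KEEP_DAYS to `nas`, leaving symlinks behind
    so the database's stored file paths keep resolving."""
    moved = 0
    for f in list(local.glob("*.mp4")):
        if f.is_symlink():
            continue                              # already archived
        if time.time() - f.stat().st_mtime < KEEP_DAYS * 86400:
            continue                              # still recent, keep local
        dest = nas / f.name
        shutil.move(str(f), str(dest))            # offload to the NAS mount
        f.symlink_to(dest)                        # DB references stay intact
        moved += 1
    return moved

# Demo: one "ancient" file gets archived, one fresh file stays put.
local, nas = Path(tempfile.mkdtemp()), Path(tempfile.mkdtemp())
old = local / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.touch()
os.utime(old, (0, 0))                 # pretend this recording is ancient
fresh = local / "fresh_recording.mp4"
fresh.touch()
moved = archive_old_media(local, nas)
```

Run from cron, this keeps the playback paths working as long as the NAS share is mounted when you hit play.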
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first....
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
- System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: everything you say into your mic.
Because ScreenPipe runs as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
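The chunking step above can be sketched as a simple generator. The fixed chunk length and the in-memory list representation are illustrative assumptions, not ScreenPipe's actual recorder:

```python
def chunk_stream(samples, sample_rate: int, chunk_seconds: float):
    """Split a continuous stream of audio samples into fixed-length chunks,
    the way a 24/7 recorder hands work to a transcription queue.
    Chunk sizing here is illustrative only."""
    size = int(sample_rate * chunk_seconds)
    buf = []
    for s in samples:
        buf.append(s)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:  # flush the partial final chunk so no audio is dropped
        yield buf

# Tiny demo: 10 samples at 2 Hz, cut into 2-second (4-sample) chunks.
chunks = list(chunk_stream(range(10), sample_rate=2, chunk_seconds=2.0))
print([len(c) for c in chunks])  # → [4, 4, 2]
```

The trailing flush matters: without it, the tail of a recording session would sit in the buffer forever and never reach the transcription queue.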
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is extremely short: the audio data never leaves your machine unless you explicitly set up a cloud plugin. The transcription pipeline looks like this:
- The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure a cloud provider like Deepgram for faster processing, but local Whisper is the standard.)
- Diarization: as it transcribes, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: the final transcribed text is indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive. Here is what those files are doing sitting in your ~/.screenpipe/data/ directory, and how they interact with the rest of the system.

1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file-path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files you keep the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw .mp4 files lets you re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
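The timestamp-plus-file-path playback described above can be sketched as follows. The assumption that the database stores both a per-chunk start time and a per-hit timestamp is hypothetical, and ffplay (FFmpeg's player, whose -ss flag seeks before playing) is used only as a convenient stand-in for the UI's player:

```python
from datetime import datetime

def playback_command(media_path: str, chunk_start: str, hit_time: str):
    """Build an ffplay invocation that starts playback at the moment a
    transcribed keyword was spoken, given ISO timestamps for the chunk's
    start and for the search hit (both assumed to come from the database)."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    offset = (
        datetime.strptime(hit_time, fmt) - datetime.strptime(chunk_start, fmt)
    ).total_seconds()
    return ["ffplay", "-ss", str(max(offset, 0.0)), media_path]

cmd = playback_command(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    "2026-05-11T06:17:14",
    "2026-05-11T06:19:44",
)
print(cmd)  # seeks 150 seconds into the chunk
```

Clamping the offset at zero guards against a hit timestamp that (through clock skew or rounding) lands just before the chunk's recorded start.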
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth for verifying what was actually said.

How the Database Views Them
If you open your db.sqlite and query the tables (e.g., audio_transcriptions), you will see that the database treats the ~/.screenpipe/data/ folder essentially as a blob-storage backend. The tables contain columns acting as soft foreign keys that point to filenames like "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4".

What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash, and the SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
- Built-in garbage collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or once the folder hits a specific GB limit).
- Archiving: if you prefer to keep a permanent, searchable "life log" without eating up local SSD space, you don't have to let ScreenPipe delete anything. You could script a simple cron job to regularly … older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
You said:
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said:
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable. As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
|
6180822191495669181
|
8487585695045151701
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? And how can I see from the folder structure what has and hasn't been processed?

Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and splits the audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure cloud providers such as Deepgram for faster processing, but local Whisper is the standard.)
Diarization: as it transcribes, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
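The FTS5 behavior described here can be sketched against a throwaway database. Note that the table and column names below (transcripts, text, timestamp) are invented for the demo; ScreenPipe's real schema differs.

```shell
# Demo of SQLite FTS5 full-text search on a throwaway database.
# ScreenPipe's actual schema differs; these table/column names are invented.
DB=$(mktemp /tmp/fts_demo.XXXXXX)
sqlite3 "$DB" <<'SQL'
CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED);
INSERT INTO transcripts VALUES ('let us revisit the quarterly roadmap', '2026-05-12T09:00:00Z');
INSERT INTO transcripts VALUES ('lunch order for the team', '2026-05-12T12:00:00Z');
SQL
# MATCH does tokenized full-text lookup over the index, not a LIKE table scan:
HIT=$(sqlite3 "$DB" "SELECT timestamp FROM transcripts WHERE transcripts MATCH 'roadmap';")
echo "$HIT"
```

The UNINDEXED column stores the timestamp alongside the text without bloating the full-text index, which matches the "text plus timestamp" layout described above.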
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: audio is recorded into a temporary buffer in your system's RAM, or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: once the model finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: this is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: if you see rapidly changing files, temporary chunks, or locked database journals (such as a -wal or -journal file next to db.sqlite), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
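Assuming the default ~/.screenpipe layout described above, a quick shell check distinguishes "idle" from "actively catching up". The db.sqlite path is taken from this conversation; the -wal file is SQLite's write-ahead journal, whose presence simply means writes are in flight.

```shell
# Quick pipeline health check. Paths assume the default ~/.screenpipe layout
# described above; adjust if your install lives elsewhere.
DIR="$HOME/.screenpipe"
ls -lt "$DIR/data" 2>/dev/null | head -n 10   # newest media chunks first
if [ -e "$DIR/db.sqlite-wal" ]; then
  MSG="WAL journal present: transcription writes are in flight"
else
  MSG="no WAL journal found: database idle (or journal already checkpointed)"
fi
echo "$MSG"
```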
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, and System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob-storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
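That "soft foreign key" arrangement is easy to see in miniature: SQLite stores the path as plain text and does nothing to guarantee the file still exists, so orphan detection reduces to a path-existence check. The schema and paths below are invented for the demo.

```shell
# Toy illustration of soft foreign keys: the DB stores plain file paths,
# and nothing enforces that the referenced files still exist.
DB=$(mktemp /tmp/refs.XXXXXX)
MEDIA=$(mktemp -d /tmp/media.XXXXXX)
touch "$MEDIA/kept.mp4"   # this referenced file exists; deleted.mp4 never will
sqlite3 "$DB" "CREATE TABLE audio_transcriptions (file_path TEXT, transcription TEXT);
INSERT INTO audio_transcriptions VALUES
  ('$MEDIA/kept.mp4', 'hello'),
  ('$MEDIA/deleted.mp4', 'world');"
MISSING=0
for f in $(sqlite3 "$DB" "SELECT file_path FROM audio_transcriptions;"); do
  [ -e "$f" ] || { MISSING=$((MISSING + 1)); echo "orphaned reference: $f"; }
done
echo "$MISSING orphaned row(s)"
```

(The word-splitting loop is fine here because the demo paths contain no spaces; real ScreenPipe filenames do, so a production version would read paths line by line.)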
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or when the folder hits a specific GB limit).
Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
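A minimal sketch of that cron job, demonstrated here on throwaway directories. In real use you would point SRC at ~/.screenpipe/data and DEST at the NAS mount, and tune the 30-day cutoff; both are assumptions for illustration.

```shell
# Archive-and-symlink sketch: move media older than 30 days to the archive
# and leave a symlink behind so database file paths keep resolving.
# Demonstrated on throwaway dirs; point SRC/DEST at the real locations.
SRC=$(mktemp -d /tmp/sp_data.XXXXXX)
DEST=$(mktemp -d /tmp/nas.XXXXXX)
touch -t 202401010000 "$SRC/old.mp4"   # backdated well past the cutoff
touch "$SRC/new.mp4"                   # recent file, should stay put
find "$SRC" -maxdepth 1 -name '*.mp4' -mtime +30 -print | while read -r f; do
  mv "$f" "$DEST/" && ln -s "$DEST/$(basename "$f")" "$f"
done
ls -l "$SRC"
```

Reading paths with `read -r` keeps filenames with spaces intact, which matters for names like "System Audio (output)_2026-05-11_06-17-14.mp4".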
You said
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable. As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first…
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4843790490906410844
|
8487585703635086293
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization:
As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
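As a concrete illustration of the FTS5 mechanics, here is a minimal, self-contained sketch using Python's built-in sqlite3 module. The transcripts table and its columns are invented for this demo; ScreenPipe's real schema will differ.

```python
import sqlite3

# Throwaway in-memory database to show the FTS5 mechanics.
# Table and column names are invented for this demo; the real
# db.sqlite schema will differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("we agreed to ship the beta on Friday", "2026-05-12T09:15:00"),
        ("lunch order for the team", "2026-05-12T12:02:00"),
    ],
)

# MATCH runs a full-text search over the indexed column;
# only the row containing "beta" comes back.
rows = conn.execute(
    "SELECT text, timestamp FROM transcripts WHERE transcripts MATCH ?",
    ("beta",),
).fetchall()
print(rows)
```

This is why the search feels instant: FTS5 keeps an inverted index of every word, so a keyword lookup never has to scan the raw text.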
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
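The buffer-queue-finalize flow above can be sketched as a toy producer/consumer in Python. This is not ScreenPipe code, just an illustration of captured chunks queuing up for a slower transcription worker:

```python
import queue
import threading

# Toy model of the WIP stage: the capture side produces chunks
# faster than the transcriber drains them, so they queue up.
chunks = queue.Queue()
done = []

def transcriber():
    # Stand-in for the Whisper worker: pull a chunk, "transcribe" it,
    # and commit the result (here, just append to a list).
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: no more audio
            break
        done.append(f"transcript of {chunk}")
        chunks.task_done()

worker = threading.Thread(target=transcriber)
worker.start()
for i in range(3):
    chunks.put(f"chunk-{i}")  # capture side keeps producing
chunks.put(None)
worker.join()
print(done)
```

The queue depth at any moment is the backlog; in the real system that backlog shows up as temporary chunk files on disk.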
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
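A small Python sketch of that heuristic, assuming the layout described above (a data/ folder of .mp4 media, db.sqlite with SQLite's -wal/-journal sidecars). Adjust the names to whatever your installation actually contains:

```python
from pathlib import Path

def pipeline_status(root: Path) -> dict:
    """Rough look at a ScreenPipe-style directory: what is archived
    vs. still being written. The layout (a data/ media folder,
    db.sqlite with -wal/-journal sidecars) is an assumption."""
    data = root / "data"
    media = sorted(p.name for p in data.glob("*.mp4")) if data.exists() else []
    # A -wal or -journal sidecar next to db.sqlite means SQLite writes
    # are still in flight, i.e. the WIP stage is active.
    busy = any(
        (root / f"db.sqlite{suffix}").exists() for suffix in ("-wal", "-journal")
    )
    return {"media_files": media, "db_write_in_progress": busy}
```

Running `pipeline_status(Path.home() / ".screenpipe")` gives you a quick answer to "what has been done": finished media is in the list, and a truthy db_write_in_progress means the engine is still committing.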
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model (or if a radically better speaker diarization algorithm is released), having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you open your db.sqlite and query the tables (e.g., audio_transcriptions), you will see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
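A minimal sketch of that soft-foreign-key arrangement, using Python's built-in sqlite3 module. The audio_transcriptions columns shown here (timestamp, transcription, file_path) are assumptions for illustration; ScreenPipe's real schema may differ:

```python
import sqlite3

# Hypothetical schema mimicking the idea: a plain text column holds the
# media file's name instead of a real FOREIGN KEY constraint.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE audio_transcriptions ("
    " id INTEGER PRIMARY KEY,"
    " timestamp TEXT,"
    " transcription TEXT,"
    " file_path TEXT)"  # "soft" pointer into ~/.screenpipe/data/
)
con.execute(
    "INSERT INTO audio_transcriptions (timestamp, transcription, file_path) "
    "VALUES (?, ?, ?)",
    (
        "2026-05-12T07:41:02",
        "let's move the backups to the NAS",
        "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4",
    ),
)

# A search result carries the path, so a UI would know which file to play
# and (via the timestamp) roughly where to seek.
row = con.execute(
    "SELECT timestamp, file_path FROM audio_transcriptions "
    "WHERE transcription LIKE ?",
    ("%NAS%",),
).fetchone()
print(row)
```

Nothing enforces that the referenced file still exists, which is exactly why deleting the media breaks playback but not search.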
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
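The move-and-symlink idea could look roughly like this as a Python script run from cron. The directory layout and the 30-day cutoff are placeholders, not ScreenPipe settings, and the demo uses temp directories standing in for the data folder and the NAS mount:

```python
import os
import shutil
import tempfile
import time

def archive_old_media(local_dir, nas_dir, max_age_days=30):
    """Move media older than the cutoff to the NAS, leaving symlinks behind
    so paths stored in the database still resolve to playable files."""
    cutoff = time.time() - max_age_days * 86400
    for name in os.listdir(local_dir):
        src = os.path.join(local_dir, name)
        if not os.path.isfile(src) or os.path.islink(src):
            continue  # skip directories and already-archived links
        if os.path.getmtime(src) > cutoff:
            continue  # still fresh, keep it on the SSD
        dst = os.path.join(nas_dir, name)
        shutil.move(src, dst)  # heavy bytes go to the RAID array
        os.symlink(dst, src)   # DB's stored path keeps working

# Demo with stand-in directories.
local = tempfile.mkdtemp()
nas = tempfile.mkdtemp()
old = os.path.join(local, "System Audio (output)_old.mp4")
with open(old, "wb") as f:
    f.write(b"fake media")
os.utime(old, (0, 0))  # pretend the file is ancient
archive_old_media(local, nas)
print(os.path.islink(old))  # True: the original path now resolves via the NAS
```

In a real deployment you would point local_dir at ~/.screenpipe/data/ and nas_dir at the mounted RAID share, and make sure the NAS is mounted before the job runs, since a broken mount would scatter files into an empty mountpoint.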
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
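The chunking step can be sketched in a few lines. This is only an illustration: the 30-second chunk length and the 16 kHz sample rate are assumptions (16 kHz happens to be Whisper's native rate), not confirmed ScreenPipe settings.

```python
# Cut a continuous stream of audio samples into fixed-duration chunks that
# can be handed to the transcriber one at a time.
SAMPLE_RATE = 16_000   # samples per second (assumed)
CHUNK_SECONDS = 30     # chunk length (assumed)

def chunk_boundaries(total_samples, rate=SAMPLE_RATE, secs=CHUNK_SECONDS):
    """Yield (start, end) sample indices for each chunk of the stream."""
    step = rate * secs
    for start in range(0, total_samples, step):
        yield start, min(start + step, total_samples)

# 70 seconds of audio becomes two full 30 s chunks plus a 10 s remainder.
bounds = list(chunk_boundaries(70 * SAMPLE_RATE))
print(bounds)  # [(0, 480000), (480000, 960000), (960000, 1120000)]
```

The last chunk is simply shorter; a live capture loop would make the same cut on an in-memory buffer instead of a finished stream.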
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
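The kind of FTS5 index described in the Storage step can be demonstrated with Python's built-in sqlite3 module (this needs an SQLite build with FTS5 compiled in, which standard Python distributions include). The transcripts table and its columns are made up for the demo; ScreenPipe's actual schema may differ.

```python
import sqlite3

# An FTS5 virtual table tokenizes and indexes its text columns on insert.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text)")
con.executemany(
    "INSERT INTO transcripts (speaker, text) VALUES (?, ?)",
    [
        ("me", "we should benchmark the RAID 5 rebuild time"),
        ("colleague", "let's sync on the quarterly roadmap tomorrow"),
    ],
)

# MATCH consults the full-text index rather than scanning every row, which
# is what keeps search instant even over months of transcripts.
hit = con.execute(
    "SELECT speaker FROM transcripts WHERE transcripts MATCH ?",
    ("roadmap",),
).fetchone()
print(hit)  # ('colleague',)
```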
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
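As a toy model of that buffer, queue, and finalize flow (purely illustrative: the real pipeline runs these stages concurrently and calls Whisper instead of this stand-in function):

```python
from queue import Queue

def transcribe(chunk: bytes) -> str:
    """Stand-in for the Whisper call; just uppercases the bytes."""
    return chunk.decode().upper()

wip = Queue()    # the "work in progress" stage: chunks waiting for the model
database = []    # stand-in for the SQLite commit

# 1. Buffering: captured chunks can arrive faster than we transcribe them.
for chunk in [b"hello there", b"see you tomorrow"]:
    wip.put(chunk)

# 2-3. Processing + finalization: drain the queue, commit the text, and keep
# the raw audio alongside it for later playback.
while not wip.empty():
    raw = wip.get()
    database.append({"text": transcribe(raw), "audio": raw})

print(len(database), database[0]["text"])
```

While the queue is non-empty, the system is in the WIP state: audio exists on disk or in RAM that the database does not yet know about.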
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: If you see rapidly changing files, temporary chunks, or locked database journals (such as a -wal or -journal file alongside db.sqlite), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
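Under that model, "what has been done" boils down to a set difference: media files on disk minus the file paths the database already references. A sketch, with the audio_transcriptions table and file_path column assumed rather than taken from ScreenPipe's real schema:

```python
import os
import sqlite3
import tempfile

def unprocessed_files(data_dir, con):
    """Return media files present on disk but absent from the ledger."""
    done = {row[0] for row in con.execute(
        "SELECT file_path FROM audio_transcriptions")}
    on_disk = {f for f in os.listdir(data_dir) if f.endswith(".mp4")}
    return sorted(on_disk - done)  # recorded, but not yet transcribed

# Demo with a fake data dir and a fake ledger.
data_dir = tempfile.mkdtemp()
for name in ("a.mp4", "b.mp4"):
    open(os.path.join(data_dir, name), "w").close()
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")
con.execute("INSERT INTO audio_transcriptions VALUES ('a.mp4')")
backlog = unprocessed_files(data_dir, con)
print(backlog)  # ['b.mp4'] is still waiting for transcription
```

Against a real db.sqlite the stored paths may be absolute, so you would normalize both sides with os.path.basename (or compare full paths) before taking the difference.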
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:...
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
8511764460202151016
|
8631700883120991189
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
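The FTS5 search described above can be sketched with Python's built-in sqlite3 module. This is a toy illustration only: the table and column names here are assumptions for demonstration, not ScreenPipe's actual schema.

```python
import sqlite3

# Toy FTS5-backed transcript search. Table/column names are assumptions;
# ScreenPipe's real schema differs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, device, timestamp UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    [
        ("let's review the quarterly budget",
         "MacBook Pro Microphone (input)", "2026-05-12T12:17:23"),
        ("the RAID array rebuild finished overnight",
         "System Audio (output)", "2026-05-11T06:17:14"),
    ],
)

# FTS5 MATCH finds the phrase without scanning every row, which is what
# makes searching weeks of transcripts feel instant.
hits = conn.execute(
    "SELECT device, timestamp FROM transcripts WHERE transcripts MATCH 'budget'"
).fetchall()
```

The `UNINDEXED` option keeps the timestamp out of the full-text index while still storing it alongside each row.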
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
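The buffer → queue → transcribe → commit flow above can be sketched in a few lines. The transcriber below is a stand-in function (ScreenPipe uses Whisper here), and the chunk filenames are hypothetical.

```python
import queue
import sqlite3

# Minimal sketch of the WIP pipeline: chunks wait in a queue until the
# transcription engine has CPU/GPU time, then each result is committed.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE transcriptions (chunk_file TEXT, started_at TEXT, text TEXT)")

work = queue.Queue()  # the WIP stage: chunks waiting for processing
work.put(("chunk_0001.mp4", "2026-05-12T06:49:17", b"raw-audio-bytes"))
work.put(("chunk_0002.mp4", "2026-05-12T06:49:47", b"raw-audio-bytes"))

def fake_transcribe(audio: bytes) -> str:
    # Placeholder for the real speech-to-text model.
    return f"({len(audio)} bytes of speech)"

while not work.empty():
    chunk_file, started_at, audio = work.get()
    # Finalization: commit text + timestamp; the chunk file itself stays
    # on disk as the raw archive.
    db.execute("INSERT INTO transcriptions VALUES (?, ?, ?)",
               (chunk_file, started_at, fake_transcribe(audio)))
db.commit()

done = db.execute("SELECT COUNT(*) FROM transcriptions").fetchone()[0]
```

A backlog simply means the queue fills faster than the loop drains it; nothing is lost, it is just not searchable yet.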
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media data folders (e.g., ~/.screenpipe/data/): This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
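A quick way to apply the heuristic above is to look for a live write-ahead log next to the database. The `db.sqlite-wal` name follows SQLite's standard WAL convention; the directory layout is an assumption, so this demo runs against a throwaway folder rather than the real ~/.screenpipe.

```python
import pathlib
import tempfile

def looks_busy(screenpipe_dir: str) -> bool:
    """Heuristic: a non-empty WAL file suggests writes are in flight."""
    wal = pathlib.Path(screenpipe_dir) / "db.sqlite-wal"
    return wal.exists() and wal.stat().st_size > 0

# Demo against a temporary directory instead of the real ~/.screenpipe:
demo = tempfile.mkdtemp()
idle = looks_busy(demo)                                   # no WAL file yet
(pathlib.Path(demo) / "db.sqlite-wal").write_bytes(b"pending frames")
busy = looks_busy(demo)
```

This is only an indicator: SQLite may also keep a small WAL around between checkpoints, so treat a persistent, growing WAL as the real "backlog" signal.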
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
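The playback lookup amounts to a join between a transcript hit and its media reference. The table and column names below are assumptions for illustration; ScreenPipe's actual schema may differ.

```python
import sqlite3

# Sketch: map a transcript search hit back to (media file, offset) so a
# player could open the file and seek to the spoken moment.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE audio_transcriptions
              (text TEXT, file_path TEXT, offset_seconds REAL)""")
db.execute("INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
           ("remember to order the NAS drives",
            "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4", 42.5))

row = db.execute(
    "SELECT file_path, offset_seconds FROM audio_transcriptions WHERE text LIKE ?",
    ("%NAS drives%",),
).fetchone()
# A player would now open row[0] and seek to row[1] seconds.
```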
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
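Because the file paths are soft references, you can audit for dangling ones yourself: list every referenced path and check whether it still exists on disk. Again, the table/column names are assumptions for illustration.

```python
import os
import sqlite3
import tempfile

# Sketch: find transcripts whose referenced media file no longer exists
# (e.g., after a manual rm). Uses a throwaway directory and toy schema.
data_dir = tempfile.mkdtemp()
present = os.path.join(data_dir, "System Audio (output)_2026-05-11_06-17-14.mp4")
open(present, "wb").close()  # this file exists
missing = os.path.join(data_dir, "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audio_transcriptions (text TEXT, file_path TEXT)")
db.executemany("INSERT INTO audio_transcriptions VALUES (?, ?)",
               [("meeting notes", present), ("gym podcast", missing)])

# Any path in the ledger that is gone from disk is a dangling reference.
dangling = [path for (path,) in db.execute(
                "SELECT file_path FROM audio_transcriptions")
            if not os.path.exists(path)]
```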
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
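The archive-and-symlink step described above can be sketched as a small script suitable for cron. The mount point and 30-day threshold are assumptions to adjust for your setup; the demo uses throwaway directories instead of ~/.screenpipe and the NAS.

```python
import os
import shutil
import tempfile
import time

def archive_old_media(local_dir: str, nas_dir: str, max_age_days: float) -> int:
    """Move .mp4 files older than the cutoff to nas_dir and leave a symlink
    behind so database file paths keep resolving. Returns files moved."""
    cutoff = time.time() - max_age_days * 86400
    moved = 0
    for name in os.listdir(local_dir):
        src = os.path.join(local_dir, name)
        if os.path.islink(src) or not name.endswith(".mp4"):
            continue  # skip already-archived files and non-media
        if os.path.getmtime(src) < cutoff:
            dst = os.path.join(nas_dir, name)
            shutil.move(src, dst)
            os.symlink(dst, src)  # database references stay intact
            moved += 1
    return moved

# Demo with throwaway directories and a backdated "old" file:
local, nas = tempfile.mkdtemp(), tempfile.mkdtemp()
old = os.path.join(local, "System Audio (output)_2026-05-11_06-17-14.mp4")
open(old, "wb").close()
os.utime(old, (time.time() - 90 * 86400,) * 2)  # pretend it is 90 days old
count = archive_old_media(local, nas, max_age_days=30)
```

One caveat: make sure the NAS mount is actually available before the script runs, or playback of archived files will hit broken symlinks.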
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:...
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"}]...
|
-6937488583775454825
|
8994143683501809550
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
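The chunking step can be sketched in a few lines. This is an illustrative Python sketch, not ScreenPipe's actual implementation, and the 30-second chunk length is an assumed value:

```python
def chunk_stream(samples, sample_rate, chunk_seconds=30):
    """Split a continuous capture buffer into fixed-length chunks,
    the way the continuous 24/7 stream is broken into pieces."""
    size = sample_rate * chunk_seconds
    return [samples[i:i + size] for i in range(0, len(samples), size)]

# 75 seconds of silent "audio" at 16 kHz -> chunks of 30 s, 30 s, and 15 s
chunks = chunk_stream([0.0] * (16_000 * 75), 16_000)
print([len(c) // 16_000 for c in chunks])  # [30, 30, 15]
```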
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization" (a technical term for speaker identification). It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
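To make the FTS5 claim concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical stand-ins rather than ScreenPipe's real schema, and the rows are invented sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema sketch: only demonstrates the FTS5 MATCH lookup.
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)")
conn.executemany(
    "INSERT INTO transcripts (text, timestamp) VALUES (?, ?)",
    [
        ("let us revisit the RAID rebuild after standup", "2026-05-12T07:41:02"),
        ("reminder to renew the domain tomorrow", "2026-05-12T12:18:40"),
    ],
)
# Full-text search: jump straight to the moment a phrase was spoken.
hit = conn.execute(
    "SELECT timestamp FROM transcripts WHERE transcripts MATCH ?", ("RAID",)
).fetchone()
print(hit[0])
```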
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
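The three stages can be mimicked with an in-memory queue; everything in this sketch is an illustrative stand-in rather than ScreenPipe internals:

```python
import queue

pending = queue.Queue()  # buffering: raw audio chunks wait their turn
committed = []           # stands in for rows committed to the SQLite database

for chunk_id in ("chunk_001", "chunk_002", "chunk_003"):
    pending.put(chunk_id)  # capture layer enqueues faster than we transcribe

while not pending.empty():  # the transcription worker drains the backlog
    chunk = pending.get()
    committed.append((chunk, f"transcript of {chunk}"))  # finalization

print(len(committed))  # 3
```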
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
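Under those assumptions (a ~/.screenpipe directory containing db.sqlite), a quick way to spot an active WIP stage is to look for SQLite sidecar files; the helper name here is hypothetical:

```python
from pathlib import Path

def processing_backlog(screenpipe_dir):
    """List SQLite sidecar files (WAL/journal) whose presence suggests
    the database is being actively written. Assumes the layout described
    above: a db.sqlite file at the top of the ScreenPipe directory."""
    return sorted(p.name for p in Path(screenpipe_dir).glob("db.sqlite-*"))

# Usage (hypothetical path): processing_backlog(Path.home() / ".screenpipe")
```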
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
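A minimal sketch of that soft-foreign-key lookup, using Python's sqlite3 module; the audio_transcriptions table name comes from above, but the column names are illustrative guesses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative shape only; ScreenPipe's real columns may differ.
conn.execute("CREATE TABLE audio_transcriptions (file_path TEXT, transcription TEXT)")
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    ("soundcore AeroClip (input)_2026-05-12_07-40-48.mp4", "discussing the raid array"),
)
# A text hit resolves, via the stored path, to a playable media file.
(path,) = conn.execute(
    "SELECT file_path FROM audio_transcriptions WHERE transcription LIKE ?",
    ("%raid%",),
).fetchone()
print(path)
```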
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
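A hedged sketch of that archive-and-symlink job in Python (the real thing could equally be rsync in a cron job); the directory paths and the 30-day cutoff are assumptions for illustration:

```python
import shutil
import time
from pathlib import Path

def archive_old_media(data_dir, nas_dir, max_age_days=30):
    """Move .mp4 files older than max_age_days to the NAS and leave
    symlinks behind so the database's stored file paths keep resolving."""
    cutoff = time.time() - max_age_days * 86_400
    nas = Path(nas_dir)
    nas.mkdir(parents=True, exist_ok=True)
    moved = 0
    for f in Path(data_dir).glob("*.mp4"):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue  # already archived, or still inside the retention window
        target = nas / f.name
        shutil.move(str(f), str(target))  # works across filesystems (copy+delete)
        f.symlink_to(target)              # original path stays valid for playback
        moved += 1
    return moved
```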
You said
yes I will do that. Is there a way to setup languages to transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
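For the CLI/config route, here is a sketch of what such a pipe.json fragment might look like. The actual key names ScreenPipe reads are not confirmed in this conversation, so treat everything except the ISO codes as hypothetical:

```python
import json

# Hypothetical pipe.json fragment; only the ISO 639-1 codes are standard.
config = {"transcription": {"language": "sk"}}  # "sk" Slovak, "bg" Bulgarian, "en" English
print(json.dumps(config))
```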
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
Screenpipe [archive.db · 3234.2MB]
Screenpipe
[archive.db · 3234.2MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
13...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
Because ScreenPipe prioritizes local processing, the "transfer" step is extremely short: the audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

- The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (You can also configure it to use cloud providers like Deepgram for faster processing, but local Whisper is the standard.)
- Diarization: as it transcribes the text, the engine also performs "diarization" (speaker identification). It analyzes the audio to distinguish your voice from the voices of others, labeling who said what.
- Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
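The FTS5 lookup described above can be demonstrated with a toy table. ScreenPipe's real schema is not shown here; the table and column names below are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Invented schema for illustration; ScreenPipe's real tables differ.
# UNINDEXED keeps the timestamp out of the full-text index.
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, started_at UNINDEXED)")
con.executemany(
    "INSERT INTO transcripts (text, started_at) VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "2026-05-12 09:14"),
        ("the RAID array finished rebuilding overnight", "2026-05-12 11:02"),
    ],
)
# MATCH consults the full-text index instead of scanning every row.
hits = con.execute(
    "SELECT started_at, text FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",),
).fetchall()
print(hits)
```

This is why keyword search over weeks of transcripts stays fast: the index maps each token to the rows containing it, so the query never touches unrelated rows.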
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

- Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper needs CPU/GPU resources to transcribe, so during a rapid, multi-person conversation the raw audio chunks line up in a processing queue.
- Finalization: once the model finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.

4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can tell what has been processed by looking at a few key areas in this directory:

- The SQLite database (db.sqlite): this is the master ledger. If text exists inside this database, the audio has been fully transcribed, diarized, and is "done."
- The data folder: this is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of it as the raw archive.
- Temp files: rapidly changing files, temporary audio chunks, or locked database journals (such as SQLite's -wal or -journal files) indicate the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
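The "done vs. pending" check from section 4 can be approximated in code: compare the media files on disk against the file paths referenced in the database. This is a sketch; it assumes transcription rows carry a file-path column (the later discussion confirms the database stores file-path references), but the table and column names here are invented, and it runs against a throwaway temp directory rather than the real ~/.screenpipe.

```python
import os
import sqlite3
import tempfile

# Sketch: a media file counts as "done" once a transcription row references it.
# Table/column names are invented; ScreenPipe's real schema may differ.
with tempfile.TemporaryDirectory() as data_dir:
    for name in ("a.mp4", "b.mp4", "c.mp4"):
        open(os.path.join(data_dir, name), "w").close()

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE audio_transcriptions (file_path TEXT, text TEXT)")
    con.execute(
        "INSERT INTO audio_transcriptions VALUES (?, ?)",
        (os.path.join(data_dir, "a.mp4"), "hello world"),
    )

    on_disk = {os.path.join(data_dir, f) for f in os.listdir(data_dir)}
    referenced = {r[0] for r in con.execute("SELECT file_path FROM audio_transcriptions")}
    pending = sorted(os.path.basename(p) for p in on_disk - referenced)
    print(pending)  # files with no transcript row yet
```

Files in `pending` are still somewhere in the WIP pipeline; files in both sets are fully processed.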
You said

I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?

Gemini said

Yes, those .mp4 files are absolutely still used after the transcription process is complete.
They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files you still get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly.
Right now, your system might be using a quantized Whisper model to balance CPU load against accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.

How the Database Views Them

If you open your db.sqlite and query the tables (e.g., audio_transcriptions), you will see that the database treats the ~/.screenpipe/data/ folder essentially as a blob-storage backend.
The tables contain columns acting as soft foreign keys that point to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.

What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash; the SQLite database will still hold all your searchable text.
However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

- Built-in garbage collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or once the folder hits a specific GB limit).
- Archiving: if you prefer to keep a permanent, searchable "life log" without eating up local SSD space, you don't have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks.
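The core step of such an archiving job can be sketched as follows, assuming the NAS share is mounted locally. Paths and names here are invented, and the demo runs against temporary directories standing in for ~/.screenpipe/data and the NAS, so it is safe to execute. (Note that symlink creation on Windows may require elevated privileges.)

```python
import os
import shutil
import tempfile

def archive_with_symlink(src, archive_dir):
    """Move a file to the archive and leave a symlink at the old path,
    so database file-path references keep resolving."""
    dest = os.path.join(archive_dir, os.path.basename(src))
    shutil.move(src, dest)
    os.symlink(dest, src)
    return dest

# Demo against temp dirs standing in for the local data folder and the NAS.
with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as nas:
    src = os.path.join(local, "System Audio (output)_2026-05-11_06-17-14.mp4")
    with open(src, "w") as f:
        f.write("fake media")
    archive_with_symlink(src, nas)
    # Reading through the old path follows the symlink to the archived copy.
    linked, content = os.path.islink(src), open(src).read()
    print(linked, content)
```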
This would keep the database references intact while offloading the heavy media storage to your NAS.

You said

yes I will do that.
Is there a way to set up languages for transcription?

Gemini said

That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable.

As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.

How the Transcription Engine Handles Language

By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds, guesses the language, and transcribes the rest of that chunk under that assumption.

- The advantage: if you are watching a tutorial in English and an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
- The drawback: auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation.
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.

How to Force a Specific Language

If you want maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up processing, because the model skips the detection phase.

If you are using the desktop app UI:

1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).

If you are running ScreenPipe via CLI/config: you can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json, or by passing a flag when starting the daemon) to force a specific language code.
You would use the standard ISO language codes: sk (Slovak), bg (Bulgarian), en (English).

If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper "Small" or "Base" instead of "Tiny") if your hardware has the headroom for it.
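The config-file route can be sketched as a small patch script. The file location (~/.screenpipe/pipe.json) comes from the answer above, but the "language" key name is an assumption, not a documented ScreenPipe setting, so verify it against your actual config first. The demo below therefore edits a throwaway copy, never the real file.

```python
import json
import os
import tempfile

def set_transcription_language(config_path, code):
    """Set a language code in a JSON config file.
    NOTE: the "language" key is an assumption, not a documented
    ScreenPipe setting; check your real config's key names first."""
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["language"] = code  # ISO 639-1 code, e.g. "sk", "bg", "en"
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)

# Demonstrate on a throwaway file instead of the real ~/.screenpipe config.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "pipe.json")
    with open(path, "w") as f:
        json.dump({"language": "auto"}, f)
    set_transcription_language(path, "sk")
    with open(path) as f:
        new_code = json.load(f)["language"]
    print(new_code)
```

Restarting the daemon after such an edit would be needed for the change to take effect, since the config is typically read at startup.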
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Screenpipe [archive.db · 3234.2MB]","depth":7,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Screenpipe","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"[archive.db · 3234.2MB]","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Activity","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Search","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Audio","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Work 
Report","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Timetable","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Summary","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Date","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"13","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
1840405176605355464
|
8632612388975815645
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?

Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
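The FTS5 mechanism behind that instant search can be sketched in a few lines of Python using the built-in sqlite3 module (which typically ships with FTS5 compiled in). The table and column names here are invented for the demo, not ScreenPipe's actual schema:

```python
import sqlite3

# Toy FTS5 index standing in for ScreenPipe's transcript table.
# Table/column names ("transcripts", "content", "ts") are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(content, ts UNINDEXED)")
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("we agreed to finish the RAID migration on Friday", "2026-05-12T06:49:17"),
        ("ambient chatter about lunch plans", "2026-05-12T07:40:48"),
    ],
)

# MATCH is the FTS5 full-text operator: it consults the inverted index
# instead of scanning every row, which is why phrase search stays fast
# even across weeks of transcripts.
hits = conn.execute(
    "SELECT ts FROM transcripts WHERE transcripts MATCH ?",
    ('"RAID migration"',),
).fetchall()
print(hits)  # [('2026-05-12T06:49:17',)]
```

The quoted phrase syntax (`"RAID migration"`) asks FTS5 for an exact phrase match rather than matching each word independently.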
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database: This is the master ledger. If text exists inside this database, the audio has been fully transcribed, diarized, and is "done."
The media folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
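That "done vs. in progress" inspection can be approximated by filename alone. In the sketch below, the -wal/-shm/-journal markers are standard SQLite sidecar names that signal an actively written database; the temporary-chunk suffixes are an assumption for illustration:

```python
from pathlib import Path

# SQLite sidecar files that indicate the database is being actively written.
JOURNAL_MARKERS = ("-wal", "-shm", "-journal")
# Assumed suffixes for not-yet-finalized audio chunks (illustrative only).
WIP_SUFFIXES = {".tmp", ".part"}

def classify(filename: str) -> str:
    """Rough guess whether a data-directory entry is finished or still WIP."""
    if filename.endswith(JOURNAL_MARKERS):
        return "wip"   # database journal: transcription backlog in flight
    if Path(filename).suffix in WIP_SUFFIXES:
        return "wip"   # a chunk still waiting on the processing queue
    return "done"      # permanent media or the main database file

print(classify("db.sqlite-wal"))                                  # wip
print(classify("System Audio (output)_2026-05-11_06-17-14.mp4"))  # done
```

Anything this heuristic calls "done" should also have matching rows in the database before you treat it as fully processed.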
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?

Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
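The seek arithmetic implied here is simple: the chunk's start time is encoded in its filename, so the playback offset is just the difference between the transcript's absolute timestamp and that start. A sketch, with timestamps invented for the example:

```python
from datetime import datetime

# Chunk start, parsed from a filename like "..._2026-05-12_07-40-48.mp4".
chunk_start = datetime.strptime("2026-05-12_07-40-48", "%Y-%m-%d_%H-%M-%S")
# Absolute time of the matched phrase, as stored next to the transcript text.
phrase_at = datetime(2026, 5, 12, 7, 42, 3)

# Offset (in seconds) at which a player should begin playback of the chunk.
seek_seconds = (phrase_at - chunk_start).total_seconds()
print(seek_seconds)  # 75.0
```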
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model (or if a radically better speaker diarization algorithm is released), having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
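Because the links are soft, a quick integrity check is possible: walk the file-path column and flag rows whose media is gone. A sketch against an in-memory stand-in; the audio_transcriptions table is named above, but the file_path column name is an assumption:

```python
import sqlite3
from pathlib import Path

# Stand-in for db.sqlite, holding only the columns this demo needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_transcriptions (transcription TEXT, file_path TEXT)")
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    ("hello world", "/definitely/missing/clip.mp4"),
)

# Rows whose soft foreign key points at a file that no longer exists
# would play back silently or log "file not found".
orphans = [
    path
    for (path,) in conn.execute("SELECT DISTINCT file_path FROM audio_transcriptions")
    if not Path(path).exists()
]
print(orphans)  # ['/definitely/missing/clip.mp4']
```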
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
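The move-then-symlink step could look like the sketch below, assuming POSIX symlinks; the function name and paths are invented, and pointing it at real ScreenPipe data (e.g., from a cron job filtering .mp4 files older than 30 days) is left to the reader:

```python
import shutil
from pathlib import Path

def archive_to_nas(local_file: Path, nas_dir: Path) -> Path:
    """Move a recording to NAS storage, leaving a symlink behind so the
    database's stored file path keeps resolving for playback."""
    nas_dir.mkdir(parents=True, exist_ok=True)
    target = nas_dir / local_file.name
    shutil.move(str(local_file), str(target))  # heavy media leaves the SSD
    local_file.symlink_to(target)              # old path still resolves
    return target
```

If the NAS mount ever goes offline, playback for archived files fails the same way as for deleted ones, so the transcripts themselves remain safe either way.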
You said
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json, or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
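For scripting around that config, a tiny lookup covering the three languages discussed can map display names to their ISO 639-1 codes; the "auto" fallback keyword is an assumption about what the config accepts:

```python
# ISO 639-1 codes for the languages mentioned above.
ISO_639_1 = {"Slovak": "sk", "Bulgarian": "bg", "English": "en"}

def language_code(name: str) -> str:
    # Anything unlisted falls back to auto-detect, mirroring the default.
    return ISO_639_1.get(name, "auto")

print(language_code("Slovak"))  # sk
print(language_code("German"))  # auto
```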
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
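The archive-to-NAS idea raised in the conversation (move old media off the SSD, leave symlinks behind so the file paths stored in db.sqlite keep resolving) could be sketched like this. All directory names and the 30-day retention window are placeholders, not real Screenpipe settings; the demo runs against throwaway temp directories.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

# Move .mp4 files older than the cutoff to a "NAS" directory and leave a
# symlink in place, so database path references keep working.
def archive_old_media(data_dir: Path, nas_dir: Path, max_age_days: float) -> int:
    cutoff = time.time() - max_age_days * 86_400
    moved = 0
    for f in sorted(data_dir.glob("*.mp4")):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue                      # already archived, or still recent
        target = nas_dir / f.name
        shutil.move(str(f), str(target))  # copy+delete works across filesystems
        f.symlink_to(target)              # DB file-path references stay valid
        moved += 1
    return moved

# Demo on throwaway directories standing in for ~/.screenpipe/data and the NAS.
data_dir = Path(tempfile.mkdtemp())
nas_dir = Path(tempfile.mkdtemp())
old = data_dir / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.write_bytes(b"fake media")
stamp = time.time() - 90 * 86_400         # pretend the file is 90 days old
os.utime(old, (stamp, stamp))
moved = archive_old_media(data_dir, nas_dir, max_age_days=30)
print(moved, old.is_symlink())
```

Run from cron, the same function would need to skip files ScreenPipe is still writing; checking that the modification time is comfortably in the past (as here) is the simplest guard.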
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
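The FTS5 behavior can be demonstrated with Python's built-in sqlite3 module. The transcripts table and its columns here are invented for illustration; they are not ScreenPipe's real schema.

```python
import sqlite3

# Minimal sketch of FTS5 indexing and search. The table name and columns are
# illustrative assumptions, not ScreenPipe's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)"
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("let's move the retrospective to Thursday", "2026-05-12T09:15:00"),
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("the RAID array resilvered overnight", "2026-05-12T10:02:00"),
)

# MATCH uses the FTS5 index rather than scanning every row, which is what
# makes phrase lookups over weeks of transcripts fast.
rows = conn.execute(
    "SELECT text, timestamp FROM transcripts WHERE transcripts MATCH ?",
    ("retrospective",),
).fetchall()
```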
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
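The buffer-and-queue behavior can be mimicked with a plain FIFO. This is a deliberately simplified toy that assumes nothing about ScreenPipe's real internals:

```python
from collections import deque

# Toy model of the WIP stage: capture enqueues chunks faster than the
# transcriber drains them, so a backlog forms. Chunk names are illustrative.
queue: deque[str] = deque()

def capture(chunk_id: int) -> None:
    queue.append(f"chunk-{chunk_id}")        # buffered, not yet transcribed

def transcribe_one() -> str:
    chunk = queue.popleft()                  # oldest chunk first (FIFO)
    return f"{chunk}: <text committed to SQLite>"

for i in range(5):                           # five chunks captured...
    capture(i)
done = [transcribe_one() for _ in range(3)]  # ...but only three transcribed
backlog = len(queue)                         # two chunks still "in progress"
```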
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
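That text-to-media lookup can be sketched with a tiny in-memory database. The audio_transcriptions table name appears later in this conversation, but the file_path and offset_seconds columns are assumptions for the example.

```python
import sqlite3

# Sketch of mapping a transcript match back to its media file and playback
# offset. Column names (file_path, offset_seconds) are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_transcriptions "
    "(transcription TEXT, file_path TEXT, offset_seconds REAL)"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    (
        "ship the NAS migration script on Friday",
        "/Users/lukas/.screenpipe/data/System Audio (output)_2026-05-11_06-17-14.mp4",
        412.5,
    ),
)

path, offset = conn.execute(
    "SELECT file_path, offset_seconds FROM audio_transcriptions "
    "WHERE transcription LIKE ?",
    ("%NAS migration%",),
).fetchone()
# A player would now open `path` and seek to `offset` seconds.
```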
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
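Before deleting anything, you can audit which database rows already point at missing media. The audio_transcriptions table is named in this conversation; the file_path column is an assumption for the sketch.

```python
import sqlite3
import tempfile
from pathlib import Path

# Sketch of auditing for manually deleted media: list file_path values whose
# target no longer exists on disk. Schema is an illustrative assumption.
def orphaned_paths(db_path: str) -> list[str]:
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT DISTINCT file_path FROM audio_transcriptions")
    return [p for (p,) in rows if not Path(p).exists()]

# Usage against a throwaway database: one media file kept, one "deleted".
tmp = Path(tempfile.mkdtemp())
kept = tmp / "kept.mp4"
kept.touch()
conn = sqlite3.connect(str(tmp / "db.sqlite"))
conn.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")
conn.executemany(
    "INSERT INTO audio_transcriptions VALUES (?)",
    [(str(kept),), (str(tmp / "deleted.mp4"),)],
)
conn.commit()
missing = orphaned_paths(str(tmp / "db.sqlite"))
```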
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
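The move-and-symlink idea can be sketched as below. The directories, the .mp4 glob, and the 30-day cutoff are illustrative; run something like this from cron against the real data directory and NAS mount only after testing.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

# Sketch of the cron-job idea: move media older than N days to an archive
# mount and leave a symlink behind so database paths keep resolving.
def archive_old_media(src_dir: str, dest_dir: str, max_age_days: int = 30) -> list[str]:
    cutoff = time.time() - max_age_days * 86_400
    moved = []
    for f in Path(src_dir).glob("*.mp4"):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue  # already archived, or still fresh
        target = Path(dest_dir) / f.name
        shutil.move(str(f), target)
        f.symlink_to(target)  # DB references to the old path still work
        moved.append(f.name)
    return moved

# Usage with temp dirs standing in for ~/.screenpipe/data and the NAS mount:
src, dest = tempfile.mkdtemp(), tempfile.mkdtemp()
old = Path(src) / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.touch()
os.utime(old, (0, 0))              # pretend this file is very old
(Path(src) / "fresh.mp4").touch()  # recent file, should stay put
moved = archive_old_media(src, dest)
```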
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
Screenpipe [archive.db · 3234.2MB]
Screenpipe
[archive.db · 3234.2MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
13
/
05
/
2026
Calendar
Search across all your screen activity…
Search
AND also
second required term — both must appear in same result (optional)
Source
App
Date
dd
/
mm
/
yyyy
Calendar
(blank = all dates)
From
--
:
--
To
--
:
--
Only apps
any app (blank = all)
▾...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?

Gemini said:

Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.

1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:

- System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: everything you say into your mic.

Because ScreenPipe runs as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.

2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is extremely short: the audio data never leaves your machine unless you explicitly set up a cloud plugin. Here is the transcription pipeline:

- The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe runs OpenAI Whisper locally on your hardware. (Users can also configure cloud providers like Deepgram for faster processing, but local Whisper is the standard.)
- Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
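The FTS5-backed transcript search described above can be sketched with Python's built-in sqlite3 module. The audio_transcriptions table name is mentioned later in this conversation, but the column names here are illustrative assumptions, not ScreenPipe's real schema:

```python
# Minimal sketch of full-text search over transcripts with SQLite FTS5.
# Table/column names are assumptions for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")
# An FTS5 virtual table indexes the transcript text for fast phrase search.
conn.execute(
    "CREATE VIRTUAL TABLE audio_transcriptions "
    "USING fts5(transcription, file_path, timestamp)"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    ("let's move the retro to Thursday",
     "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4",
     "2026-05-12T12:17:23"),
)
# MATCH runs a full-text query; bm25() ranks the hits by relevance.
rows = conn.execute(
    "SELECT file_path, timestamp FROM audio_transcriptions "
    "WHERE audio_transcriptions MATCH ? ORDER BY bm25(audio_transcriptions)",
    ("retro",),
).fetchall()
print(rows)
```

This is why a keyword search can instantly return both the matching text and a pointer back to the media file it came from.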
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

- Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe, so if you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: once the model finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.

4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

- The SQLite database (db.sqlite): this is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
- The data folders: this is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- Temp files: rapidly changing files, temporary audio chunks, or locked database journals (SQLite's -wal or -journal sidecar files) indicate the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
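The "done vs. work in progress" heuristics above can be sketched as a small script. The file-naming conventions (db.sqlite, -wal journals, .mp4 media) follow the description in this conversation; a real ScreenPipe layout may differ:

```python
# Hypothetical sketch: infer "done" vs. "work in progress" from a
# ScreenPipe-style data directory by looking at file names.
from pathlib import Path
import tempfile

def classify(data_dir: Path) -> dict:
    names = [p.name for p in data_dir.iterdir()]
    return {
        # A live journal file suggests the database is mid-write,
        # i.e. transcription is still catching up (the WIP stage).
        "wip": any(n.endswith(("-wal", "-journal")) for n in names),
        # Finished media files form the permanent raw archive.
        "media": sorted(n for n in names if n.endswith(".mp4")),
    }

# Simulate a data directory with one finished recording and an active journal.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()
    (root / "db.sqlite").touch()
    (root / "db.sqlite-wal").touch()
    status = classify(root)
    print(status)
```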
You said:

I see the files in ~/.screenpipe/data/ as "LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4", "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4", "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4", or "System Audio (output)_2026-05-11_06-17-14.mp4". Are these used for anything after transcribing and storing in the SQLite db?

Gemini said:

Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file-path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files you still get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.

2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.

How the Database Views Them

If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob-storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4".

What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash, and the SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

- Built-in garbage collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or once the folder hits a specific GB limit).
- Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
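The archive-and-symlink idea above can be sketched as a small script you could run from cron. The paths, the 30-day cutoff, and the demo directory layout are assumptions; adjust them to your own setup:

```python
# Hypothetical sketch: move old .mp4 recordings to a NAS mount and leave
# symlinks behind so database file-path references keep resolving.
import os
import shutil
import tempfile
import time
from pathlib import Path

def archive_old_media(data_dir: Path, nas_dir: Path, max_age_days: int = 30):
    cutoff = time.time() - max_age_days * 86400
    nas_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in list(data_dir.glob("*.mp4")):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue  # already archived, or still too recent
        dest = nas_dir / f.name
        shutil.move(str(f), str(dest))  # relocate the media file
        f.symlink_to(dest)              # keep the database path resolving
        moved.append(f.name)
    return moved

# Demo against a throwaway directory tree rather than the live data folder.
tmp = Path(tempfile.mkdtemp())
data_dir, nas_dir = tmp / "data", tmp / "nas"
data_dir.mkdir()
old = data_dir / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.touch()
os.utime(old, (0, 0))  # backdate the file past the 30-day cutoff
moved = archive_old_media(data_dir, nas_dir)
print(moved, old.is_symlink())
```

Note that symlinks only help if the app follows them when opening media for playback; verify that before deleting anything.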
You said:

Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said:

That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable.

As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.

How the Transcription Engine Handles Language

By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.

- The advantage: if you are watching a tutorial in English, and an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
- The drawback: auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.

How to Force a Specific Language

If you want maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up processing, because the model skips the detection phase.

If you are using the desktop app UI:

1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).

If you are running ScreenPipe via CLI/config: you can modify the underlying configuration (usually found in ~/.screenpipe/pipe.json, or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes: sk (Slovak), bg (Bulgarian), en (English).

If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper "small" or "base" instead of "tiny") if your hardware has the headroom for it.
Report","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Timetable","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Summary","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Date","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"13","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"05","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Calendar","depth":8,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXTextField","text":"Search across all your screen activity…","depth":8,"on_screen":true,"help_text":"","role_description":"text field","subrole":"AXUnknown","is_enabled":true,"is_focused":true,"is_selected":false},{"role":"AXButton","text":"Search","depth":8,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"AND 
also","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"second required term — both must appear in same result (optional)","depth":8,"on_screen":true,"help_text":"","role_description":"text field","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Source","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"App","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Date","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dd","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"mm","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yyyy","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Calendar","depth":9,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"(blank = all 
dates)","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"From","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":":","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"To","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":":","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Only apps","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"any app (blank = all)","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"▾","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4747785593187487772
|
8632611152025234397
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin. Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
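The kind of lookup FTS5 enables can be sketched with Python's built-in sqlite3 module. The table and column names below are illustrative only, not ScreenPipe's real schema; inspect db.sqlite to see the actual layout:

```python
import sqlite3

# Tiny in-memory FTS5 index, just to show the query shape the UI runs.
# Real ScreenPipe table/column names will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
conn.execute("INSERT INTO transcripts VALUES ('quarterly budget review', '2026-05-12T09:00')")
conn.execute("INSERT INTO transcripts VALUES ('lunch plans', '2026-05-12T12:00')")

# MATCH goes through the full-text index, so it stays fast on years of audio.
rows = conn.execute(
    "SELECT text, timestamp FROM transcripts WHERE transcripts MATCH 'budget'"
).fetchall()
# rows -> [('quarterly budget review', '2026-05-12T09:00')]
```

The same MATCH syntax works from any SQLite client (including the sqlite-web UI visible in this session), provided the build has FTS5 compiled in.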
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
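One way to spot that in-flight state from outside is to look for SQLite's write-ahead-log file next to the database. The -wal suffix is standard SQLite behavior; the db.sqlite name and the location are assumptions, and this is only a rough heuristic, since the -wal file can also linger while the daemon is idle:

```python
from pathlib import Path

def screenpipe_busy_hint(screenpipe_dir: Path) -> bool:
    """Rough heuristic: a non-empty db.sqlite-wal next to the database
    suggests SQLite is holding uncommitted write-ahead-log pages,
    i.e. the transcription pipeline has recently been writing."""
    wal = screenpipe_dir / "db.sqlite-wal"
    return wal.exists() and wal.stat().st_size > 0
```

For example, `screenpipe_busy_hint(Path.home() / ".screenpipe")` returning True is a hint that transcription writes are in flight (or were very recently).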
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
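You can audit those soft references yourself: pull the stored paths out of the database and check which ones still resolve on disk. The table and column names in this sketch (audio_transcriptions, file_path) are assumptions; run .schema in the sqlite3 shell against your db.sqlite to confirm the real ones:

```python
import sqlite3
from pathlib import Path

def find_orphaned_references(db_path: str) -> list[str]:
    """Return transcript file paths recorded in the DB that no longer
    exist on disk. Table/column names are guesses; verify with .schema."""
    conn = sqlite3.connect(db_path)
    paths = [row[0] for row in conn.execute(
        "SELECT DISTINCT file_path FROM audio_transcriptions"
    )]
    conn.close()
    return [p for p in paths if not Path(p).exists()]
```

An empty result means every transcript can still play back its source audio; anything listed is text-only history.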
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
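The move-and-symlink idea can be sketched in a few lines. The directory layout and the 30-day cutoff are placeholders; test it on a copy first, since a wrong move breaks playback for those files:

```python
import shutil
import time
from pathlib import Path

def archive_old_media(src: Path, dest: Path, max_age_days: int = 30) -> int:
    """Move .mp4 files older than max_age_days from src to dest, leaving
    symlinks behind so the absolute paths stored in db.sqlite keep resolving."""
    cutoff = time.time() - max_age_days * 86400
    dest.mkdir(parents=True, exist_ok=True)
    moved = 0
    for f in sorted(src.glob("*.mp4")):      # snapshot order; skip links we made earlier
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue
        target = dest / f.name
        shutil.move(str(f), target)
        f.symlink_to(target)                 # DB path still works via the link
        moved += 1
    return moved
```

Run from cron (or launchd on macOS) with src set to ~/.screenpipe/data and dest set to the NAS mount point; the return value is handy for logging how many files were offloaded.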
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up processing, because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
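Editing that JSON config can be scripted so switching languages for a stretch of days is one command. The ~/.screenpipe/pipe.json path comes from this chat, but the "language" key name is a guess; open your actual file first and match whatever key it really uses:

```python
import json
from pathlib import Path

def set_transcription_language(config_path: Path, code: str) -> dict:
    """Write an ISO 639-1 language code (e.g. 'sk', 'bg', 'en') into a
    ScreenPipe-style JSON config. The "language" key is an assumption;
    confirm it against your real pipe.json before relying on this."""
    cfg = json.loads(config_path.read_text()) if config_path.exists() else {}
    cfg["language"] = code
    config_path.write_text(json.dumps(cfg, indent=2))
    return cfg
```

For example, `set_transcription_language(Path.home() / ".screenpipe" / "pipe.json", "sk")` before a week of Slovak-only meetings, then back to auto-detect when done; restart the daemon so it picks up the change.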
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
Screenpipe [archive.db · 3234.2MB]
Screenpipe
[archive.db · 3234.2MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
13
/
05
/
2026
Calendar
Search across all your screen activity…
Search
AND also
second required term — both must appear in same result (optional)
Source
App
Date
dd
/
mm
/
yyyy
Calendar
(blank = all dates)
From
--
:
--
To
--
:
--
Only apps
any app (blank = all)
▾...
|
39102
|
NULL
|
NULL
|
NULL
|
|
39102
|
1441
|
87
|
2026-05-14T06:36:00.333450+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740560333_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said
yes I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings....
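The per-chunk auto-detect flow described above can be sketched as pseudologic. Both helpers here are stand-in stubs, not ScreenPipe or Whisper code: a real engine scores language probabilities from the audio head itself, while this stub just reads a pre-labeled sample.

```python
# Illustrative sketch: sample the head of each audio chunk, guess a
# language, then transcribe the whole chunk under that assumption.
def detect_language(chunk):
    # Stub -- a real engine (e.g. Whisper) infers this from the audio.
    return chunk["language_hint"]

def transcribe(chunk, language):
    # Stub -- stands in for running the speech-to-text model.
    return f"[{language}] {chunk['speech']}"

def process_stream(chunks):
    """Each chunk gets its own detection pass, so the language can switch
    between an English tutorial and a Bulgarian call mid-stream."""
    return [transcribe(c, detect_language(c)) for c in chunks]

stream = [
    {"language_hint": "en", "speech": "welcome to the tutorial"},
    {"language_hint": "bg", "speech": "здравей, чуваш ли ме?"},
]
print(process_stream(stream))
```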
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-8707066515858440220
|
8632826645584702421
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure cloud providers like Deepgram for faster processing, but local Whisper is the standard.)
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
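As a rough sketch of what that FTS5 lookup looks like, here is a self-contained example using Python's built-in sqlite3 module. The table and column names (audio_transcriptions_fts, transcription) are illustrative assumptions, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory stand-in for ScreenPipe's db.sqlite; the real schema differs.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE VIRTUAL TABLE audio_transcriptions_fts "
    "USING fts5(transcription, timestamp UNINDEXED)"
)
con.execute(
    "INSERT INTO audio_transcriptions_fts VALUES (?, ?)",
    ("we agreed to ship the RAID migration next sprint", "2026-05-12T07:41:03"),
)

# MATCH hits the FTS5 index, so this stays fast even over months of audio.
rows = con.execute(
    "SELECT timestamp, transcription FROM audio_transcriptions_fts "
    "WHERE audio_transcriptions_fts MATCH ?",
    ("migration",),
).fetchall()
print(rows[0][0])  # → 2026-05-12T07:41:03
```

The point of the virtual table is that MATCH consults an inverted index instead of scanning every row, which is what makes "search three weeks of meetings instantly" feasible.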
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
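The buffer → queue → commit flow above can be sketched as a tiny producer/consumer loop. Everything here (the chunk contents, the transcribe stub, the 30-second chunk length) is a stand-in for illustration, not ScreenPipe's actual code:

```python
import queue
import sqlite3

def transcribe(chunk: bytes) -> str:
    # Stand-in for Whisper; the real engine burns CPU/GPU time here.
    return f"transcript of {len(chunk)} bytes"

# 1. Buffering: captured chunks pile up in a queue while the engine works.
pending: "queue.Queue[bytes]" = queue.Queue()
for chunk in (b"chunk-a", b"chunk-bb", b"chunk-ccc"):
    pending.put(chunk)

# 2+3. Processing queue and finalization: each finished chunk is
# committed to the database alongside a timestamp.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE audio_transcriptions (ts INTEGER, text TEXT)")
ts = 0
while not pending.empty():
    text = transcribe(pending.get())
    con.execute("INSERT INTO audio_transcriptions VALUES (?, ?)", (ts, text))
    ts += 30  # assumed 30-second chunks
con.commit()

count = con.execute("SELECT COUNT(*) FROM audio_transcriptions").fetchone()[0]
print(count)  # → 3
```

The queue is exactly where the backlog lives during a rapid multi-person conversation: chunks arrive faster than the model can drain them.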
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite):
This is the master ledger. If text exists inside this database, the audio has been fully transcribed, diarized, and is "done."
The data/ folder:
This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary audio chunks, or locked database journals, the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
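One practical way to spot the WIP stage from the outside is file modification times: a media file still being written to has a very recent mtime, while finalized files go quiet. A minimal sketch (the 5-minute "quiet" threshold is an arbitrary assumption, and the demo uses a throwaway directory instead of ~/.screenpipe/data/):

```python
import os
import time
from pathlib import Path

def split_done_vs_wip(data_dir: Path, quiet_secs: int = 300):
    """Files untouched for `quiet_secs` are likely finalized; fresher
    ones are probably still being written or queued for transcription."""
    done, wip = [], []
    now = time.time()
    for f in data_dir.rglob("*.mp4"):
        (wip if now - f.stat().st_mtime < quiet_secs else done).append(f)
    return done, wip

# Demo against a throwaway directory with one old and one fresh file.
demo = Path("demo_screenpipe_data")
demo.mkdir(exist_ok=True)
old = demo / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.write_bytes(b"")
os.utime(old, (time.time() - 3600, time.time() - 3600))  # backdate 1 h
fresh = demo / "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"
fresh.write_bytes(b"")

done, wip = split_done_vs_wip(demo)
print(len(done), len(wip))  # → 1 1
```

Cross-checking the "done" list against rows in db.sqlite would then tell you which files are transcribed versus merely recorded.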
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files you still get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
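Since each filename encodes the recording's start time, the playback offset for a search hit is just the transcript timestamp minus the parsed start time. A sketch using the filenames from above (the actual seek would be handed to a media player; nothing here is ScreenPipe's real code):

```python
from datetime import datetime

def playback_offset(filename: str, spoken_at: datetime) -> float:
    """Seconds into the file at which a transcript hit should start playing.
    Assumes the '..._YYYY-MM-DD_HH-MM-SS.mp4' naming seen in ~/.screenpipe/data/."""
    stamp = filename.rsplit(".", 1)[0][-19:]  # e.g. '2026-05-12_07-40-48'
    started = datetime.strptime(stamp, "%Y-%m-%d_%H-%M-%S")
    return (spoken_at - started).total_seconds()

# A keyword spoken at 07:42:18 lands 90 seconds into this recording.
offset = playback_offset(
    "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4",
    datetime(2026, 5, 12, 7, 42, 18),
)
print(offset)  # → 90.0
```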
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you open your db.sqlite and query the tables (e.g., audio_transcriptions), you will see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
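Before (or after) deleting anything, you can list which transcript rows would lose their audio by checking each referenced filename against the disk. A sketch with a made-up one-column table; the real schema and column names differ:

```python
import sqlite3
from pathlib import Path

def dangling_refs(db_path: str, data_dir: Path) -> list:
    """Return referenced filenames that no longer exist on disk.
    'audio_files(file_path)' is an illustrative schema, not ScreenPipe's."""
    con = sqlite3.connect(db_path)
    refs = [r[0] for r in con.execute("SELECT file_path FROM audio_files")]
    return [r for r in refs if not (data_dir / r).exists()]

# Demo: one file kept on disk, one reference left dangling after an `rm`.
data = Path("demo_data")
data.mkdir(exist_ok=True)
(data / "kept.mp4").write_bytes(b"")
con = sqlite3.connect("demo_refs.db")
con.execute("CREATE TABLE IF NOT EXISTS audio_files (file_path TEXT)")
con.execute("DELETE FROM audio_files")
con.executemany(
    "INSERT INTO audio_files VALUES (?)", [("kept.mp4",), ("deleted.mp4",)]
)
con.commit()
print(dangling_refs("demo_refs.db", data))  # → ['deleted.mp4']
```

Running something like this periodically turns the "silent playback failure" into a visible report.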
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or once the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job that regularly moves older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
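That cron-driven archiving step can be sketched as: move .mp4 files older than a cutoff to the NAS mount, then leave a symlink behind so the file paths stored in db.sqlite keep resolving. The paths and the 30-day cutoff below are assumptions for illustration (and the demo backdates a throwaway file instead of touching real data):

```python
import os
import shutil
import time
from pathlib import Path

def archive_old_media(local: Path, nas: Path, max_age_days: int = 30) -> int:
    """Move .mp4 files older than `max_age_days` to `nas`, symlinking back
    so database file-path references stay valid. Returns count moved."""
    cutoff = time.time() - max_age_days * 86400
    moved = 0
    for f in local.glob("*.mp4"):
        if f.is_symlink() or f.stat().st_mtime >= cutoff:
            continue  # already archived, or still within retention
        target = nas / f.name
        shutil.move(str(f), str(target))   # heavy bytes go to the RAID array
        f.symlink_to(target.resolve())     # playback still finds the old path
        moved += 1
    return moved

# Demo with throwaway dirs standing in for ~/.screenpipe/data/ and the NAS.
local, nas = Path("demo_local"), Path("demo_nas")
local.mkdir(exist_ok=True)
nas.mkdir(exist_ok=True)
old = local / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.write_bytes(b"audio")
os.utime(old, (time.time() - 40 * 86400,) * 2)  # backdate 40 days

n = archive_old_media(local, nas)
print(n, old.is_symlink())  # → 1 True
```

Reads through the old path still work after the move, because the symlink points at the file's new home on the NAS.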
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings....
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it transferred and transcribed? Is there a work-in-progress stage? And how can I tell from the folder structure what has and hasn't been processed?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
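The FTS5 lookup described above can be sketched with Python's built-in sqlite3 module. The table and column names below are invented for illustration; they are not Screenpipe's actual schema, only a demonstration of how an FTS5 index makes phrase search instant.

```python
import sqlite3

# Illustrative only: table/column names are made up to show the FTS5
# mechanism, not copied from Screenpipe's real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts_fts USING fts5(text, timestamp UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts_fts (text, timestamp) VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "2026-05-12T06:49:17"),
        ("the RAID array is rebuilding overnight", "2026-05-12T12:17:23"),
    ],
)

# MATCH uses the full-text index, so this stays fast even with months of audio.
rows = conn.execute(
    "SELECT text, timestamp FROM transcripts_fts WHERE transcripts_fts MATCH ?",
    ("deadline",),
).fetchall()
print(rows)
```

The `UNINDEXED` column keeps the timestamp out of the full-text index while still returning it with each hit, which is the pattern that lets a search result jump straight to a moment in time.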
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
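The three stages above form a classic producer/consumer pipeline. This toy sketch stands in for the real thing: "transcribe" here is just a string operation, where the real engine would invoke Whisper, but the queue is exactly where the WIP backlog lives when capture outpaces transcription.

```python
import queue
import threading

# Toy sketch of the capture -> queue -> transcribe flow. The "transcription"
# is a placeholder; only the queuing behavior is the point.
chunks = queue.Queue()
transcripts = []

def transcribe_worker():
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: capture has stopped
            break
        # Stand-in for the expensive Whisper call on one audio chunk.
        transcripts.append(f"transcript of {chunk}")

worker = threading.Thread(target=transcribe_worker)
worker.start()

# The capture side keeps producing chunks no matter how busy the worker is;
# whatever is still sitting in the queue is the WIP backlog.
for name in ["chunk-001.wav", "chunk-002.wav", "chunk-003.wav"]:
    chunks.put(name)

chunks.put(None)
worker.join()
print(transcripts)
```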
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunk files, or locked database journals (like db.sqlite-wal), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
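A quick way to see "done" versus "in flight" is to classify what is sitting in the directory. The layout below is assumed from the filenames discussed in this conversation, not from official documentation; the sketch builds a fake directory in a temp folder so it runs anywhere.

```python
import tempfile
from pathlib import Path

# Build a fake Screenpipe-style directory (layout assumed, not documented):
# finished media lives under data/ as .mp4, while a live SQLite journal
# (db.sqlite-wal) signals writes still in flight.
root = Path(tempfile.mkdtemp()) / ".screenpipe"
(root / "data").mkdir(parents=True)
for name in [
    "data/System Audio (output)_2026-05-11_06-17-14.mp4",
    "data/MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4",
    "db.sqlite",
    "db.sqlite-wal",
]:
    (root / name).touch()

# "Done" archive: permanent media files.
media = sorted(p.name for p in (root / "data").glob("*.mp4"))
# WIP signal: journal files next to the database.
wip = [p.name for p in root.glob("db.sqlite-*")]
print(media, wip)
```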
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
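Your own filenames embed the recording's start time, so the "exact second" seek reduces to simple timestamp arithmetic. This is a hedged sketch of that idea (the real player may compute the offset differently): parse the start time out of the filename, subtract it from the transcript's timestamp, and that difference is the seek position.

```python
from datetime import datetime

def seek_offset_seconds(filename: str, spoken_at: datetime) -> float:
    """Offset into the media file where a transcript timestamp falls.

    Assumes filenames shaped like
    'System Audio (output)_2026-05-11_06-17-14.mp4', i.e. the last two
    underscore-separated fields are the recording's start date and time.
    """
    parts = filename.rsplit("_", 2)  # [device label, date, time.mp4]
    started = datetime.strptime(
        parts[1] + " " + parts[2].removesuffix(".mp4"),
        "%Y-%m-%d %H-%M-%S",
    )
    return (spoken_at - started).total_seconds()

offset = seek_offset_seconds(
    "System Audio (output)_2026-05-11_06-17-14.mp4",
    datetime(2026, 5, 11, 6, 20, 0),  # when the keyword was spoken
)
print(offset)  # 166.0 seconds into the file
```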
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
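Before deleting anything, you can audit which referenced media files still exist on disk. This sketch uses a made-up one-column table and temp files so it runs standalone; against a real database you would point the same loop at whichever table holds the file paths.

```python
import sqlite3
import tempfile
from pathlib import Path

# Fake setup: one file kept, one "deleted". The media table is a stand-in
# for whatever table actually stores file paths.
tmp = Path(tempfile.mkdtemp())
kept = tmp / "kept.mp4"
kept.touch()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (file_path TEXT)")
conn.executemany(
    "INSERT INTO media VALUES (?)",
    [(str(kept),), (str(tmp / "deleted.mp4"),)],
)

# Any path in the DB without a file behind it will fail playback later.
missing = [
    p for (p,) in conn.execute("SELECT file_path FROM media")
    if not Path(p).exists()
]
print(missing)
```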
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
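The move-then-symlink step of that cron job can be sketched like this. The paths here are temp directories standing in for the real SSD data folder and NAS mount; the key property is that the original path still resolves afterwards, so database references keep working.

```python
import shutil
import tempfile
from pathlib import Path

# Temp dirs stand in for ~/.screenpipe/data (SSD) and the NAS mount.
local = Path(tempfile.mkdtemp()) / "data"
nas = Path(tempfile.mkdtemp()) / "screenpipe-archive"
local.mkdir()
nas.mkdir()

src = local / "System Audio (output)_2026-05-11_06-17-14.mp4"
src.write_bytes(b"fake media")

dest = nas / src.name
shutil.move(str(src), str(dest))  # offload the bytes to the NAS
src.symlink_to(dest)              # leave a symlink at the original path

# The DB's stored path still resolves, transparently, through the symlink.
print(src.is_symlink(), src.read_bytes() == b"fake media")
```

In a real cron job you would filter on file age (e.g., mtime older than 30 days) before moving, and skip files the transcription engine might still be writing.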
You said
Yes, I will do that. Is there a way to set up languages for transcription?...
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"}]...
|
-3608598387732941822
|
8487585703634021333
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has and hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe runs as a 24/7 background process, it continuously listens to these streams and splits the audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure cloud providers like Deepgram for faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", the technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
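The FTS5 mechanics can be sketched in a few lines with Python's built-in sqlite3 module. The table and column names below are illustrative stand-ins, not ScreenPipe's actual schema (most CPython builds ship SQLite with FTS5 compiled in):

```python
import sqlite3

# Minimal sketch of FTS5 search using an in-memory database. The table
# and column names are illustrative, not ScreenPipe's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("let's push the deadline to Friday", "2026-04-21T09:14:00"),
        ("the RAID resync finished overnight", "2026-04-22T10:02:00"),
    ],
)

# MATCH runs a tokenized lookup against the full-text index, which stays
# fast even across months of accumulated transcripts.
rows = conn.execute(
    "SELECT text, timestamp FROM transcripts WHERE transcripts MATCH ?",
    ("deadline",),
).fetchall()
```

Unlike a LIKE scan, the MATCH query consults a prebuilt inverted index, which is why search stays instant as the archive grows.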
3. The "Work in Progress" (WIP) Stage
There is a slight lag between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the model finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
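The buffer-queue-commit flow above can be modeled as a toy producer/consumer sketch. Here transcribe() is a stand-in for a local Whisper call and the committed list stands in for the SQLite write; nothing in this snippet is ScreenPipe's real code:

```python
import queue
import threading

chunks = queue.Queue()   # buffered audio waiting for transcription
committed = []           # stand-in for rows written to SQLite

def transcribe(chunk):
    # Placeholder for the CPU/GPU-bound speech-to-text step.
    return f"transcript of {len(chunk)}-byte chunk"

def worker():
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: the recorder shut down
            break
        committed.append(transcribe(chunk))  # the "finalization" step

t = threading.Thread(target=worker)
t.start()
# Recorder side: during a busy conversation, chunks can pile up in the
# queue faster than the worker drains them.
for c in (b"\x00" * 160, b"\x00" * 320):
    chunks.put(c)
chunks.put(None)
t.join()
```

The queue is exactly the WIP stage described above: whatever is still sitting in it has been heard but not yet saved as text.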
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of it as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
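One way to check "done vs. not done" from the outside is to compare the media files on disk against the paths the database already references. This is a hedged sketch: the table and column names (audio_chunks, file_path) are assumptions for illustration, not a verified ScreenPipe schema:

```python
import sqlite3
from pathlib import Path

def pending_media(db_path: str, data_dir: str) -> set[str]:
    """Return media files on disk that the database does not reference yet.

    Assumes a table audio_chunks with a file_path column; both names are
    hypothetical stand-ins for whatever schema your install actually uses.
    """
    conn = sqlite3.connect(db_path)
    try:
        referenced = {row[0] for row in
                      conn.execute("SELECT file_path FROM audio_chunks")}
    finally:
        conn.close()
    on_disk = {str(p) for p in Path(data_dir).glob("*.mp4")}
    # Anything on disk but absent from the ledger is still WIP (or orphaned).
    return on_disk - referenced
```

An empty result would mean the engine has caught up and every recording on disk is indexed.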
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files you still get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
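A sketch of that lookup, assuming a table named audio_transcriptions with transcription, file_path, and timestamp columns (hypothetical names; check your own db.sqlite for the real schema):

```python
import sqlite3

def locate_audio(conn: sqlite3.Connection, phrase: str) -> list[tuple]:
    """Map a searched phrase back to the media file it was transcribed from.

    The table audio_transcriptions and its columns are assumed names for
    illustration, not a verified ScreenPipe schema.
    """
    return conn.execute(
        "SELECT file_path, timestamp FROM audio_transcriptions "
        "WHERE transcription LIKE ?",
        (f"%{phrase}%",),
    ).fetchall()
```

The returned file path is exactly the soft foreign key described above: the database hands it back, and the player opens that file from the data folder.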
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or log a "file not found" error in the background.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
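The core step of such a cron job might look like the following sketch. The paths and age threshold are whatever your setup needs; nothing here is built-in ScreenPipe tooling:

```python
import shutil
import time
from pathlib import Path

def archive_old_media(data_dir: Path, archive_dir: Path,
                      max_age_days: float) -> list[str]:
    """Move media older than max_age_days to archive_dir, leaving symlinks.

    Because a symlink keeps the original path resolvable, file-path
    references stored in the database continue to work after the move.
    """
    cutoff = time.time() - max_age_days * 86400
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(data_dir.glob("*.mp4")):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue  # already archived, or still too recent
        target = archive_dir / f.name
        shutil.move(str(f), str(target))
        f.symlink_to(target)  # the DB still finds the audio through the link
        moved.append(f.name)
    return moved
```

Run from cron with data_dir on the local SSD and archive_dir on the NAS mount; playback keeps working as long as the NAS is reachable.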
You said
Yes, I will do that. Is there a way to set up languages for transcription?...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
39096
|
1441
|
84
|
2026-05-14T06:35:56.321703+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740556321_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", the technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
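As a sketch of that FTS5 mechanism, here is a throwaway sqlite3 session; the transcripts table and its columns are invented for illustration, not ScreenPipe's actual schema:

```shell
# Scratch database demonstrating FTS5 full-text search (illustrative schema).
db=$(mktemp)
sqlite3 "$db" <<'SQL'
CREATE VIRTUAL TABLE transcripts USING fts5(body, ts);
INSERT INTO transcripts VALUES ('we agreed to ship the release on friday', '2026-05-12 09:15');
INSERT INTO transcripts VALUES ('lunch order for the team', '2026-05-12 12:01');
SQL
# MATCH searches the indexed text and returns only rows containing the term.
hit=$(sqlite3 "$db" "SELECT ts FROM transcripts WHERE transcripts MATCH 'release';")
echo "$hit"
rm -f "$db"
```

The same MATCH-style query against the real database is what makes "find that phrase from three weeks ago" an instant index lookup rather than a scan of raw audio.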
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data (or similar media) folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like a -wal or -journal file), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
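One hedged way to eyeball that WIP state from a shell, sketched here against a throwaway directory (point dir at ~/.screenpipe to inspect the real one; the -wal/-journal suffixes are generic SQLite naming conventions, not something ScreenPipe itself documents):

```shell
# Throwaway directory standing in for ~/.screenpipe (assumption: the real
# layout keeps the database and media files side by side).
dir=$(mktemp -d)
touch "$dir/db.sqlite" "$dir/db.sqlite-wal" "$dir/System Audio (output)_2026-05-11_06-17-14.mp4"
# SQLite leaves -wal / -journal files while writes are in flight; their
# presence suggests transcriptions are still being committed.
journals=$(find "$dir" \( -name '*-wal' -o -name '*-journal' \) | wc -l | tr -d ' ')
echo "journal files: $journals"
rm -rf "$dir"
```

A nonzero count while the daemon is running is normal; a journal file that lingers after the daemon stops is the more telling sign of an unfinished backlog.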
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
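That soft-foreign-key arrangement can be mimicked in a scratch database; the audio_transcriptions table name comes from the text above, but the column names here are assumptions, not ScreenPipe's real schema:

```shell
# Scratch database: transcript rows point at media files by path only.
db=$(mktemp)
sqlite3 "$db" <<'SQL'
-- Illustrative columns; the actual schema may differ.
CREATE TABLE audio_transcriptions (id INTEGER PRIMARY KEY, transcription TEXT, file_path TEXT);
INSERT INTO audio_transcriptions (transcription, file_path)
VALUES ('notes from the standup', 'soundcore AeroClip (input)_2026-05-12_07-40-48.mp4');
SQL
# The path column is the "soft foreign key": nothing in the database
# enforces that the file still exists on disk.
path=$(sqlite3 "$db" "SELECT file_path FROM audio_transcriptions WHERE transcription LIKE '%standup%';")
echo "$path"
rm -f "$db"
```

Because the link is just a stored string, deleting or moving a media file never fails a database constraint; playback simply breaks, which is exactly the behavior described below.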
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
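A minimal, runnable sketch of that move-and-symlink idea, with two temp directories standing in for ~/.screenpipe/data and the NAS mount (a real cron job would add an age filter such as find -mtime +30 before moving anything):

```shell
data=$(mktemp -d)   # stands in for ~/.screenpipe/data (demo assumption)
nas=$(mktemp -d)    # stands in for the RAID 5 / NAS mount point
touch "$data/System Audio (output)_2026-05-11_06-17-14.mp4"
# Move each media file to the NAS and leave a symlink behind, so the
# paths stored in db.sqlite keep resolving.
for f in "$data"/*.mp4; do
  mv "$f" "$nas/"
  ln -s "$nas/$(basename "$f")" "$f"
done
link_ok=no
if [ -L "$data/System Audio (output)_2026-05-11_06-17-14.mp4" ]; then link_ok=yes; fi
echo "$link_ok"
rm -rf "$data" "$nas"
```

Note the quoting throughout: ScreenPipe's filenames contain spaces and parentheses, so unquoted variables in a naive script would break on the very first file.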
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes: sk (Slovak), bg (Bulgarian), en (English).
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false}]...
|
4578207016909166423
|
8632611152042027989
|
click
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
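The chunking step can be sketched as plain logic: a continuous stream of samples split into fixed-duration segments. This is a toy illustration, not ScreenPipe's actual code; the 30-second chunk length and 16 kHz sample rate are assumptions.

```python
# Toy sketch: split a continuous stream of audio samples into
# fixed-duration chunks, the way a 24/7 recorder hands work to a
# transcription engine. Chunk length is an assumed parameter.

SAMPLE_RATE = 16_000    # samples per second (typical Whisper input rate)
CHUNK_SECONDS = 30      # assumed chunk duration

def chunk_stream(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Yield consecutive fixed-length chunks; the last one may be shorter."""
    step = sample_rate * chunk_seconds
    for start in range(0, len(samples), step):
        yield samples[start:start + step]

# 65 seconds of silence -> three chunks: 30 s, 30 s, 5 s
stream = [0.0] * (SAMPLE_RATE * 65)
chunks = list(chunk_stream(stream))
print([len(c) // SAMPLE_RATE for c in chunks])  # -> [30, 30, 5]
```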
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
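The FTS5 mechanism can be demonstrated with Python's built-in sqlite3 module. The schema below is a simplified stand-in, not ScreenPipe's real one; the table and column names are assumptions for illustration.

```python
import sqlite3

# Minimal FTS5 demo: index transcript text, then search it instantly.
# Toy schema; ScreenPipe's actual tables and columns may differ.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, file_path, ts)"
)
con.execute(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    ("let's move the backlog to the RAID array",
     "System Audio (output)_2026-05-11_06-17-14.mp4",
     "2026-05-11T06:17:14"),
)
# MATCH uses the full-text index instead of scanning every row.
row = con.execute(
    "SELECT file_path FROM transcripts WHERE transcripts MATCH 'backlog'"
).fetchone()
print(row[0])  # -> System Audio (output)_2026-05-11_06-17-14.mp4
```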
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
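The buffer, queue, and finalize steps can be sketched with a standard producer/consumer queue. Everything here (the transcribe stub, the chunk format) is illustrative, not ScreenPipe's implementation.

```python
import queue

# Sketch of the WIP stage: chunks wait in a queue until the engine
# catches up, then each result is "committed" with its timestamp.
pending = queue.Queue()   # the processing backlog (WIP)
database = []             # stand-in for the SQLite commit

def transcribe(chunk_audio):
    # Stand-in for a real Whisper call.
    return f"<text for {len(chunk_audio)} samples>"

# Producer: the recorder enqueues raw chunks as they are captured.
for ts in ("06:49:17", "06:49:47", "06:50:17"):
    pending.put({"timestamp": ts, "audio": [0.0] * 16_000})

# Consumer: drain the backlog, committing each finished chunk.
while not pending.empty():
    chunk = pending.get()
    database.append((chunk["timestamp"], transcribe(chunk["audio"])))

print(len(database))  # -> 3; an empty queue means no WIP left
```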
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folder: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like a db.sqlite-wal file), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo...
You said:
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?

Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
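The chunking step described above can be sketched in a few lines. The sample rate and chunk length here are toy values for illustration, not ScreenPipe's actual settings:

```python
# Sketch of splitting a continuous sample stream into fixed-duration chunks,
# the way a recorder hands audio to a transcription engine. Values are
# illustrative; ScreenPipe's real chunk length and sample rate may differ.
def chunk_samples(samples, sample_rate, chunk_seconds):
    size = sample_rate * chunk_seconds
    for start in range(0, len(samples), size):
        yield samples[start:start + size]

# 10 "seconds" of fake audio at a toy 4 Hz sample rate, in 3-second chunks.
stream = list(range(40))
chunks = list(chunk_samples(stream, sample_rate=4, chunk_seconds=3))
print([len(c) for c in chunks])  # [12, 12, 12, 4]
```

The final chunk is shorter than the rest, which is why real pipelines timestamp each chunk rather than assuming a fixed length.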
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
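To make the FTS5 point concrete, here is a minimal, self-contained sketch of the search pattern, using an in-memory database. The table and column names are illustrative, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory demo of the FTS5 full-text-search pattern. The table name
# "transcripts" and its columns are made up for this example.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
con.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("let's move the deadline to Friday", "2026-05-12T10:00:00Z"),
)
con.execute(
    "INSERT INTO transcripts VALUES (?, ?)",
    ("the RAID array finished rebuilding", "2026-05-12T11:30:00Z"),
)

# MATCH performs a full-text search over the indexed columns.
rows = con.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH 'deadline'"
).fetchall()
print(rows)  # only the row containing "deadline"
```

Because the index is built as rows are inserted, a keyword lookup stays fast even after months of continuous recording.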
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
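The three stages can be mimicked with a toy producer/consumer queue. Everything here (the names, the fake "transcribe" step) is illustrative rather than ScreenPipe internals:

```python
import queue

# Toy sketch of the WIP pipeline: chunks wait in a thread-safe queue until
# the transcriber catches up, then results are "committed". The transcribe
# step is a stand-in string, not a real Whisper call.
work = queue.Queue()
for chunk_id in range(3):          # buffering: chunks arrive continuously
    work.put(f"chunk-{chunk_id}")

committed = []
while not work.empty():            # processing queue drains as CPU frees up
    chunk = work.get()
    text = f"transcript of {chunk}"   # stand-in for the Whisper call
    committed.append(text)            # finalization: write to SQLite
print(committed)
```

In the real system the producer and consumer run concurrently, which is exactly why a backlog of temporary chunk files can appear on disk.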
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp files: If you see rapidly changing files, temporary chunk files, or locked database journals (SQLite -wal or -journal files), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?

Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file-path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker-diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob-storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
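A small script can audit exactly this situation: list every media path the database references and report the ones that no longer exist on disk. The table and column names (audio_transcriptions, file_path) follow the discussion above but are assumptions about the actual ScreenPipe schema, so the demo builds its own throwaway database:

```python
import os
import sqlite3
import tempfile

# Find transcript rows whose referenced media file is missing from disk.
# Schema names here are assumed, not ScreenPipe's verified layout.
def find_orphaned(db_path):
    con = sqlite3.connect(db_path)
    rows = con.execute("SELECT file_path FROM audio_transcriptions").fetchall()
    con.close()
    return [p for (p,) in rows if not os.path.exists(p)]

# Self-contained demo: one file that exists, one reference left dangling.
with tempfile.TemporaryDirectory() as d:
    keep = os.path.join(d, "kept.mp4")
    open(keep, "w").close()
    db = os.path.join(d, "demo.sqlite")
    con = sqlite3.connect(db)
    con.execute("CREATE TABLE audio_transcriptions (file_path TEXT, text TEXT)")
    con.execute("INSERT INTO audio_transcriptions VALUES (?, 'hello')", (keep,))
    con.execute(
        "INSERT INTO audio_transcriptions VALUES (?, 'gone')",
        (os.path.join(d, "deleted.mp4"),),
    )
    con.commit()
    con.close()
    orphans = find_orphaned(db)
    print(orphans)  # only the deleted.mp4 path
```

Running something like this before and after a cleanup pass tells you exactly which transcripts have lost their playback audio.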
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
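The move-then-symlink idea can be sketched as follows. The paths, age threshold, and directory layout are demo values, not ScreenPipe's real configuration:

```python
import os
import shutil
import tempfile
import time

# Relocate media files older than a cutoff to an archive directory and leave
# a symlink behind, so paths stored in the database keep resolving.
def archive_old_media(src_dir, archive_dir, max_age_days=30):
    cutoff = time.time() - max_age_days * 86400
    moved = []
    for name in os.listdir(src_dir):
        path = os.path.join(src_dir, name)
        if os.path.isfile(path) and not os.path.islink(path):
            if os.path.getmtime(path) < cutoff:
                dest = os.path.join(archive_dir, name)
                shutil.move(path, dest)
                os.symlink(dest, path)  # old path still works
                moved.append(name)
    return moved

# Demo on throwaway directories with one artificially old file.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as nas:
    old = os.path.join(src, "old.mp4")
    open(old, "w").close()
    os.utime(old, (0, 0))  # pretend it is from 1970
    open(os.path.join(src, "new.mp4"), "w").close()
    moved = archive_old_media(src, nas)
    print(moved)                 # ['old.mp4']
    print(os.path.islink(old))   # True
```

A cron entry could then run the real version nightly against the actual data directory and the NAS mount point; the symlink step is what keeps playback working after the move.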
You said:
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said:
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO 639-1 language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does the audio get transferred and transcribed? Is there a work-in-progress stage? And how can I tell from the folder structure what has and hasn't been processed?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
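The chunking step can be sketched in miniature. Real chunk lengths and audio formats are ScreenPipe internals not covered in this conversation; this toy version just shows the idea of slicing a continuous stream into fixed-size pieces:

```python
# Minimal sketch of splitting a continuous stream into fixed-length chunks,
# as the recorder does before handing audio to the transcriber. Chunk sizes
# and audio formats here are illustrative, not ScreenPipe's actual values.
def chunk(samples, chunk_len):
    """Split a flat list of samples into consecutive pieces of chunk_len."""
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]

stream = list(range(10))   # stand-in for a continuous audio stream
print(chunk(stream, 4))    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```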
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
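As a rough sketch of what FTS5 indexing buys you, here is a throwaway in-memory example. The table and column names are hypothetical, chosen for illustration only; they do not reflect ScreenPipe's actual schema:

```python
import sqlite3

# In-memory demo of FTS5 phrase search; "transcripts" and its columns are
# made-up names, not ScreenPipe's real schema.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(timestamp, device, text)")
con.executemany(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    [
        ("2026-05-12T06:49:17", "bose qc35 II (input)", "let us move the retro to Friday"),
        ("2026-05-12T12:17:23", "MacBook Pro Microphone (input)", "the RAID array is still rebuilding"),
    ],
)

def search(phrase):
    # MATCH hits the full-text index, so lookups stay fast over weeks of audio.
    return con.execute(
        "SELECT timestamp, device FROM transcripts WHERE transcripts MATCH ?",
        (phrase,),
    ).fetchall()

print(search("retro"))  # only the chunk that mentioned the retro comes back
```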
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
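The buffer, queue, and commit steps above amount to a producer/consumer pipeline. This sketch fakes the transcription call, since the real Whisper invocation is outside the scope of this conversation:

```python
import queue
import threading

# Toy model of the WIP stage: chunks buffer up in a queue while a single
# worker "transcribes" them and commits the results. transcribe() is a
# stand-in for the real Whisper call.
chunks = queue.Queue()   # the processing queue of raw audio chunks
committed = []           # stands in for rows written to the SQLite database

def transcribe(chunk_name):
    return f"transcript of {chunk_name}"

def worker():
    while True:
        chunk_name = chunks.get()
        if chunk_name is None:                    # sentinel: recorder stopped
            break
        committed.append(transcribe(chunk_name))  # finalization step

t = threading.Thread(target=worker)
t.start()
for i in range(3):                                # recorder keeps producing chunks
    chunks.put(f"chunk-{i}")
chunks.put(None)
t.join()
print(committed)
```

A single worker drains the queue in order, which is why a rapid conversation shows up as a backlog rather than lost audio.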
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary audio chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
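To answer the "what has been done" question programmatically, you can cross-reference the media folder against the database. The table and column names below (audio_transcriptions, file_path) are assumptions for illustration; inspect your own database to find the real schema. The demo runs against throwaway files rather than touching ~/.screenpipe:

```python
import sqlite3
import tempfile
from pathlib import Path

def split_done_wip(data_dir, db_path):
    """A file counts as 'done' once the database references it; otherwise WIP.

    NOTE: audio_transcriptions/file_path are hypothetical names, not
    ScreenPipe's verified schema.
    """
    con = sqlite3.connect(db_path)
    indexed = {row[0] for row in con.execute("SELECT file_path FROM audio_transcriptions")}
    done, wip = [], []
    for f in sorted(Path(data_dir).glob("*.mp4")):
        (done if str(f) in indexed else wip).append(f.name)
    return done, wip

# Demo against throwaway files and a throwaway database:
tmp = Path(tempfile.mkdtemp())
(tmp / "a.mp4").touch()
(tmp / "b.mp4").touch()
db = tmp / "db.sqlite"
con = sqlite3.connect(db)
con.execute("CREATE TABLE audio_transcriptions (file_path TEXT)")
con.execute("INSERT INTO audio_transcriptions VALUES (?)", (str(tmp / "a.mp4"),))
con.commit()
print(split_done_wip(tmp, db))  # a.mp4 is indexed ("done"); b.mp4 is not
```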
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Expand
Listen
Show thinking
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files you still get the transcript, but you lose the ability to listen to the context, tone, or nuance of the original conversation.
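That keyword-to-playback lookup boils down to a single query. The sketch below assumes an FTS5 index over the transcript text; the table and column names (audio_fts with file_path, timestamp, transcription) are illustrative, not ScreenPipe's actual schema.

```python
import sqlite3

def find_spoken(db_path: str, keyword: str) -> list[tuple]:
    """Return (file_path, timestamp, transcription) rows matching a keyword.

    Assumes a hypothetical FTS5 virtual table named `audio_fts`; inspect
    your real db.sqlite for the actual full-text index ScreenPipe builds.
    """
    con = sqlite3.connect(db_path)
    try:
        return con.execute(
            "SELECT file_path, timestamp, transcription "
            "FROM audio_fts WHERE audio_fts MATCH ? "
            "ORDER BY timestamp",
            (keyword,),
        ).fetchall()
    finally:
        con.close()
```

Each returned row pairs the media file with the moment the keyword was spoken, which is exactly what the UI needs to seek into the .mp4.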
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process the historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob-storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
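You can audit for exactly that failure mode before it bites: find database rows whose media file has already been deleted. As above, the audio_transcriptions table and file_path column are assumptions to check against your own schema.

```python
import sqlite3
from pathlib import Path

def find_orphaned(db_path: str) -> list[str]:
    """List file paths referenced by the database that no longer exist on disk.

    Assumes an `audio_transcriptions` table with a `file_path` column
    (verify against your own schema). Orphaned rows still carry
    searchable text, but playback for them will fail.
    """
    con = sqlite3.connect(db_path)
    try:
        referenced = {
            r[0]
            for r in con.execute("SELECT DISTINCT file_path FROM audio_transcriptions")
        }
    finally:
        con.close()
    return sorted(p for p in referenced if not Path(p).exists())
```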
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or once the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
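The cron job described above could look like the following. A minimal sketch under two assumptions: the NAS is mounted locally (any mount point you pass is yours to choose), and ScreenPipe follows symlinks when playing media back; test it on a few files before automating.

```python
import shutil
import time
from pathlib import Path

def archive_old_media(data_dir: str, archive_dir: str, max_age_days: int = 30) -> int:
    """Move media older than `max_age_days` to `archive_dir`, leaving a
    symlink behind so database file-path references keep resolving."""
    Path(archive_dir).mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - max_age_days * 86400
    moved = 0
    # Materialize the listing first: we mutate the tree while iterating.
    for src in list(Path(data_dir).rglob("*.mp4")):
        if src.is_symlink() or src.stat().st_mtime > cutoff:
            continue  # already archived, or still fresh
        dest = Path(archive_dir) / src.name
        shutil.move(str(src), dest)  # copy to the NAS, delete the local file
        src.symlink_to(dest)         # keep the original path resolvable
        moved += 1
    return moved
```

Called from cron (say, nightly), it keeps the local SSD lean while every path stored in db.sqlite continues to work through the symlink.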
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription: yes, you can configure it, but it helps to first understand how ScreenPipe handles language natively.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json, or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO 639-1 codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
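If you ever script that config change, it might look like the sketch below. Note the "language" key is an assumption, not a documented ScreenPipe setting: open your actual pipe.json and check which key the app reads before relying on this.

```python
import json
from pathlib import Path

# ISO 639-1 codes for the languages discussed above.
LANGS = {"slovak": "sk", "bulgarian": "bg", "english": "en"}

def set_transcription_language(config_path: str, language: str) -> dict:
    """Rewrite a JSON config file, forcing a transcription language.

    The "language" key here is hypothetical; inspect your real
    pipe.json to find the key ScreenPipe actually uses. Pass "auto"
    to restore auto-detection.
    """
    code = LANGS.get(language.lower(), language)
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config["language"] = code  # hypothetical key name
    path.write_text(json.dumps(config, indent=2))
    return config
```

The helper leaves every other setting in the file untouched, so it is safe to run repeatedly (e.g., a morning "Slovak day" toggle).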
|
NULL
|
NULL
|
NULL
|
NULL
|
|
39092
|
1441
|
82
|
2026-05-14T06:35:54.210048+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740554210_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
1. Where It Starts: The Capture Layer

The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:

- System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: everything you say into your mic.

Because ScreenPipe runs as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.

2. How It Gets Transferred and Transcribed

Because ScreenPipe prioritizes local processing, the "transfer" step is extremely short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.

Here is the transcription pipeline:

- The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure a cloud provider such as Deepgram for faster processing, but local Whisper is the standard.)
- Diarization: as it transcribes the text, the engine also performs "diarization", the technical term for speaker identification. It analyzes the audio to distinguish your voice from the voices of others, labeling who said what.
- Storage: the final transcribed text is then indexed into a local SQLite database. The database uses Full-Text Search (FTS5), which is what lets you instantly search for a phrase you heard in a meeting three weeks ago.
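The FTS5 indexing described above can be sketched with Python's built-in sqlite3 module. The table and column names here are illustrative, not ScreenPipe's actual schema:

```python
import sqlite3

# In-memory database standing in for ScreenPipe's local SQLite file.
con = sqlite3.connect(":memory:")
# An FTS5 virtual table indexes the transcript text for instant phrase search.
con.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp UNINDEXED)")
con.executemany(
    "INSERT INTO transcripts VALUES (?, ?)",
    [
        ("let's move the deadline to Friday", "2026-05-12T09:15:00"),
        ("the RAID array finished rebuilding overnight", "2026-05-12T10:02:00"),
    ],
)
# MATCH uses the full-text index instead of scanning every row.
rows = con.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?", ("deadline",)
).fetchall()
print(rows)
```

This is why a phrase search stays fast even after months of recordings: the FTS index is consulted rather than every stored transcript.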
3. The "Work in Progress" (WIP) Stage

There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:

- Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: once the model finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
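The buffer-then-transcribe flow above amounts to a classic producer/consumer queue. A minimal sketch, with a stub standing in for the real Whisper inference call:

```python
import queue
import threading

audio_chunks: "queue.Queue[bytes | None]" = queue.Queue()
transcripts: list[str] = []

def transcribe(chunk: bytes) -> str:
    # Stand-in for the real Whisper inference call.
    return f"transcript of {len(chunk)} bytes"

def worker() -> None:
    # Drains the queue; chunks wait here while the model catches up.
    while (chunk := audio_chunks.get()) is not None:
        transcripts.append(transcribe(chunk))

t = threading.Thread(target=worker)
t.start()
# The capture layer keeps producing chunks regardless of transcription speed.
for chunk in (b"\x00" * 1600, b"\x00" * 3200):
    audio_chunks.put(chunk)
audio_chunks.put(None)  # shutdown sentinel
t.join()
print(transcripts)
```

If chunks arrive faster than the worker drains them, they simply accumulate in the queue, which is exactly the backlog behaviour described above.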
4. Understanding the Folder Structure

All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux, or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:

- The SQLite database (db.sqlite): this is the master ledger. If text exists inside this database, the audio has been fully transcribed, diarized, and is "done."
- The data folder: this is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of it as the raw archive.
- Temp files: rapidly changing files, temporary chunks, or locked database journals (such as -wal or -journal files) indicate that the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.

Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
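The "done vs. in progress" heuristics above can be scripted. This sketch assumes a hypothetical layout with a db.sqlite, a data/ directory of media files, and leftover temp chunks; the file names and the *.chunk suffix are illustrative, not ScreenPipe's actual conventions (the demo builds a throwaway layout rather than touching a real install):

```python
import sqlite3
import tempfile
from pathlib import Path

# Build a throwaway ~/.screenpipe-style layout to run the check against.
root = Path(tempfile.mkdtemp())
(root / "data").mkdir()
(root / "data" / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()
(root / "data" / "mic_2026-05-12.tmp.chunk").touch()  # illustrative temp name
sqlite3.connect(root / "db.sqlite").close()
(root / "db.sqlite-wal").touch()  # a WAL journal implies active writes

def status(root: Path) -> dict[str, object]:
    data = root / "data"
    return {
        "archived_media": sorted(p.name for p in data.glob("*.mp4")),
        "pending_chunks": sorted(p.name for p in data.glob("*.chunk")),
        # A -wal/-journal file next to the database suggests work in progress.
        "db_busy": (root / "db.sqlite-wal").exists()
        or (root / "db.sqlite-journal").exists(),
    }

report = status(root)
print(report)
```

Run against a real directory, the same shape of check tells you at a glance what is archived, what is still queued, and whether the database is mid-write.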
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?

Gemini said:

Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.

1. The "Time Machine" Playback

The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text: it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
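The timestamp-plus-file-path lookup can be sketched like this; the audio_transcriptions columns used here (file_path, offset_s) are hypothetical, chosen only to illustrate the pattern:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical schema: each transcript row points at its source file
# and the offset (in seconds) where the speech occurs.
con.execute(
    "CREATE TABLE audio_transcriptions (text TEXT, file_path TEXT, offset_s REAL)"
)
con.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?, ?)",
    ("ship it on Friday", "System Audio (output)_2026-05-11_06-17-14.mp4", 847.5),
)

def playback_target(keyword: str) -> tuple[str, float]:
    # Find which archived file to open, and where to seek to.
    return con.execute(
        "SELECT file_path, offset_s FROM audio_transcriptions WHERE text LIKE ?",
        (f"%{keyword}%",),
    ).fetchone()

path, offset = playback_target("Friday")
# A player could now be launched at that offset, e.g. `ffplay -ss 847.5 <file>`.
print(path, offset)
```

The UI's "jump to the exact second" behaviour is this lookup plus a seek into the referenced media file.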
2. Re-processing and Model Upgrades

Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process your historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.

3. The Source of Truth for Hallucinations

Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise.
If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth for verifying what was actually said.

How the Database Views Them

If you open your db.sqlite and query the tables (e.g., audio_transcriptions), you will see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.

What happens if you delete them? If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text.
However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.

Managing the Storage Footprint

Because ScreenPipe records continuously, this folder will inevitably grow over time.

- Built-in garbage collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days, or once the folder hits a specific GB limit).
- Archiving: if you prefer to keep a permanent, searchable "life log" without eating up local SSD space, you don't have to let ScreenPipe delete anything. You could script a simple cron job to regularly move older files to your 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
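A sketch of that archiving step, assuming the move-then-symlink approach described above. The paths are illustrative (a real cron job would point at the NAS mount), and the demo runs against throwaway directories:

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

def archive_old_media(data_dir: Path, nas_dir: Path, max_age_days: float) -> list[str]:
    """Move old .mp4 files to the NAS and leave symlinks behind,
    so database file-path references keep resolving."""
    moved = []
    cutoff = time.time() - max_age_days * 86400
    for f in data_dir.glob("*.mp4"):
        if f.is_symlink() or f.stat().st_mtime > cutoff:
            continue  # already archived, or still fresh
        target = nas_dir / f.name
        shutil.move(f, target)
        f.symlink_to(target)  # playback through the old path still works
        moved.append(f.name)
    return sorted(moved)

# Demo: throwaway directories standing in for the local SSD and the NAS.
local = Path(tempfile.mkdtemp())
nas = Path(tempfile.mkdtemp())
old = local / "System Audio (output)_2026-05-11_06-17-14.mp4"
old.write_bytes(b"fake audio")
os.utime(old, (0, 0))  # backdate the file so it qualifies as "old"
moved = archive_old_media(local, nas, max_age_days=30)
print(moved, old.is_symlink())
```

Because the symlink keeps the original path valid, ScreenPipe's playback lookups continue to resolve even though the bytes now live on the NAS.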
You said:
Yes, I will do that. Is there a way to set up languages for transcription?

Gemini said:

That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS, if you end up building it) will keep your local SSD lean while keeping your database searchable.

As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.

How the Transcription Engine Handles Language

By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.

- The advantage: if you are watching a tutorial in English, and an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
- The drawback: auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation.
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.

How to Force a Specific Language

If you want maximum accuracy and you know you will primarily be recording in one language for a specific period, you can set the language explicitly. This also slightly speeds up processing, because the model skips the detection phase.

If you are using the Desktop App UI:

1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).

If you are running ScreenPipe via CLI/config: you can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json, or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes: sk (Slovak), bg (Bulgarian), en (English).

If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper Small or Base instead of Tiny) if your hardware has the headroom for it.
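That config edit could be scripted along these lines. Note that the "language" key is an assumption for illustration, not a documented ScreenPipe schema, and the demo writes to a temp file standing in for ~/.screenpipe/pipe.json rather than touching a real config:

```python
import json
import tempfile
from pathlib import Path

# ISO 639-1 codes for the languages discussed above.
LANGS = {"slovak": "sk", "bulgarian": "bg", "english": "en", "auto": None}

def set_transcription_language(config_path: Path, language: str) -> dict:
    # Read the existing JSON config, patch the (assumed) language key, write back.
    cfg = json.loads(config_path.read_text()) if config_path.exists() else {}
    cfg["language"] = LANGS[language]  # None means auto-detect
    config_path.write_text(json.dumps(cfg, indent=2))
    return cfg

demo = Path(tempfile.mkdtemp()) / "pipe.json"  # stand-in for the real config path
cfg = set_transcription_language(demo, "slovak")
print(cfg)
```

Restarting the daemon after such an edit would be required for any config change to take effect.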
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said:
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine. ScreenPipe is fundamentally designed to be a 100% local, privacy-first application: the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
- System Audio: everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
- Microphone Input: everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
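The chunking step just described can be sketched as a toy generator. The 30-second window and 16 kHz sample rate below are assumptions (typical Whisper input parameters), not confirmed ScreenPipe settings:

```python
# Toy sketch of the chunking stage: slice a continuous stream of audio
# samples into fixed-length windows before handing each to the transcriber.
# The 30-second window and 16 kHz rate are assumptions, not ScreenPipe values.
def chunk_samples(samples, sample_rate=16000, seconds=30):
    step = sample_rate * seconds  # samples per chunk
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


# 70 seconds of fake audio becomes two full chunks plus a 10-second tail.
chunks = list(chunk_samples(list(range(16000 * 70))))
```

The last chunk is simply shorter; a real pipeline would also cut on silence boundaries rather than blindly at fixed offsets.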
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
- The Engine: the raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
- Diarization: as it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
- Storage: the final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
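The FTS5 indexing described above can be demonstrated with Python's built-in sqlite3 module. The table and column names here are illustrative, not ScreenPipe's actual schema:

```python
# Minimal sketch of the FTS5 search pattern described in the text.
# Table and column names are hypothetical, not ScreenPipe's real schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(text, device, timestamp)"
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    ("let's move the standup to Thursday",
     "MacBook Pro Microphone", "2026-05-12T12:17:23"),
)
conn.execute(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    ("the RAID array finished rebuilding",
     "System Audio", "2026-05-11T06:17:14"),
)

# MATCH against the FTS5 table finds the phrase via the full-text index.
rows = conn.execute(
    "SELECT device, timestamp FROM transcripts WHERE transcripts MATCH 'standup'"
).fetchall()
```

Because the index is built at insert time, the MATCH query stays fast even across weeks of accumulated transcripts.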
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
- Buffering: audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
- Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
- Finalization: once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
- The SQLite database (db.sqlite): this is the master ledger. If text exists inside this database, the audio has been fully transcribed, diarized, and is "done."
- The ~/.screenpipe/data/ folder: this is where the compressed raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
- Temp files: if you see rapidly changing files, temporary chunks, or locked database journals, the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
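One concrete signal of in-flight database work: SQLite in WAL mode keeps -wal/-shm sidecar files next to the database while writes are pending, and rollback mode uses a -journal file. A minimal check (the helper name is ours, and this only detects uncommitted database activity, not the audio queue itself):

```python
# Sketch: detect SQLite work-in-progress by looking for the journal
# sidecar files that WAL/rollback modes leave next to the database file.
import os

def is_db_busy(db_path):
    """True if SQLite journal sidecars exist next to the database."""
    return any(
        os.path.exists(db_path + suffix)
        for suffix in ("-wal", "-shm", "-journal")
    )
```

Usage: point it at ~/.screenpipe/db.sqlite; if it keeps returning True for long stretches, the transcription backlog is still being flushed.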
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said:
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said:
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive. Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and start playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
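A quick illustration of why these references are only "soft" foreign keys: nothing in SQLite enforces that the referenced file still exists. The audio_transcriptions table and its columns below are assumed names for this sketch, not a verified ScreenPipe schema:

```python
# Sketch: find transcripts whose media file has been deleted, i.e.
# dangling soft foreign keys. Table/column names are assumptions,
# not ScreenPipe's verified schema.
import os
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_transcriptions (file_path TEXT, transcription TEXT)"
)
conn.execute(
    "INSERT INTO audio_transcriptions VALUES (?, ?)",
    ("/nonexistent-demo-dir/soundcore AeroClip (input)_2026-05-12_07-40-48.mp4",
     "hello from the meeting"),
)

# The text survives in the DB, but os.path.exists exposes the broken link.
dangling = [
    path
    for (path,) in conn.execute("SELECT file_path FROM audio_transcriptions")
    if not os.path.exists(path)
]
```

Running a sweep like this before an archiving job is a cheap way to see which transcripts have already lost their playback media.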
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
- Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
- Archiving: if you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
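A minimal sketch of that cron job, assuming the NAS is mounted as a local path. The 30-day threshold, directory names, and helper name are all illustrative choices, not anything ScreenPipe ships:

```python
# Sketch of the archive-and-symlink job: move media older than N days to
# the NAS mount, then leave a symlink behind so the file paths stored in
# db.sqlite keep resolving. Threshold and paths are illustrative only.
import os
import shutil
import time

def archive_old_media(src_dir, nas_dir, max_age_days=30):
    moved = []
    cutoff = time.time() - max_age_days * 86400
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        if os.path.islink(src) or not os.path.isfile(src):
            continue  # skip already-archived symlinks and subdirectories
        if os.path.getmtime(src) < cutoff:
            dst = os.path.join(nas_dir, name)
            shutil.move(src, dst)   # copy to NAS, then remove the local file
            os.symlink(dst, src)    # keep the original path valid
            moved.append(name)
    return moved
```

The islink guard matters: without it, a second run would follow the symlinks back to the NAS copies and try to move them again.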
You said:
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said:
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable. As for setting up languages for transcription: yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect. When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
- The Advantage: if you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
- The Drawback: auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
1. Open the ScreenPipe settings.
2. Navigate to the Audio or AI/Transcription section.
3. Look for the Whisper Language or Transcription Language dropdown.
4. Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config: you can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes: sk (Slovak), bg (Bulgarian), and en (English).
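As a sketch, assuming a JSON config at ~/.screenpipe/pipe.json (mentioned above) and a hypothetical "language" key, the change could be scripted like this. Check ScreenPipe's own docs for the real key name before relying on it:

```python
# Sketch: force a transcription language in a JSON config file.
# The "language" key is a hypothetical name, not a verified ScreenPipe key.
import json
from pathlib import Path

def set_transcription_language(config_path, iso_code):
    path = Path(config_path)
    # Preserve any existing settings; start fresh if the file is missing.
    config = json.loads(path.read_text()) if path.exists() else {}
    config["language"] = iso_code  # e.g., "sk", "bg", or "en"
    path.write_text(json.dumps(config, indent=2))
```

Reading the file first and rewriting the whole document keeps unrelated settings intact, which matters if the daemon also writes to the same file.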
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
6170238253643292814
|
9133637656415718295
|
app_switch
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
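The chunking scheme above can be illustrated with a small sketch. The filename pattern (`<device> (input|output)_YYYY-MM-DD_HH-MM-SS.mp4`) matches the files visible in `~/.screenpipe/data/`; the 30-second chunk length is an assumption inferred from the timestamps in the sync log later in this record.

```python
from datetime import datetime, timedelta

def chunk_filename(device: str, kind: str, start: datetime) -> str:
    """Build a chunk filename in the pattern ScreenPipe uses on disk."""
    return f"{device} ({kind})_{start.strftime('%Y-%m-%d_%H-%M-%S')}.mp4"

# Assumed 30-second chunks: generate the names a few consecutive chunks would get.
start = datetime(2026, 5, 12, 11, 31, 41)
names = [chunk_filename("MacBook Pro Microphone", "input",
                        start + timedelta(seconds=30 * i))
         for i in range(3)]
for n in names:
    print(n)
```

Each chunk is a standalone file named for its start time, which is why a continuous recording appears on disk as a long run of files spaced roughly 30 seconds apart.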
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
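The FTS5 indexing described above can be sketched in a few lines. The table and column names here (`transcripts`, `text`, `timestamp`) are illustrative assumptions, not ScreenPipe's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for ScreenPipe's local database
# Hypothetical schema: ScreenPipe's real table/column names may differ.
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(text, timestamp)")
conn.execute("INSERT INTO transcripts VALUES (?, ?)",
             ("we agreed to ship the NAS migration next sprint",
              "2026-04-23T10:15:00"))
conn.execute("INSERT INTO transcripts VALUES (?, ?)",
             ("lunch order for the team", "2026-04-23T12:01:00"))

# Full-text search: find the meeting phrase instantly, weeks later.
rows = conn.execute(
    "SELECT timestamp, text FROM transcripts WHERE transcripts MATCH ?",
    ("NAS migration",)).fetchall()
print(rows)  # only the matching row, with its timestamp
```

FTS5 tokenizes every inserted row, so a MATCH query touches an inverted index rather than scanning the full transcript text, which is what makes the lookup effectively instant.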
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
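The buffer → queue → commit flow above can be sketched as a producer/consumer loop. The `transcribe` function here is a placeholder for the local Whisper call, and the table name is an assumption; everything else mirrors the three steps:

```python
import queue
import sqlite3

def transcribe(chunk: bytes) -> str:
    """Placeholder for the local Whisper call on one audio chunk."""
    return f"transcript of {len(chunk)} bytes"

db = sqlite3.connect(":memory:")  # stand-in for the local ledger
db.execute("CREATE TABLE transcriptions (timestamp TEXT, text TEXT)")

# 1. Buffering: recorded chunks wait in an in-memory queue.
wip = queue.Queue()
for ts, chunk in [("2026-05-12T11:31:41", b"\x00" * 480000),
                  ("2026-05-12T11:32:11", b"\x00" * 480000)]:
    wip.put((ts, chunk))

# 2./3. Processing queue + finalization: drain the queue, commit each result
# alongside its timestamp.
while not wip.empty():
    ts, chunk = wip.get()
    db.execute("INSERT INTO transcriptions VALUES (?, ?)",
               (ts, transcribe(chunk)))
db.commit()

print(db.execute("SELECT COUNT(*) FROM transcriptions").fetchone()[0])  # 2
```

If chunks arrive faster than `transcribe` can drain them, the queue grows — that growing backlog is exactly the WIP stage described above.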
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The data folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
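Putting those signals together, a quick way to spot the WIP backlog is to compare media files on disk against rows in the database. The paths, table, and column names below are assumptions for illustration only; ScreenPipe's real schema may differ:

```python
import os
import sqlite3

# Assumed locations, following the layout described above.
DATA_DIR = os.path.expanduser("~/.screenpipe/data")      # raw archive
DB_PATH = os.path.expanduser("~/.screenpipe/db.sqlite")  # master ledger

def find_backlog(data_dir: str, db_path: str) -> list[str]:
    """Return audio files on disk with no transcription row yet
    (table and column names are hypothetical)."""
    conn = sqlite3.connect(db_path)
    done = {row[0] for row in conn.execute(
        "SELECT file_path FROM audio_transcriptions")}
    on_disk = []
    for root, _dirs, files in os.walk(data_dir):
        on_disk += [os.path.join(root, f) for f in files if f.endswith(".mp4")]
    # Anything on disk but absent from the ledger is still work in progress.
    return sorted(f for f in on_disk if f not in done)
```

An empty result means the engine has caught up; a long list means chunks are still queued for transcription.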
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
|
NULL
|
NULL
|
NULL
|
NULL
|
|
39087
|
1441
|
80
|
2026-05-14T06:35:40.168565+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740540168_m1.jpg...
|
iTerm2
|
ec2-user@ip-10-30-129-190:~
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
rsync MacBook Pro Microphone (input)_2026-05-12_11-31-41.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_11-32-11.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_11-32-41.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-33-10.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_11-33-40.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_11-34-10.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_11-34-40.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_11-35-10.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-35-40.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-36-09.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-36-39.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-37-09.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-37-39.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-38-09.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-38-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-39-09.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_11-39-39.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-40-09.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-40-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-41-09.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_11-41-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-42-09.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-42-38.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-43-08.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_11-43-38.mp4 → NAS ✓ 219K
rsync MacBook Pro Microphone (input)_2026-05-12_11-44-08.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_11-44-37.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-45-07.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_11-45-36.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_11-46-05.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_11-46-35.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-47-05.mp4 → NAS ✓ 220K
rsync MacBook Pro Microphone (input)_2026-05-12_11-47-34.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-48-03.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-48-33.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_11-49-03.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-49-33.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-50-03.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-50-33.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_11-51-02.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-51-32.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_11-52-01.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_11-52-31.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-53-01.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_11-53-30.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_11-54-00.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_11-54-30.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_11-55-00.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-55-29.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_11-55-59.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-56-29.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_11-56-59.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-57-29.mp4 → NAS ✓ 220K
rsync MacBook Pro Microphone (input)_2026-05-12_11-57-59.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-58-28.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_11-58-58.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_11-59-28.mp4 → NAS ✓ 218K
rsync MacBook Pro Microphone (input)_2026-05-12_11-59-58.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-00-27.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-00-57.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-01-27.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-01-57.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-02-27.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_12-02-57.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_12-03-27.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_12-03-57.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_12-04-27.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_12-04-57.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_12-05-26.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-05-56.mp4 → NAS ✓ 226K
rsync MacBook Pro Microphone (input)_2026-05-12_12-06-26.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_12-06-56.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-07-26.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_12-07-56.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-08-26.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-08-55.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_12-09-25.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_12-09-55.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-10-25.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-10-55.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-11-25.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_12-11-55.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-12-24.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-12-54.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-13-24.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_12-13-53.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-14-23.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-14-53.mp4 → NAS ✓ 220K
rsync MacBook Pro Microphone (input)_2026-05-12_12-15-23.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-15-53.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-16-23.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_12-16-53.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-17-53.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_12-18-22.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-18-52.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_12-19-22.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-19-52.mp4 → NAS ✓ 227K
rsync MacBook Pro Microphone (input)_2026-05-12_12-20-21.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-20-51.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-21-21.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-21-51.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_12-22-21.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_12-22-51.mp4 → NAS ✓ 242K
rsync MacBook Pro Microphone (input)_2026-05-12_12-59-11.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_12-59-41.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_13-00-11.mp4 → NAS ✓ 194K
rsync MacBook Pro Microphone (input)_2026-05-12_13-35-39.mp4 → NAS ✓ 208K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-08-11.mp4 → NAS ✓ 217K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-10-40.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-15-09.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-21-38.mp4 → NAS ✓ 221K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-22-38.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_17-21-56.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-12_17-21-56.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-12_17-22-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-22-28.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-12_17-22-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-22-58.mp4 → NAS ✓ 196K
rsync System Audio (output)_2026-05-12_17-23-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-23-28.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-12_17-23-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-23-58.mp4 → NAS ✓ 196K
rsync System Audio (output)_2026-05-12_17-24-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-24-28.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_17-24-58.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-12_17-24-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-25-28.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-12_17-25-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-25-58.mp4 → NAS ✓ 207K
rsync System Audio (output)_2026-05-12_17-25-58.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-12_17-26-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-26-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-26-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-26-58.mp4 → NAS ✓ 206K
rsync System Audio (output)_2026-05-12_17-27-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-27-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-27-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-27-58.mp4 → NAS ✓ 194K
rsync System Audio (output)_2026-05-12_17-28-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-28-28.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-28-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-28-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-29-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-29-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-29-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-29-58.mp4 → NAS ✓ 203K
rsync System Audio (output)_2026-05-12_17-30-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-30-28.mp4 → NAS ✓ 196K
rsync System Audio (output)_2026-05-12_17-30-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-30-58.mp4 → NAS ✓ 212K
rsync System Audio (output)_2026-05-12_17-31-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-31-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-31-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-31-58.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-12_17-32-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-32-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-32-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-32-58.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-33-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-33-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-33-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-33-58.mp4 → NAS ✓ 204K
rsync System Audio (output)_2026-05-12_17-34-28.mp4 → NAS ✓ 8.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-34-28.mp4 → NAS ✓ 207K
rsync System Audio (output)_2026-05-12_17-34-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-34-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-35-28.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-35-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-35-58.mp4 → NAS ✓ 8.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-35-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-36-28.mp4 → NAS ✓ 15K
rsync MacBook Pro Microphone (input)_2026-05-12_17-36-28.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-36-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-36-58.mp4 → NAS ✓ 213K
rsync System Audio (output)_2026-05-12_17-37-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-37-28.mp4 → NAS ✓ 209K
rsync System Audio (output)_2026-05-12_17-37-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-37-58.mp4 → NAS ✓ 206K
rsync System Audio (output)_2026-05-12_17-38-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-38-28.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-38-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-38-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-39-28.mp4 → NAS ✓ 12K
rsync MacBook Pro Microphone (input)_2026-05-12_17-39-28.mp4 → NAS ✓ 215K
rsync System Audio (output)_2026-05-12_17-39-58.mp4 → NAS ✓ 7.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-39-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-40-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-40-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-40-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-40-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-41-28.mp4 → NAS ✓ 9.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-41-28.mp4 → NAS ✓ 215K
rsync System Audio (output)_2026-05-12_17-41-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-41-58.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-42-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-42-28.mp4 → NAS ✓ 203K
rsync System Audio (output)_2026-05-12_17-42-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-42-58.mp4 → NAS ✓ 207K
rsync System Audio (output)_2026-05-12_17-43-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-43-28.mp4 → NAS ✓ 214K
rsync System Audio (output)_2026-05-12_17-43-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-43-58.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-44-28.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-44-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-44-58.mp4 → NAS ✓ 7.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-44-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-45-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-45-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-45-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-45-58.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-12_17-46-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-46-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-46-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-46-58.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-47-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-47-28.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-47-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-47-58.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-48-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-48-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-48-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-48-58.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-49-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-49-28.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-49-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-49-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-50-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-50-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-50-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-50-58.mp4 → NAS ✓ 204K
rsync System Audio (output)_2026-05-12_17-51-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-51-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-51-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-51-58.mp4 → NAS ✓ 204K
rsync System Audio (output)_2026-05-12_17-52-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-52-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-52-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-52-58.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-53-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-53-28.mp4 → NAS ✓ 213K
rsync System Audio (output)_2026-05-12_17-53-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-53-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-54-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-54-28.mp4 → NAS ✓ 219K
rsync System Audio (output)_2026-05-12_17-54-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-54-58.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-55-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-55-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-55-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-55-58.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-56-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-56-28.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_17-58-58.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_17-59-28.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-00-28.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-00-58.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-01-58.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_18-04-58.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_18-08-27.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-09-27.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_18-09-57.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-10-27.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-10-57.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_18-11-27.mp4 → NAS ✓ 218K
rsync MacBook Pro Microphone (input)_2026-05-12_18-11-57.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-12-57.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-14-57.mp4 → NAS ✓ 228K
rsync MacBook Pro Microphone (input)_2026-05-12_18-15-27.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-15-57.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_18-16-57.mp4 → NAS ✓ 221K
rsync MacBook Pro Microphone (input)_2026-05-12_18-19-27.mp4 → NAS ✓ 215K
rsync MacBook Pro Microphone (input)_2026-05-12_18-19-57.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-20-27.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-20-56.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_18-21-26.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_18-21-56.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-22-26.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_18-22-56.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-23-26.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_18-23-56.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-24-26.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-24-56.mp4 → NAS ✓ 217K
rsync MacBook Pro Microphone (input)_2026-05-12_18-25-56.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-26-26.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-26-56.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-28-26.mp4 → NAS ✓ 216K
rsync MacBook Pro Microphone (input)_2026-05-12_18-28-56.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-29-56.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-30-56.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-31-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-32-26.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-32-56.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_18-33-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-34-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-35-26.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-35-56.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-36-26.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_18-36-56.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-37-26.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_18-37-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-38-26.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-38-56.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-39-26.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-39-56.mp4 → NAS ✓ 195K
rsync MacBook Pro Microphone (input)_2026-05-12_18-40-26.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-40-56.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_18-41-26.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-42-26.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_18-43-26.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-44-26.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_18-45-26.mp4 → NAS ✓ 207K
audio files total: 1113 file(s), 145M
[+09m26s] ▶ Copying screenpipe logs for 2026-05-12
rsync logs → NAS ✓ 1 file(s), 288K
[2026-05-13 21:46:00] Archive DB size: 2.0G
[2026-05-13 21:46:00] Total time: 9m26s
[2026-05-13 21:46:00] Sync complete for 2026-05-12
[2026-05-13 21:46:00] ========================================
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:28:19] ========================================
[2026-05-14 09:28:19] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:28:19] ========================================
[+00m00s] ▶ Preflight checks
Source DB: OK (5.6G)
[2026-05-14 09:28:19] ERROR: NAS not mounted at /Volumes/screenpipe
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:28:31] ========================================
[2026-05-14 09:28:31] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:28:31] ========================================
[+00m00s] ▶ Preflight checks
Source DB: OK (5.6G)
NAS mount: OK /Volumes/screenpipe
Archive DB: exists (2.0G)
Data dir: OK (263 files, 541M)
[+00m04s] ▶ Counting source rows for 2026-05-13
frames: 9586
elements: 1272090
ui_events: 9151
ocr_text: 2829
meetings: 0
audio_chunks: 1295
audio_transcriptions: 102
[+00m06s] ▶ Initialising tables, indexes, FTS
creating tables ✓ 0m00s
creating indexes ✓ 0m00s
creating FTS tables ✓ 0m00s
[+00m06s] ▶ Syncing vision data for 2026-05-13
video_chunks ✓ 0m01s
frames (9586 rows) ✓ 2m16s
ocr_text (2829 rows) ✓ 1m22s
ui_events (9151 rows) ✓ 0m01s
elements (1272090 rows) ✓ 1m26s
meetings (0 rows) ⠋ Runtime error near line 2: database is locked (5)
Parse error near line 3: no such table: nas.meetings
Runtime error near line 5: no such database: nas
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:34:11] ========================================
[2026-05-14 09:34:11] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:34:11] ========================================
[+00m00s] ▶ Preflight checks
Source DB: OK (5.6G)
NAS mount: OK /Volumes/screenpipe
[2026-05-14 09:34:11] Date 2026-05-13 already has 9586 frames in archive — skipping DB sync
Data dir: OK (263 files, 541M)
[+00m00s] ▶ Copying data folder for 2026-05-13
rsync 2026-05-13/ → NAS ✓ 0m34s (263 files, 524M)
[+00m34s] ▶ Copying audio files for 2026-05-13
rsync System Audio (output)_2026-05-13_06-16-53.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-16-53.mp4 → NAS ✓ 188K
rsync System Audio (output)_2026-05-13_06-17-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-17-25.mp4 → NAS ✓ 190K
rsync System Audio (output)_2026-05-13_06-17-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-17-55.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-18-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-18-25.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-18-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-18-55.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-19-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-19-25.mp4 → NAS ✓ 197K
rsync System Audio (output)_2026-05-13_06-19-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-19-55.mp4 → NAS ✓ 194K
rsync System Audio (output)_2026-05-13_06-20-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-20-25.mp4 → NAS ✓ 195K
rsync MacBook Pro Microphone (input)_2026-05-13_06-20-55.mp4 → NAS ✓ 187K
rsync System Audio (output)_2026-05-13_06-20-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-21-25.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-13_06-21-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-21-55.mp4 → NAS ✓ 190K
rsync System Audio (output)_2026-05-13_06-21-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-22-25.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-13_06-22-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-22-55.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-22-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-23-25.mp4 → NAS ✓ 198K
rsync System Audio (output)_2026-05-13_06-23-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-23-55.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-13_06-23-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-24-25.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-13_06-24-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-24-55.mp4 → NAS ✓ 209K
rsync System Audio (output)_2026-05-13_06-24-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-25-25.mp4 → NAS ✓ 203K
rsync System Audio (output)_2026-05-13_06-25-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-25-55.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-13_06-25-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-26-25.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-13_06-26-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-26-55.mp4 → NAS ✓ 198K
rsync System Audio (output)_2026-05-13_06-26-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-27-25.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-13_06-27-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-27-55.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-13_06-27-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-28-25.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-13_06-28-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-28-55.mp4 → NAS ✓ 213K
rsync System Audio (output)_2026-05-13_06-28-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-29-25.mp4 → NAS ✓ 233K
rsync System Audio (output)_2026-05-13_06-29-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-29-55.mp4 → NAS ✓ 215K
rsync System Audio (output)_2026-05-13_06-29-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-30-24.mp4 → NAS ✓ 197K
rsync System Audio (output)_2026-05-13_06-30-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-30-54.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-13_06-30-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-31-24.mp4 → NAS ✓ 198K
rsync System Audio (output)_2026-05-13_06-31-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-31-54.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-31-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-32-24.mp4 → NAS ✓ 197K
rsync System Audio (output)_2026-05-13_06-32-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-32-54.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-13_06-32-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-33-24.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-13_06-33-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-33-54.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-13_06-33-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-34-24.mp4 → NAS ✓ 188K
rsync System Audio (output)_2026-05-13_06-34-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-34-54.mp4 → NAS ✓ 184K
rsync System Audio (output)_2026-05-13_06-34-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-35-24.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-35-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-35-54.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-13_06-35-54.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-13_06-36-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-36-24.mp4 → NAS ✓ 189K
rsync MacBook Pro Microphone (input)_2026-05-13_06-36-54.mp4 → NAS ✓ 120K
rsync MacBook Pro Microphone (input)_2026-05-13_06-37-12.mp4 → NAS ✓ 16K
rsync System Audio (output)_2026-05-13_06-36-54.mp4 → NAS ✓ 24K
rsync soundcore AeroClip (input)_2026-05-13_06-37-12.mp4 → NAS ✓ 103K
rsync System Audio (output)_2026-05-13_06-37-23.mp4 → NAS ✓ 191K
rsync soundcore AeroClip (input)_2026-05-13_06-37-44.mp4 → NAS ✓ 64K
rsync System Audio (output)_2026-05-13_06-37-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-38-14.mp4 → NAS ✓ 86K
rsync System Audio (output)_2026-05-13_06-38-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-38-44.mp4 → NAS ✓ 23K
rsync System Audio (output)_2026-05-13_06-38-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-39-14.mp4 → NAS ✓ 86K
rsync System Audio (output)_2026-05-13_06-39-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-39-44.mp4 → NAS ✓ 82K
rsync System Audio (output)_2026-05-13_06-39-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-40-14.mp4 → NAS ✓ 68K
rsync System Audio (output)_2026-05-13_06-40-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-40-44.mp4 → NAS ✓ 68K
rsync System Audio (output)_2026-05-13_06-40-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-41-14.mp4 → NAS ✓ 171K
rsync System Audio (output)_2026-05-13_06-41-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-41-44.mp4 → NAS ✓ 67K
rsync System Audio (output)_2026-05-13_06-41-53.mp4 → NAS ✓ 16K
rsync soundcore AeroClip (input)_2026-05-13_06-42-14.mp4 → NAS ✓ 14K
rsync System Audio (output)_2026-05-13_06-42-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-42-44.mp4 → NAS ✓ 19K
rsync System Audio (output)_2026-05-13_06-42-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-43-14.mp4 → NAS ✓ 93K
rsync System Audio (output)_2026-05-13_06-43-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-43-44.mp4 → NAS ✓ 157K
rsync System Audio (output)_2026-05-13_06-43-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-44-14.mp4 → NAS ✓ 128K
rsync System Audio (output)_2026-05-13_06-44-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-44-44.mp4 → NAS ✓ 134K
rsync System Audio (output)_2026-05-13_06-44-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-45-14.mp4 → NAS ✓ 113K
rsync System Audio (output)_2026-05-13_06-45-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-45-44.mp4 → NAS ✓ 113K
rsync System Audio (output)_2026-05-13_06-45-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-46-14.mp4 → NAS ✓ 148K
rsync System Audio (output)_2026-05-13_06-46-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-46-44.mp4 → NAS ✓ 75K
rsync System Audio (output)_2026-05-13_06-46-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-47-14.mp4 → NAS ✓ 10K
rsync System Audio (output)_2026-05-13_06-47-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-47-44.mp4 → NAS ✓ 31K
rsync System Audio (output)_2026-05-13_06-47-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-48-14.mp4 → NAS ✓ 16K
rsync System Audio (output)_2026-05-13_06-48-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-48-44.mp4 → NAS ✓ 24K
rsync System Audio (output)_2026-05-13_06-48-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-49-14.mp4 → NAS ✓ 11K
rsync System Audio (output)_2026-05-13_06-49-22.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-13_06-49-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-49-51.mp4 → NAS ✓ 10K
rsync System Audio (output)_2026-05-13_06-50-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-50-23.mp4 → NAS ✓ 69K
rsync System Audio (output)_2026-05-13_06-50-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-50-53.mp4 → NAS ✓ 67K
rsync System Audio (output)_2026-05-13_06-51-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-51-23.mp4 → NAS ✓ 23K
rsync System Audio (output)_2026-05-13_06-51-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-51-54.mp4 → NAS ✓ 63K
rsync MacBook Pro Microphone (input)_2026-05-13_06-52-42.mp4 → NAS ✓ 27K
rsync MacBook Pro Microphone (input)_2026-05-13_06-52-48.mp4 → NAS ✓ 15K
rsync System Audio (output)_2026-05-13_06-52-22.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-13_06-52-52.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-52-58.mp4 → NAS ✓ 183K
rsync System Audio (output)_2026-05-13_06-53-22.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-53-30.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-53-52.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-54-00.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-54-22.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-54-30.mp4 → NAS ✓ 187K
rsync System Audio (output)_2026-05-13_06-54-52.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-55-00.mp4 → NAS ✓ 188K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-00-33.mp4 → NAS ✓ 215K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-01-03.mp4 → NAS ✓ 212K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-01-33.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-02-03.mp4 → NAS ✓ 220K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-02-33.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-03-03.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-03-33.mp4 → NAS ✓ 199K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-04-03.mp4 → NAS ✓ 201K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-04-33.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-05-03.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-05-33.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-06-03.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-06-33.mp4 → NAS ✓ 194K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-07-03.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-07-33.mp4 → NAS ✓ 200K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-08-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-08-33.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-09-03.mp4 → NAS ✓ 216K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-09-33.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-10-03.mp4 → NAS ✓ 194K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-10-33.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-11-03.mp4 → NAS ✓ 200K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-11-33.mp4 → NAS ✓ 237K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-12-03.mp4 → NAS ✓ 227K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-12-33.mp4 → NAS ✓ 225K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-13-03.mp4 → NAS ✓ 217K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-13-33.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-14-03.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-14-33.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-15-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-15-33.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-16-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-16-33.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-17-03.mp4 → NAS ✓ 199K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-17-33.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-18-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-18-33.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-19-03.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-19-33.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-20-03.mp4 → NAS ✓ 200K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-20-33.mp4 → NAS ✓ 195K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-21-03.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-21-33.mp4 → NAS ✓ 201K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-22-03.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-25-32.mp4 → NAS ✓ 220K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-27-32.mp4 → NAS ✓ 239K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-28-02.mp4 → NAS ✓ 212K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-28-32.mp4 → NAS ✓ 213K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-29-02.mp4 → NAS ✓ 227K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-29-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-30-02.mp4 → NAS ✓ 197K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-30-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-31-02.mp4 → NAS ✓ 213K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-31-32.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-32-02.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-32-32.mp4 → NAS ✓ 211K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-33-02.mp4 → NAS ✓ 206K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-33-32.mp4 → NAS ✓ 211K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-34-02.mp4 → NAS ✓ 208K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-34-32.mp4 → NAS ✓ 210K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-35-02.mp4 → NAS ✓ 208K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-35-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-36-02.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-36-32.mp4 → NAS ✓ 216K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-37-02.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-37-32.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-38-02.mp4 → NAS ✓ 206K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-38-32.mp4 → NAS ✓ 226K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-39-02.mp4 → NAS ✓ 206K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-39-32.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-40-02.mp4 → NAS ✓ 212K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-40-32.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-41-02.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-41-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-42-02.mp4 → NAS ✓ 201K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-42-32.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-43-02.mp4 → NAS ✓ 210K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-53-01.mp4 → NAS ✓ 223K
rsync MacBook Pro Microphone (input)_2026-05-13_07-53-37.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-13_07-54-09.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-13_07-54-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-13_07-55-09.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-13_07-55-39.mp4 → NAS ✓ 202K
...
✓ 5.0K\n rsync System Audio (output)_2026-05-12_17-26-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-26-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-26-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-26-58.mp4 → NAS ✓ 206K\n rsync System Audio (output)_2026-05-12_17-27-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-27-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-27-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-27-58.mp4 → NAS ✓ 194K\n rsync System Audio (output)_2026-05-12_17-28-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-28-28.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-28-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-28-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-29-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-29-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-29-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-29-58.mp4 → NAS ✓ 203K\n rsync System Audio (output)_2026-05-12_17-30-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-30-28.mp4 → NAS ✓ 196K\n rsync System Audio (output)_2026-05-12_17-30-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-30-58.mp4 → NAS ✓ 212K\n rsync System Audio (output)_2026-05-12_17-31-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-31-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-31-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-31-58.mp4 → NAS ✓ 199K\n rsync System Audio (output)_2026-05-12_17-32-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-32-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-32-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-32-58.mp4 → 
NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-33-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-33-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-33-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-33-58.mp4 → NAS ✓ 204K\n rsync System Audio (output)_2026-05-12_17-34-28.mp4 → NAS ✓ 8.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-34-28.mp4 → NAS ✓ 207K\n rsync System Audio (output)_2026-05-12_17-34-58.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-34-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-35-28.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-35-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-35-58.mp4 → NAS ✓ 8.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-35-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-36-28.mp4 → NAS ✓ 15K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-36-28.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-36-58.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-36-58.mp4 → NAS ✓ 213K\n rsync System Audio (output)_2026-05-12_17-37-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-37-28.mp4 → NAS ✓ 209K\n rsync System Audio (output)_2026-05-12_17-37-58.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-37-58.mp4 → NAS ✓ 206K\n rsync System Audio (output)_2026-05-12_17-38-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-38-28.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-38-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-38-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-39-28.mp4 → NAS ✓ 12K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-39-28.mp4 → NAS ✓ 215K\n rsync System Audio (output)_2026-05-12_17-39-58.mp4 → NAS ✓ 7.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-39-58.mp4 → 
NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-40-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-40-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-40-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-40-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-41-28.mp4 → NAS ✓ 9.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-41-28.mp4 → NAS ✓ 215K\n rsync System Audio (output)_2026-05-12_17-41-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-41-58.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-42-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-42-28.mp4 → NAS ✓ 203K\n rsync System Audio (output)_2026-05-12_17-42-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-42-58.mp4 → NAS ✓ 207K\n rsync System Audio (output)_2026-05-12_17-43-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-43-28.mp4 → NAS ✓ 214K\n rsync System Audio (output)_2026-05-12_17-43-58.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-43-58.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-44-28.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-44-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-44-58.mp4 → NAS ✓ 7.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-44-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-45-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-45-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-45-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-45-58.mp4 → NAS ✓ 200K\n rsync System Audio (output)_2026-05-12_17-46-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-46-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-46-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-46-58.mp4 
→ NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-47-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-47-28.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-47-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-47-58.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-48-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-48-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-48-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-48-58.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-49-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-49-28.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-49-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-49-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-50-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-50-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-50-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-50-58.mp4 → NAS ✓ 204K\n rsync System Audio (output)_2026-05-12_17-51-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-51-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-51-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-51-58.mp4 → NAS ✓ 204K\n rsync System Audio (output)_2026-05-12_17-52-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-52-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-52-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-52-58.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-53-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-53-28.mp4 → NAS ✓ 213K\n rsync System Audio (output)_2026-05-12_17-53-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_17-53-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-54-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-54-28.mp4 → NAS ✓ 219K\n rsync System Audio (output)_2026-05-12_17-54-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-54-58.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-55-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-55-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-55-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-55-58.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-56-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-56-28.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-58-58.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-59-28.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-00-28.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-00-58.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-01-58.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-04-58.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-08-27.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-09-27.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-09-57.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-10-27.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-10-57.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-11-27.mp4 → NAS ✓ 218K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-11-57.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-12-57.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-14-57.mp4 → NAS ✓ 228K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-15-27.mp4 → NAS ✓ 207K\n rsync MacBook Pro 
Microphone (input)_2026-05-12_18-15-57.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-16-57.mp4 → NAS ✓ 221K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-19-27.mp4 → NAS ✓ 215K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-19-57.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-20-27.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-20-56.mp4 → NAS ✓ 211K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-21-26.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-21-56.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-22-26.mp4 → NAS ✓ 209K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-22-56.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-23-26.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-23-56.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-24-26.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-24-56.mp4 → NAS ✓ 217K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-25-56.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-26-26.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-26-56.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-28-26.mp4 → NAS ✓ 216K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-28-56.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-29-56.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-30-56.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-31-56.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-32-26.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-32-56.mp4 → NAS ✓ 201K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-33-56.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-34-56.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_18-35-26.mp4 → NAS ✓ 198K
 [… remaining MacBook Pro Microphone rsync lines for 2026-05-12 18:35–18:44 elided (~195K–210K each) …]
 rsync MacBook Pro Microphone (input)_2026-05-12_18-45-26.mp4 → NAS ✓ 207K
 audio files total: 1113 file(s), 145M

[+09m26s] ▶ Copying screenpipe logs for 2026-05-12
 rsync logs → NAS ✓ 1 file(s), 288K

[2026-05-13 21:46:00] Archive DB size: 2.0G
[2026-05-13 21:46:00] Total time: 9m26s
[2026-05-13 21:46:00] Sync complete for 2026-05-12
[2026-05-13 21:46:00] ========================================
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:28:19] ========================================
[2026-05-14 09:28:19] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:28:19] ========================================

[+00m00s] ▶ Preflight checks
 Source DB: OK (5.6G)
[2026-05-14 09:28:19] ERROR: NAS not mounted at /Volumes/screenpipe
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:28:31] ========================================
[2026-05-14 09:28:31] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:28:31] ========================================

[+00m00s] ▶ Preflight checks
 Source DB: OK (5.6G)
 NAS mount: OK /Volumes/screenpipe
 Archive DB: exists (2.0G)
 Data dir: OK (263 files, 541M)

[+00m04s] ▶ Counting source rows for 2026-05-13
 frames: 9586
 elements: 1272090
 ui_events: 9151
 ocr_text: 2829
 meetings: 0
 audio_chunks: 1295
 audio_transcriptions: 102

[+00m06s] ▶ Initialising tables, indexes, FTS
 creating tables ✓ 0m00s
 creating indexes ✓ 0m00s
 creating FTS tables ✓ 0m00s

[+00m06s] ▶ Syncing vision data for 2026-05-13
 video_chunks ✓ 0m01s
 frames (9586 rows) ✓ 2m16s
 ocr_text (2829 rows) ✓ 1m22s
 ui_events (9151 rows) ✓ 0m01s
 elements (1272090 rows) ✓ 1m26s
 meetings (0 rows) ⠋ Runtime error near line 2: database is locked (5)
Parse error near line 3: no such table: nas.meetings
Runtime error near line 5: no such database: nas
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:34:11] ========================================
[2026-05-14 09:34:11] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:34:11] ========================================

[+00m00s] ▶ Preflight checks
 Source DB: OK (5.6G)
 NAS mount: OK /Volumes/screenpipe
[2026-05-14 09:34:11] Date 2026-05-13 already has 9586 frames in archive — skipping DB sync
 Data dir: OK (263 files, 541M)

[+00m00s] ▶ Copying data folder for 2026-05-13
 rsync 2026-05-13/ → NAS ✓ 0m34s (263 files, 524M)

[+00m34s] ▶ Copying audio files for 2026-05-13
 rsync System Audio (output)_2026-05-13_06-16-53.mp4 → NAS ✓ 5.0K
 rsync MacBook Pro Microphone (input)_2026-05-13_06-16-53.mp4 → NAS ✓ 188K
 rsync System Audio 
(output)_2026-05-13_06-17-25.mp4 → NAS ✓ 5.0K
 rsync MacBook Pro Microphone (input)_2026-05-13_06-17-25.mp4 → NAS ✓ 190K
 [… per-file rsync lines elided: System Audio outputs plus MacBook Pro Microphone, soundcore AeroClip and LakyLak bose qc35 II inputs (5.0K–239K each), one file roughly every 30 s with gaps, 2026-05-13 06:17–08:01 …]
 rsync soundcore AeroClip (input)_2026-05-13_08-01-50.mp4 → NAS ✓ 26K
 rsync soundcore AeroClip 
(input)_2026-05-13_08-02-20.mp4 → NAS ✓ 25K\n rsync soundcore AeroClip (input)_2026-05-13_08-02-50.mp4 → NAS ✓ 33K\n rsync soundcore AeroClip (input)_2026-05-13_08-03-20.mp4 → NAS ✓ 77K\n rsync soundcore AeroClip (input)_2026-05-13_08-03-50.mp4 → NAS ✓ 116K\n rsync soundcore AeroClip (input)_2026-05-13_08-04-20.mp4 → NAS ✓ 119K\n rsync soundcore AeroClip (input)_2026-05-13_08-04-50.mp4 → NAS ✓ 169K\n rsync soundcore AeroClip (input)_2026-05-13_08-05-20.mp4 → NAS ✓ 103K\n rsync soundcore AeroClip (input)_2026-05-13_08-05-50.mp4 → NAS ✓ 41K\n rsync soundcore AeroClip (input)_2026-05-13_08-06-20.mp4 → NAS ✓ 34K\n rsync soundcore AeroClip (input)_2026-05-13_08-06-50.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_08-07-20.mp4 → NAS ✓ 29K\n rsync soundcore AeroClip (input)_2026-05-13_08-07-50.mp4 → NAS ✓ 29K\n rsync soundcore AeroClip (input)_2026-05-13_08-08-20.mp4 → NAS ✓ 41K\n rsync soundcore AeroClip (input)_2026-05-13_08-08-50.mp4 → NAS ✓ 19K\n rsync soundcore AeroClip (input)_2026-05-13_08-09-20.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_08-09-50.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_08-10-20.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-10-58.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-11-30.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-12-00.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_08-12-30.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-13-00.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_08-13-30.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_08-14-00.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_08-14-30.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_08-15-00.mp4 → NAS ✓ 19K\n rsync soundcore AeroClip (input)_2026-05-13_08-15-30.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_08-16-00.mp4 → NAS ✓ 30K\n rsync soundcore AeroClip 
(input)_2026-05-13_08-16-30.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_08-17-00.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-17-30.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_08-18-00.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-18-30.mp4 → NAS ✓ 32K\n rsync soundcore AeroClip (input)_2026-05-13_08-19-00.mp4 → NAS ✓ 30K\n rsync soundcore AeroClip (input)_2026-05-13_08-19-30.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_08-20-00.mp4 → NAS ✓ 41K\n rsync soundcore AeroClip (input)_2026-05-13_08-20-30.mp4 → NAS ✓ 35K\n rsync soundcore AeroClip (input)_2026-05-13_08-21-00.mp4 → NAS ✓ 29K\n rsync soundcore AeroClip (input)_2026-05-13_08-21-30.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_08-22-00.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-22-30.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_08-23-00.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_08-23-30.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_08-24-00.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-24-30.mp4 → NAS ✓ 33K\n rsync soundcore AeroClip (input)_2026-05-13_08-25-00.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_08-25-30.mp4 → NAS ✓ 24K\n rsync soundcore AeroClip (input)_2026-05-13_08-26-00.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_08-26-29.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-26-59.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_08-27-29.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_08-27-59.mp4 → NAS ✓ 25K\n rsync soundcore AeroClip (input)_2026-05-13_08-28-29.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-28-59.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-29-29.mp4 → NAS ✓ 27K\n rsync soundcore AeroClip (input)_2026-05-13_08-29-59.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip 
(input)_2026-05-13_08-30-29.mp4 → NAS ✓ 38K\n rsync soundcore AeroClip (input)_2026-05-13_08-30-59.mp4 → NAS ✓ 35K\n rsync soundcore AeroClip (input)_2026-05-13_08-31-29.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_08-31-59.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_08-32-29.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_08-32-59.mp4 → NAS ✓ 22K\n rsync soundcore AeroClip (input)_2026-05-13_08-33-29.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_08-33-59.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_08-34-29.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-34-59.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_08-35-29.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_08-35-59.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_08-36-29.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-36-59.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_08-37-53.mp4 → NAS ✓ 44K\n rsync soundcore AeroClip (input)_2026-05-13_08-38-25.mp4 → NAS ✓ 19K\n rsync soundcore AeroClip (input)_2026-05-13_08-38-55.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_08-39-25.mp4 → NAS ✓ 28K\n rsync soundcore AeroClip (input)_2026-05-13_08-39-55.mp4 → NAS ✓ 18K\n rsync soundcore AeroClip (input)_2026-05-13_08-40-25.mp4 → NAS ✓ 27K\n rsync soundcore AeroClip (input)_2026-05-13_08-40-55.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-41-25.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-41-55.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-42-25.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-42-55.mp4 → NAS ✓ 26K\n rsync soundcore AeroClip (input)_2026-05-13_08-43-25.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_08-43-55.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_08-44-25.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip 
(input)_2026-05-13_08-44-55.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-45-25.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-45-55.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-46-25.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_08-46-55.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-47-25.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_08-47-55.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-48-25.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-48-55.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-49-25.mp4 → NAS ✓ 26K\n rsync soundcore AeroClip (input)_2026-05-13_08-49-55.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_08-50-25.mp4 → NAS ✓ 25K\n rsync soundcore AeroClip (input)_2026-05-13_08-50-55.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-51-25.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_08-51-55.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_08-52-25.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_08-52-55.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_08-53-25.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_08-53-55.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_08-54-25.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_08-54-55.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_08-55-25.mp4 → NAS ✓ 34K\n rsync soundcore AeroClip (input)_2026-05-13_08-55-55.mp4 → NAS ✓ 18K\n rsync soundcore AeroClip (input)_2026-05-13_08-56-25.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_08-56-55.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_08-57-25.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_08-57-55.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_08-58-25.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip 
(input)_2026-05-13_08-58-55.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_08-59-25.mp4 → NAS ✓ 24K\n rsync soundcore AeroClip (input)_2026-05-13_08-59-55.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_09-00-25.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_09-00-55.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_09-01-25.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-01-55.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_09-02-25.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-02-55.mp4 → NAS ✓ 26K\n rsync soundcore AeroClip (input)_2026-05-13_09-03-25.mp4 → NAS ✓ 26K\n rsync soundcore AeroClip (input)_2026-05-13_09-03-55.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_09-04-25.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_09-04-55.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_09-05-25.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_09-05-55.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_09-06-25.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_09-07-06.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-07-39.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_09-08-08.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_09-08-38.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_09-09-08.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_09-09-38.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_09-10-08.mp4 → NAS ✓ 24K\n rsync soundcore AeroClip (input)_2026-05-13_09-10-38.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_09-11-08.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_09-11-38.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_09-12-08.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_09-12-38.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip 
(input)_2026-05-13_09-13-08.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_09-13-38.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_09-14-08.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-14-38.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_09-15-08.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-15-38.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_09-16-08.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_09-16-38.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_09-17-08.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_09-17-38.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_09-18-08.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_09-18-56.mp4 → NAS ✓ 19K\n rsync soundcore AeroClip (input)_2026-05-13_09-19-28.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_09-19-58.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_09-20-28.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_09-21-21.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_09-21-53.mp4 → NAS ✓ 29K\n rsync soundcore AeroClip (input)_2026-05-13_09-22-23.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-22-53.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-23-23.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_09-23-53.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-24-23.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-24-53.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-25-23.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_09-25-53.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_09-26-23.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_09-26-53.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-27-23.mp4 → NAS ✓ 52K\n rsync soundcore AeroClip 
(input)_2026-05-13_09-27-53.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_09-28-23.mp4 → NAS ✓ 19K\n rsync soundcore AeroClip (input)_2026-05-13_09-28-53.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-29-23.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_09-29-53.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_09-30-23.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-30-53.mp4 → NAS ✓ 44K\n rsync soundcore AeroClip (input)_2026-05-13_09-31-23.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_09-31-53.mp4 → NAS ✓ 22K\n rsync soundcore AeroClip (input)_2026-05-13_09-32-23.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_09-32-53.mp4 → NAS ✓ 18K\n rsync soundcore AeroClip (input)_2026-05-13_09-33-23.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-33-53.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-34-23.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_09-34-53.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-35-23.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_09-35-53.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_09-36-23.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_09-36-53.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_09-37-23.mp4 → NAS ✓ 27K\n rsync soundcore AeroClip (input)_2026-05-13_09-37-53.mp4 → NAS ✓ 19K\n rsync soundcore AeroClip (input)_2026-05-13_09-38-23.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_09-38-53.mp4 → NAS ✓ 18K\n rsync soundcore AeroClip (input)_2026-05-13_09-39-22.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_09-39-52.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_09-40-36.mp4 → NAS ✓ 25K\n rsync soundcore AeroClip (input)_2026-05-13_09-41-08.mp4 → NAS ✓ 63K\n rsync soundcore AeroClip (input)_2026-05-13_09-41-38.mp4 → NAS ✓ 37K\n rsync soundcore AeroClip 
(input)_2026-05-13_09-42-30.mp4 → NAS ✓ 64K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-45-33.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-49-33.mp4 → NAS ✓ 196K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-50-03.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-50-33.mp4 → NAS ✓ 210K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-51-02.mp4 → NAS ✓ 197K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-51-32.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-52-02.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-52-32.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-53-02.mp4 → NAS ✓ 194K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-53-32.mp4 → NAS ✓ 189K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-54-02.mp4 → NAS ✓ 191K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-54-32.mp4 → NAS ✓ 196K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-55-02.mp4 → NAS ✓ 196K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-55-32.mp4 → NAS ✓ 197K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-56-02.mp4 → NAS ✓ 193K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-56-32.mp4 → NAS ✓ 187K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-57-02.mp4 → NAS ✓ 189K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-57-31.mp4 → NAS ✓ 194K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-58-01.mp4 → NAS ✓ 192K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-58-31.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-59-01.mp4 → NAS ✓ 195K\n rsync MacBook Pro Microphone (input)_2026-05-13_09-59-31.mp4 → NAS ✓ 193K\n rsync MacBook Pro Microphone (input)_2026-05-13_10-00-01.mp4 → NAS ✓ 167K\n rsync MacBook Pro Microphone (input)_2026-05-13_10-00-25.mp4 → NAS ✓ 18K\n rsync soundcore AeroClip (input)_2026-05-13_10-00-42.mp4 → NAS ✓ 22K\n rsync soundcore AeroClip (input)_2026-05-13_10-01-14.mp4 → 
NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_10-01-44.mp4 → NAS ✓ 50K\n rsync soundcore AeroClip (input)_2026-05-13_10-02-14.mp4 → NAS ✓ 66K\n rsync soundcore AeroClip (input)_2026-05-13_10-02-44.mp4 → NAS ✓ 54K\n rsync soundcore AeroClip (input)_2026-05-13_10-03-14.mp4 → NAS ✓ 26K\n rsync soundcore AeroClip (input)_2026-05-13_10-03-43.mp4 → NAS ✓ 48K\n rsync soundcore AeroClip (input)_2026-05-13_10-04-13.mp4 → NAS ✓ 25K\n rsync soundcore AeroClip (input)_2026-05-13_10-04-43.mp4 → NAS ✓ 24K\n rsync soundcore AeroClip (input)_2026-05-13_10-05-13.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_10-05-43.mp4 → NAS ✓ 32K\n rsync soundcore AeroClip (input)_2026-05-13_10-06-13.mp4 → NAS ✓ 19K\n rsync soundcore AeroClip (input)_2026-05-13_10-06-43.mp4 → NAS ✓ 18K\n rsync soundcore AeroClip (input)_2026-05-13_10-07-13.mp4 → NAS ✓ 31K\n rsync soundcore AeroClip (input)_2026-05-13_10-07-43.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-08-13.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-08-43.mp4 → NAS ✓ 27K\n rsync soundcore AeroClip (input)_2026-05-13_10-09-13.mp4 → NAS ✓ 57K\n rsync soundcore AeroClip (input)_2026-05-13_10-09-43.mp4 → NAS ✓ 49K\n rsync soundcore AeroClip (input)_2026-05-13_10-10-13.mp4 → NAS ✓ 77K\n rsync soundcore AeroClip (input)_2026-05-13_10-10-43.mp4 → NAS ✓ 91K\n rsync soundcore AeroClip (input)_2026-05-13_10-11-13.mp4 → NAS ✓ 44K\n rsync soundcore AeroClip (input)_2026-05-13_10-11-43.mp4 → NAS ✓ 65K\n rsync soundcore AeroClip (input)_2026-05-13_10-12-13.mp4 → NAS ✓ 95K\n rsync soundcore AeroClip (input)_2026-05-13_10-12-43.mp4 → NAS ✓ 96K\n rsync soundcore AeroClip (input)_2026-05-13_10-13-13.mp4 → NAS ✓ 58K\n rsync soundcore AeroClip (input)_2026-05-13_10-13-43.mp4 → NAS ✓ 87K\n rsync soundcore AeroClip (input)_2026-05-13_10-14-13.mp4 → NAS ✓ 100K\n rsync soundcore AeroClip (input)_2026-05-13_10-14-43.mp4 → NAS ✓ 109K\n rsync soundcore AeroClip (input)_2026-05-13_10-15-13.mp4 → NAS ✓ 
25K\n rsync soundcore AeroClip (input)_2026-05-13_10-15-43.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_10-16-13.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-16-50.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-17-23.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_10-17-53.mp4 → NAS ✓ 34K\n rsync soundcore AeroClip (input)_2026-05-13_10-18-23.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_10-18-53.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-19-23.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-19-53.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-20-23.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-20-53.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-21-23.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_10-21-53.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-22-49.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-23-36.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-24-08.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_10-24-38.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_10-25-08.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-25-38.mp4 → NAS ✓ 18K\n rsync soundcore AeroClip (input)_2026-05-13_10-26-08.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-26-38.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_10-27-08.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_10-27-38.mp4 → NAS ✓ 71K\n rsync soundcore AeroClip (input)_2026-05-13_10-28-08.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-28-38.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-29-08.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-29-38.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-30-08.mp4 → NAS ✓ 
10K\n rsync soundcore AeroClip (input)_2026-05-13_10-30-38.mp4 → NAS ✓ 36K\n rsync soundcore AeroClip (input)_2026-05-13_10-31-08.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-31-43.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-32-15.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-32-45.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-33-15.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-33-45.mp4 → NAS ✓ 29K\n rsync soundcore AeroClip (input)_2026-05-13_10-34-15.mp4 → NAS ✓ 17K\n rsync soundcore AeroClip (input)_2026-05-13_10-34-45.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_10-35-15.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_10-35-45.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-36-15.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_10-36-45.mp4 → NAS ✓ 70K\n rsync soundcore AeroClip (input)_2026-05-13_10-37-15.mp4 → NAS ✓ 61K\n rsync soundcore AeroClip (input)_2026-05-13_10-37-45.mp4 → NAS ✓ 86K\n rsync soundcore AeroClip (input)_2026-05-13_10-38-15.mp4 → NAS ✓ 98K\n rsync soundcore AeroClip (input)_2026-05-13_10-38-45.mp4 → NAS ✓ 49K\n rsync soundcore AeroClip (input)_2026-05-13_10-39-15.mp4 → NAS ✓ 43K\n rsync soundcore AeroClip (input)_2026-05-13_10-39-45.mp4 → NAS ✓ 43K\n rsync soundcore AeroClip (input)_2026-05-13_10-40-15.mp4 → NAS ✓ 27K\n rsync soundcore AeroClip (input)_2026-05-13_10-40-45.mp4 → NAS ✓ 69K\n rsync soundcore AeroClip (input)_2026-05-13_10-41-15.mp4 → NAS ✓ 24K\n rsync soundcore AeroClip (input)_2026-05-13_10-41-45.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_10-42-15.mp4 → NAS ✓ 10K\n rsync soundcore AeroClip (input)_2026-05-13_10-42-45.mp4 → NAS ✓ 13K\n rsync soundcore AeroClip (input)_2026-05-13_10-43-15.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-43-45.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_10-44-15.mp4 → NAS ✓ 
8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-44-54.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_10-45-27.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-45-56.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-46-26.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-46-56.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_10-47-26.mp4 → NAS ✓ 6.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-48-15.mp4 → NAS ✓ 20K\n rsync soundcore AeroClip (input)_2026-05-13_10-48-47.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-49-17.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_10-49-47.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-50-18.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-50-50.mp4 → NAS ✓ 12K\n rsync soundcore AeroClip (input)_2026-05-13_10-51-20.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_10-51-50.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-52-20.mp4 → NAS ✓ 8.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-52-50.mp4 → NAS ✓ 11K\n rsync soundcore AeroClip (input)_2026-05-13_10-53-20.mp4 → NAS ✓ 23K\n rsync soundcore AeroClip (input)_2026-05-13_10-53-50.mp4 → NAS ✓ 40K\n rsync soundcore AeroClip (input)_2026-05-13_10-54-20.mp4 → NAS ✓ 25K\n rsync soundcore AeroClip (input)_2026-05-13_10-54-50.mp4 → NAS ✓ 34K\n rsync soundcore AeroClip (input)_2026-05-13_10-55-20.mp4 → NAS ✓ 41K\n rsync soundcore AeroClip (input)_2026-05-13_10-55-50.mp4 → NAS ✓ 42K\n rsync soundcore AeroClip (input)_2026-05-13_10-56-20.mp4 → NAS ✓ 43K\n rsync soundcore AeroClip (input)_2026-05-13_10-56-50.mp4 → NAS ✓ 83K\n rsync soundcore AeroClip (input)_2026-05-13_10-57-20.mp4 → NAS ✓ 29K\n rsync soundcore AeroClip (input)_2026-05-13_10-57-50.mp4 → NAS ✓ 7.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-58-20.mp4 → NAS ✓ 15K\n rsync soundcore AeroClip (input)_2026-05-13_10-58-50.mp4 → NAS ✓ 
9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-59-20.mp4 → NAS ✓ 9.0K\n rsync soundcore AeroClip (input)_2026-05-13_10-59-50.mp4 → NAS ✓ 14K\n rsync soundcore AeroClip (input)_2026-05-13_11-00-20.mp4 → NAS ✓ 21K\n rsync soundcore AeroClip (input)_2026-05-13_11-00-50.mp4 → NAS ✓ 17K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-01-30.mp4 → NAS ✓ 193K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-02-02.mp4 → NAS ✓ 194K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-02-32.mp4 → NAS ✓ 193K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-03-02.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-03-32.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-04-02.mp4 → NAS ✓ 209K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-04-32.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-05-02.mp4 → NAS ✓ 201K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-05-32.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-06-02.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-06-31.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-07-01.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-07-31.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-08-01.mp4 → NAS ✓ 195K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-08-31.mp4 → NAS ✓ 201K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-09-00.mp4 → NAS ✓ 193K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-09-30.mp4 → NAS ✓ 194K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-10-00.mp4 → NAS ✓ 196K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-10-29.mp4 → NAS ✓ 193K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-10-59.mp4 → NAS ✓ 192K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-11-29.mp4 → NAS ✓ 191K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-11-59.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone 
(input)_2026-05-13_11-12-29.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-12-58.mp4 → NAS ✓ 209K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-13-28.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-13-58.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-14-28.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-14-58.mp4 → NAS ✓ 218K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-15-28.mp4 → NAS ✓ 243K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-15-58.mp4 → NAS ✓ 215K\n rsync MacBook Pro Microphone (input)_2026-05-13_11-16-28.mp4 → NAS","depth":4,"on_screen":true,"value":"rsync MacBook Pro Microphone (input)_2026-05-12_11-31-41.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-32-11.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-32-41.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-33-10.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-33-40.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-34-10.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-34-40.mp4 → NAS ✓ 201K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-35-10.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-35-40.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-36-09.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-36-39.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-37-09.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-37-39.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-38-09.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-38-39.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-39-09.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-39-39.mp4 → NAS ✓ 200K\n rsync MacBook Pro 
Microphone (input)_2026-05-12_11-40-09.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-40-39.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-41-09.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-41-39.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-42-09.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-42-38.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-43-08.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-43-38.mp4 → NAS ✓ 219K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-44-08.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-44-37.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-45-07.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-45-36.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-46-05.mp4 → NAS ✓ 209K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-46-35.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-47-05.mp4 → NAS ✓ 220K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-47-34.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-48-03.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-48-33.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-49-03.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-49-33.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-50-03.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-50-33.mp4 → NAS ✓ 210K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-51-02.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-51-32.mp4 → NAS ✓ 209K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-52-01.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-52-31.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_11-53-01.mp4 → NAS ✓ 210K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-53-30.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-54-00.mp4 → NAS ✓ 211K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-54-30.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-55-00.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-55-29.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-55-59.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-56-29.mp4 → NAS ✓ 211K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-56-59.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-57-29.mp4 → NAS ✓ 220K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-57-59.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-58-28.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-58-58.mp4 → NAS ✓ 211K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-59-28.mp4 → NAS ✓ 218K\n rsync MacBook Pro Microphone (input)_2026-05-12_11-59-58.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-00-27.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-00-57.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-01-27.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-01-57.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-02-27.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-02-57.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-03-27.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-03-57.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-04-27.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-04-57.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-05-26.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_12-05-56.mp4 → NAS ✓ 226K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-06-26.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-06-56.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-07-26.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-07-56.mp4 → NAS ✓ 197K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-08-26.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-08-55.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-09-25.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-09-55.mp4 → NAS ✓ 197K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-10-25.mp4 → NAS ✓ 197K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-10-55.mp4 → NAS ✓ 197K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-11-25.mp4 → NAS ✓ 210K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-11-55.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-12-24.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-12-54.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-13-24.mp4 → NAS ✓ 201K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-13-53.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-14-23.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-14-53.mp4 → NAS ✓ 220K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-15-23.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-15-53.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-16-23.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-16-53.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-17-53.mp4 → NAS ✓ 211K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-18-22.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_12-18-52.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-19-22.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-19-52.mp4 → NAS ✓ 227K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-20-21.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-20-51.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-21-21.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-21-51.mp4 → NAS ✓ 210K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-22-21.mp4 → NAS ✓ 211K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-22-51.mp4 → NAS ✓ 242K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-59-11.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_12-59-41.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_13-00-11.mp4 → NAS ✓ 194K\n rsync MacBook Pro Microphone (input)_2026-05-12_13-35-39.mp4 → NAS ✓ 208K\n rsync LakyLak bose qc35 II (input)_2026-05-12_14-08-11.mp4 → NAS ✓ 217K\n rsync LakyLak bose qc35 II (input)_2026-05-12_14-10-40.mp4 → NAS ✓ 207K\n rsync LakyLak bose qc35 II (input)_2026-05-12_14-15-09.mp4 → NAS ✓ 198K\n rsync LakyLak bose qc35 II (input)_2026-05-12_14-21-38.mp4 → NAS ✓ 221K\n rsync LakyLak bose qc35 II (input)_2026-05-12_14-22-38.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-21-56.mp4 → NAS ✓ 195K\n rsync System Audio (output)_2026-05-12_17-21-56.mp4 → NAS ✓ 5.0K\n rsync System Audio (output)_2026-05-12_17-22-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-22-28.mp4 → NAS ✓ 195K\n rsync System Audio (output)_2026-05-12_17-22-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-22-58.mp4 → NAS ✓ 196K\n rsync System Audio (output)_2026-05-12_17-23-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-23-28.mp4 → NAS ✓ 200K\n rsync System Audio (output)_2026-05-12_17-23-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_17-23-58.mp4 → NAS ✓ 196K\n rsync System Audio (output)_2026-05-12_17-24-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-24-28.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-24-58.mp4 → NAS ✓ 199K\n rsync System Audio (output)_2026-05-12_17-24-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-25-28.mp4 → NAS ✓ 195K\n rsync System Audio (output)_2026-05-12_17-25-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-25-58.mp4 → NAS ✓ 207K\n rsync System Audio (output)_2026-05-12_17-25-58.mp4 → NAS ✓ 5.0K\n rsync System Audio (output)_2026-05-12_17-26-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-26-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-26-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-26-58.mp4 → NAS ✓ 206K\n rsync System Audio (output)_2026-05-12_17-27-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-27-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-27-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-27-58.mp4 → NAS ✓ 194K\n rsync System Audio (output)_2026-05-12_17-28-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-28-28.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-28-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-28-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-29-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-29-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-29-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-29-58.mp4 → NAS ✓ 203K\n rsync System Audio (output)_2026-05-12_17-30-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-30-28.mp4 → NAS ✓ 196K\n rsync System Audio (output)_2026-05-12_17-30-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro 
Microphone (input)_2026-05-12_17-30-58.mp4 → NAS ✓ 212K\n rsync System Audio (output)_2026-05-12_17-31-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-31-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-31-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-31-58.mp4 → NAS ✓ 199K\n rsync System Audio (output)_2026-05-12_17-32-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-32-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-32-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-32-58.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-33-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-33-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-33-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-33-58.mp4 → NAS ✓ 204K\n rsync System Audio (output)_2026-05-12_17-34-28.mp4 → NAS ✓ 8.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-34-28.mp4 → NAS ✓ 207K\n rsync System Audio (output)_2026-05-12_17-34-58.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-34-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-35-28.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-35-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-35-58.mp4 → NAS ✓ 8.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-35-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-36-28.mp4 → NAS ✓ 15K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-36-28.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-36-58.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-36-58.mp4 → NAS ✓ 213K\n rsync System Audio (output)_2026-05-12_17-37-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-37-28.mp4 → NAS ✓ 209K\n rsync System Audio (output)_2026-05-12_17-37-58.mp4 → NAS ✓ 6.0K\n rsync MacBook 
Pro Microphone (input)_2026-05-12_17-37-58.mp4 → NAS ✓ 206K\n rsync System Audio (output)_2026-05-12_17-38-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-38-28.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-38-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-38-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-39-28.mp4 → NAS ✓ 12K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-39-28.mp4 → NAS ✓ 215K\n rsync System Audio (output)_2026-05-12_17-39-58.mp4 → NAS ✓ 7.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-39-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-40-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-40-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-40-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-40-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-41-28.mp4 → NAS ✓ 9.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-41-28.mp4 → NAS ✓ 215K\n rsync System Audio (output)_2026-05-12_17-41-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-41-58.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-42-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-42-28.mp4 → NAS ✓ 203K\n rsync System Audio (output)_2026-05-12_17-42-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-42-58.mp4 → NAS ✓ 207K\n rsync System Audio (output)_2026-05-12_17-43-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-43-28.mp4 → NAS ✓ 214K\n rsync System Audio (output)_2026-05-12_17-43-58.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-43-58.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-44-28.mp4 → NAS ✓ 6.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-44-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-44-58.mp4 → NAS ✓ 7.0K\n rsync 
MacBook Pro Microphone (input)_2026-05-12_17-44-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-45-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-45-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-45-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-45-58.mp4 → NAS ✓ 200K\n rsync System Audio (output)_2026-05-12_17-46-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-46-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-46-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-46-58.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-47-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-47-28.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-12_17-47-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-47-58.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-48-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-48-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-48-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-48-58.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-49-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-49-28.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-49-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-49-58.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-12_17-50-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-50-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-50-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-50-58.mp4 → NAS ✓ 204K\n rsync System Audio (output)_2026-05-12_17-51-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-51-28.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-51-58.mp4 → NAS ✓ 5.0K\n 
rsync MacBook Pro Microphone (input)_2026-05-12_17-51-58.mp4 → NAS ✓ 204K\n rsync System Audio (output)_2026-05-12_17-52-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-52-28.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-52-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-52-58.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-53-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-53-28.mp4 → NAS ✓ 213K\n rsync System Audio (output)_2026-05-12_17-53-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-53-58.mp4 → NAS ✓ 208K\n rsync System Audio (output)_2026-05-12_17-54-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-54-28.mp4 → NAS ✓ 219K\n rsync System Audio (output)_2026-05-12_17-54-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-54-58.mp4 → NAS ✓ 211K\n rsync System Audio (output)_2026-05-12_17-55-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-55-28.mp4 → NAS ✓ 205K\n rsync System Audio (output)_2026-05-12_17-55-58.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-55-58.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-12_17-56-28.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-56-28.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-58-58.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_17-59-28.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-00-28.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-00-58.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-01-58.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-04-58.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-08-27.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-09-27.mp4 → NAS ✓ 208K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_18-09-57.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-10-27.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-10-57.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-11-27.mp4 → NAS ✓ 218K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-11-57.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-12-57.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-14-57.mp4 → NAS ✓ 228K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-15-27.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-15-57.mp4 → NAS ✓ 213K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-16-57.mp4 → NAS ✓ 221K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-19-27.mp4 → NAS ✓ 215K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-19-57.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-20-27.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-20-56.mp4 → NAS ✓ 211K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-21-26.mp4 → NAS ✓ 204K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-21-56.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-22-26.mp4 → NAS ✓ 209K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-22-56.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-23-26.mp4 → NAS ✓ 212K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-23-56.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-24-26.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-24-56.mp4 → NAS ✓ 217K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-25-56.mp4 → NAS ✓ 214K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-26-26.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-26-56.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-28-26.mp4 → NAS ✓ 216K\n rsync MacBook Pro Microphone 
(input)_2026-05-12_18-28-56.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-29-56.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-30-56.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-31-56.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-32-26.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-32-56.mp4 → NAS ✓ 201K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-33-56.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-34-56.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-35-26.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-35-56.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-36-26.mp4 → NAS ✓ 206K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-36-56.mp4 → NAS ✓ 205K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-37-26.mp4 → NAS ✓ 201K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-37-56.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-38-26.mp4 → NAS ✓ 199K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-38-56.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-39-26.mp4 → NAS ✓ 200K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-39-56.mp4 → NAS ✓ 195K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-40-26.mp4 → NAS ✓ 198K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-40-56.mp4 → NAS ✓ 202K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-41-26.mp4 → NAS ✓ 203K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-42-26.mp4 → NAS ✓ 209K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-43-26.mp4 → NAS ✓ 207K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-44-26.mp4 → NAS ✓ 210K\n rsync MacBook Pro Microphone (input)_2026-05-12_18-45-26.mp4 → NAS ✓ 207K\n audio files total: 1113 file(s), 145M\n\n[+09m26s] ▶ Copying screenpipe logs for 2026-05-12\n rsync logs → NAS ✓ 1 file(s), 
288K\n\n[2026-05-13 21:46:00] Archive DB size: 2.0G\n[2026-05-13 21:46:00] Total time: 9m26s\n[2026-05-13 21:46:00] Sync complete for 2026-05-12\n[2026-05-13 21:46:00] ========================================\nlukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13\n[2026-05-14 09:28:19] ========================================\n[2026-05-14 09:28:19] Screenpipe sync starting for: 2026-05-13\n[2026-05-14 09:28:19] ========================================\n\n[+00m00s] ▶ Preflight checks\n Source DB: OK (5.6G)\n[2026-05-14 09:28:19] ERROR: NAS not mounted at /Volumes/screenpipe\nlukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13\n[2026-05-14 09:28:31] ========================================\n[2026-05-14 09:28:31] Screenpipe sync starting for: 2026-05-13\n[2026-05-14 09:28:31] ========================================\n\n[+00m00s] ▶ Preflight checks\n Source DB: OK (5.6G)\n NAS mount: OK /Volumes/screenpipe\n Archive DB: exists (2.0G)\n Data dir: OK (263 files, 541M)\n\n[+00m04s] ▶ Counting source rows for 2026-05-13\n frames: 9586\n elements: 1272090\n ui_events: 9151\n ocr_text: 2829\n meetings: 0\n audio_chunks: 1295\n audio_transcriptions: 102\n\n[+00m06s] ▶ Initialising tables, indexes, FTS\n creating tables ✓ 0m00s\n creating indexes ✓ 0m00s\n creating FTS tables ✓ 0m00s\n\n[+00m06s] ▶ Syncing vision data for 2026-05-13\n video_chunks ✓ 0m01s\n frames (9586 rows) ✓ 2m16s\n ocr_text (2829 rows) ✓ 1m22s\n ui_events (9151 rows) ✓ 0m01s\n elements (1272090 rows) ✓ 1m26s\n meetings (0 rows) ⠋ Runtime error near line 2: database is locked (5)\nParse error near line 3: no such table: nas.meetings\nRuntime error near line 5: no such database: nas\nlukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13\n[2026-05-14 09:34:11] ========================================\n[2026-05-14 09:34:11] Screenpipe sync starting for: 2026-05-13\n[2026-05-14 
09:34:11] ========================================\n\n[+00m00s] ▶ Preflight checks\n Source DB: OK (5.6G)\n NAS mount: OK /Volumes/screenpipe\n[2026-05-14 09:34:11] Date 2026-05-13 already has 9586 frames in archive — skipping DB sync\n Data dir: OK (263 files, 541M)\n\n[+00m00s] ▶ Copying data folder for 2026-05-13\n rsync 2026-05-13/ → NAS ✓ 0m34s (263 files, 524M)\n\n[+00m34s] ▶ Copying audio files for 2026-05-13\n rsync System Audio (output)_2026-05-13_06-16-53.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-16-53.mp4 → NAS ✓ 188K\n rsync System Audio (output)_2026-05-13_06-17-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-17-25.mp4 → NAS ✓ 190K\n rsync System Audio (output)_2026-05-13_06-17-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-17-55.mp4 → NAS ✓ 192K\n rsync System Audio (output)_2026-05-13_06-18-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-18-25.mp4 → NAS ✓ 191K\n rsync System Audio (output)_2026-05-13_06-18-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-18-55.mp4 → NAS ✓ 191K\n rsync System Audio (output)_2026-05-13_06-19-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-19-25.mp4 → NAS ✓ 197K\n rsync System Audio (output)_2026-05-13_06-19-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-19-55.mp4 → NAS ✓ 194K\n rsync System Audio (output)_2026-05-13_06-20-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-20-25.mp4 → NAS ✓ 195K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-20-55.mp4 → NAS ✓ 187K\n rsync System Audio (output)_2026-05-13_06-20-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-21-25.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-13_06-21-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-21-55.mp4 → NAS ✓ 190K\n rsync System Audio (output)_2026-05-13_06-21-55.mp4 → NAS ✓ 5.0K\n 
rsync MacBook Pro Microphone (input)_2026-05-13_06-22-25.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-13_06-22-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-22-55.mp4 → NAS ✓ 192K\n rsync System Audio (output)_2026-05-13_06-22-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-23-25.mp4 → NAS ✓ 198K\n rsync System Audio (output)_2026-05-13_06-23-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-23-55.mp4 → NAS ✓ 199K\n rsync System Audio (output)_2026-05-13_06-23-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-24-25.mp4 → NAS ✓ 195K\n rsync System Audio (output)_2026-05-13_06-24-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-24-55.mp4 → NAS ✓ 209K\n rsync System Audio (output)_2026-05-13_06-24-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-25-25.mp4 → NAS ✓ 203K\n rsync System Audio (output)_2026-05-13_06-25-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-25-55.mp4 → NAS ✓ 210K\n rsync System Audio (output)_2026-05-13_06-25-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-26-25.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-13_06-26-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-26-55.mp4 → NAS ✓ 198K\n rsync System Audio (output)_2026-05-13_06-26-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-27-25.mp4 → NAS ✓ 199K\n rsync System Audio (output)_2026-05-13_06-27-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-27-55.mp4 → NAS ✓ 200K\n rsync System Audio (output)_2026-05-13_06-27-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-28-25.mp4 → NAS ✓ 200K\n rsync System Audio (output)_2026-05-13_06-28-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-28-55.mp4 → NAS ✓ 213K\n rsync System Audio (output)_2026-05-13_06-28-55.mp4 → NAS ✓ 
5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-29-25.mp4 → NAS ✓ 233K\n rsync System Audio (output)_2026-05-13_06-29-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-29-55.mp4 → NAS ✓ 215K\n rsync System Audio (output)_2026-05-13_06-29-55.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-30-24.mp4 → NAS ✓ 197K\n rsync System Audio (output)_2026-05-13_06-30-25.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-30-54.mp4 → NAS ✓ 195K\n rsync System Audio (output)_2026-05-13_06-30-54.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-31-24.mp4 → NAS ✓ 198K\n rsync System Audio (output)_2026-05-13_06-31-24.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-31-54.mp4 → NAS ✓ 192K\n rsync System Audio (output)_2026-05-13_06-31-54.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-32-24.mp4 → NAS ✓ 197K\n rsync System Audio (output)_2026-05-13_06-32-24.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-32-54.mp4 → NAS ✓ 201K\n rsync System Audio (output)_2026-05-13_06-32-54.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-33-24.mp4 → NAS ✓ 199K\n rsync System Audio (output)_2026-05-13_06-33-24.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-33-54.mp4 → NAS ✓ 202K\n rsync System Audio (output)_2026-05-13_06-33-54.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-34-24.mp4 → NAS ✓ 188K\n rsync System Audio (output)_2026-05-13_06-34-24.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-34-54.mp4 → NAS ✓ 184K\n rsync System Audio (output)_2026-05-13_06-34-54.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-35-24.mp4 → NAS ✓ 192K\n rsync System Audio (output)_2026-05-13_06-35-24.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-35-54.mp4 → NAS ✓ 200K\n rsync System Audio (output)_2026-05-13_06-35-54.mp4 → NAS 
✓ 5.0K\n rsync System Audio (output)_2026-05-13_06-36-24.mp4 → NAS ✓ 5.0K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-36-24.mp4 → NAS ✓ 189K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-36-54.mp4 → NAS ✓ 120K\n rsync MacBook Pro Microphone (input)_2026-05-13_06-37-12.mp4 → NAS ✓ 16K\n rsync System Audio (output)_2026-05-13_06-36-54.mp4 → NAS ✓ 24K\n rsync soundcore AeroClip (input)_2026-05-13_06-37-12.mp4 → NAS ✓ 103K\n rsync System Audio (output)_2026-05-13_06-37-23.mp4 → NAS ✓ 191K\n rsync soundcore AeroClip (input)_2026-05-13_06-37-44.mp4 → NAS ✓ 64K\n rsync System Audio (output)_2026-05-13_06-37-53.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-38-14.mp4 → NAS ✓ 86K\n rsync System Audio (output)_2026-05-13_06-38-23.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-38-44.mp4 → NAS ✓ 23K\n rsync System Audio (output)_2026-05-13_06-38-53.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-39-14.mp4 → NAS ✓ 86K\n rsync System Audio (output)_2026-05-13_06-39-23.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-39-44.mp4 → NAS ✓ 82K\n rsync System Audio (output)_2026-05-13_06-39-53.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-40-14.mp4 → NAS ✓ 68K\n rsync System Audio (output)_2026-05-13_06-40-23.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-40-44.mp4 → NAS ✓ 68K\n rsync System Audio (output)_2026-05-13_06-40-53.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-41-14.mp4 → NAS ✓ 171K\n rsync System Audio (output)_2026-05-13_06-41-23.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-41-44.mp4 → NAS ✓ 67K\n rsync System Audio (output)_2026-05-13_06-41-53.mp4 → NAS ✓ 16K\n rsync soundcore AeroClip (input)_2026-05-13_06-42-14.mp4 → NAS ✓ 14K\n rsync System Audio (output)_2026-05-13_06-42-22.mp4 → NAS ✓ 5.0K\n rsync soundcore AeroClip (input)_2026-05-13_06-42-44.mp4 → NAS ✓ 19K\n rsync System Audio 
[rsync transfer log, 2026-05-13 06:42–11:16 — several hundred ~30-second audio chunks synced to NAS, every transfer marked ✓. Summarized per device:
  System Audio (output): 06:42–06:55, ~5.0K per chunk
  soundcore AeroClip (input): 06:43–06:52, 07:57–11:00, roughly 6K–169K per chunk
  LakyLak bose qc35 II (input): 06:52–07:53, roughly 183K–239K per chunk
  MacBook Pro Microphone (input): 06:52, 07:53–07:56, 09:45–10:00, 11:01–11:16, roughly 15K–243K per chunk]
button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close Tab","depth":3,"bounds":{"left":0.14513889,"top":0.06333333,"width":0.011111111,"height":0.017777778},"on_screen":true,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"APP (-zsh)","depth":2,"bounds":{"left":0.28159723,"top":0.05888889,"width":0.140625,"height":0.026666667},"on_screen":true,"role_description":"radio button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close Tab","depth":3,"bounds":{"left":0.2857639,"top":0.06333333,"width":0.011111111,"height":0.017777778},"on_screen":true,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"-zsh","depth":2,"bounds":{"left":0.42222223,"top":0.05888889,"width":0.140625,"height":0.026666667},"on_screen":true,"role_description":"radio button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close Tab","depth":3,"bounds":{"left":0.4263889,"top":0.06333333,"width":0.011111111,"height":0.017777778},"on_screen":true,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"screenpipe\"","depth":2,"bounds":{"left":0.5628472,"top":0.05888889,"width":0.140625,"height":0.026666667},"on_screen":true,"role_description":"radio button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close Tab","depth":3,"bounds":{"left":0.56701386,"top":0.06333333,"width":0.011111111,"height":0.017777778},"on_screen":true,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"ec2-user@ip-10-30-129-190:~ 
(rsync)","depth":2,"bounds":{"left":0.7034722,"top":0.05888889,"width":0.140625,"height":0.026666667},"on_screen":true,"role_description":"radio button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close Tab","depth":3,"bounds":{"left":0.70763886,"top":0.06333333,"width":0.011111111,"height":0.017777778},"on_screen":true,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"ec2-user@ip-10-20-31-146:~ (-zsh)","depth":2,"bounds":{"left":0.8440972,"top":0.05888889,"width":0.140625,"height":0.026666667},"on_screen":true,"role_description":"radio button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close Tab","depth":3,"bounds":{"left":0.84826386,"top":0.06333333,"width":0.011111111,"height":0.017777778},"on_screen":true,"role_description":"button","is_enabled":false,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"⌥⌘1","depth":1,"bounds":{"left":0.95625,"top":0.032222223,"width":0.03888889,"height":0.018888889},"on_screen":true,"automation_id":"_NS:8","role_description":"text"},{"role":"AXStaticText","text":"ec2-user@ip-10-30-129-190:~","depth":1,"bounds":{"left":0.42916667,"top":0.033333335,"width":0.14305556,"height":0.017777778},"on_screen":true,"role_description":"text"}]...
rsync MacBook Pro Microphone (input)_2026-05-12_11-31-41.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_11-32-11.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_11-32-41.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-33-10.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_11-33-40.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_11-34-10.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_11-34-40.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_11-35-10.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-35-40.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-36-09.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-36-39.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-37-09.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-37-39.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-38-09.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-38-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-39-09.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_11-39-39.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-40-09.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-40-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-41-09.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_11-41-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-42-09.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-42-38.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-43-08.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_11-43-38.mp4 → NAS ✓ 219K
rsync MacBook Pro Microphone (input)_2026-05-12_11-44-08.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_11-44-37.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-45-07.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_11-45-36.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_11-46-05.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_11-46-35.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-47-05.mp4 → NAS ✓ 220K
rsync MacBook Pro Microphone (input)_2026-05-12_11-47-34.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-48-03.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-48-33.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_11-49-03.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-49-33.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_11-50-03.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-50-33.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_11-51-02.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_11-51-32.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_11-52-01.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_11-52-31.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-53-01.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_11-53-30.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_11-54-00.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_11-54-30.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_11-55-00.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_11-55-29.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_11-55-59.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_11-56-29.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_11-56-59.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-57-29.mp4 → NAS ✓ 220K
rsync MacBook Pro Microphone (input)_2026-05-12_11-57-59.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_11-58-28.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_11-58-58.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_11-59-28.mp4 → NAS ✓ 218K
rsync MacBook Pro Microphone (input)_2026-05-12_11-59-58.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-00-27.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-00-57.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-01-27.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-01-57.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-02-27.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_12-02-57.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_12-03-27.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_12-03-57.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_12-04-27.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_12-04-57.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_12-05-26.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-05-56.mp4 → NAS ✓ 226K
rsync MacBook Pro Microphone (input)_2026-05-12_12-06-26.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_12-06-56.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-07-26.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_12-07-56.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-08-26.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-08-55.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_12-09-25.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_12-09-55.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-10-25.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-10-55.mp4 → NAS ✓ 197K
rsync MacBook Pro Microphone (input)_2026-05-12_12-11-25.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_12-11-55.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-12-24.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-12-54.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-13-24.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_12-13-53.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-14-23.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-14-53.mp4 → NAS ✓ 220K
rsync MacBook Pro Microphone (input)_2026-05-12_12-15-23.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-15-53.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_12-16-23.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_12-16-53.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_12-17-53.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_12-18-22.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-18-52.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_12-19-22.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_12-19-52.mp4 → NAS ✓ 227K
rsync MacBook Pro Microphone (input)_2026-05-12_12-20-21.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-20-51.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_12-21-21.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_12-21-51.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_12-22-21.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_12-22-51.mp4 → NAS ✓ 242K
rsync MacBook Pro Microphone (input)_2026-05-12_12-59-11.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_12-59-41.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_13-00-11.mp4 → NAS ✓ 194K
rsync MacBook Pro Microphone (input)_2026-05-12_13-35-39.mp4 → NAS ✓ 208K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-08-11.mp4 → NAS ✓ 217K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-10-40.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-15-09.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-21-38.mp4 → NAS ✓ 221K
rsync LakyLak bose qc35 II (input)_2026-05-12_14-22-38.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_17-21-56.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-12_17-21-56.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-12_17-22-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-22-28.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-12_17-22-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-22-58.mp4 → NAS ✓ 196K
rsync System Audio (output)_2026-05-12_17-23-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-23-28.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-12_17-23-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-23-58.mp4 → NAS ✓ 196K
rsync System Audio (output)_2026-05-12_17-24-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-24-28.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_17-24-58.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-12_17-24-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-25-28.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-12_17-25-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-25-58.mp4 → NAS ✓ 207K
rsync System Audio (output)_2026-05-12_17-25-58.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-12_17-26-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-26-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-26-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-26-58.mp4 → NAS ✓ 206K
rsync System Audio (output)_2026-05-12_17-27-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-27-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-27-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-27-58.mp4 → NAS ✓ 194K
rsync System Audio (output)_2026-05-12_17-28-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-28-28.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-28-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-28-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-29-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-29-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-29-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-29-58.mp4 → NAS ✓ 203K
rsync System Audio (output)_2026-05-12_17-30-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-30-28.mp4 → NAS ✓ 196K
rsync System Audio (output)_2026-05-12_17-30-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-30-58.mp4 → NAS ✓ 212K
rsync System Audio (output)_2026-05-12_17-31-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-31-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-31-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-31-58.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-12_17-32-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-32-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-32-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-32-58.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-33-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-33-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-33-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-33-58.mp4 → NAS ✓ 204K
rsync System Audio (output)_2026-05-12_17-34-28.mp4 → NAS ✓ 8.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-34-28.mp4 → NAS ✓ 207K
rsync System Audio (output)_2026-05-12_17-34-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-34-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-35-28.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-35-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-35-58.mp4 → NAS ✓ 8.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-35-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-36-28.mp4 → NAS ✓ 15K
rsync MacBook Pro Microphone (input)_2026-05-12_17-36-28.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-36-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-36-58.mp4 → NAS ✓ 213K
rsync System Audio (output)_2026-05-12_17-37-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-37-28.mp4 → NAS ✓ 209K
rsync System Audio (output)_2026-05-12_17-37-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-37-58.mp4 → NAS ✓ 206K
rsync System Audio (output)_2026-05-12_17-38-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-38-28.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-38-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-38-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-39-28.mp4 → NAS ✓ 12K
rsync MacBook Pro Microphone (input)_2026-05-12_17-39-28.mp4 → NAS ✓ 215K
rsync System Audio (output)_2026-05-12_17-39-58.mp4 → NAS ✓ 7.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-39-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-40-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-40-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-40-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-40-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-41-28.mp4 → NAS ✓ 9.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-41-28.mp4 → NAS ✓ 215K
rsync System Audio (output)_2026-05-12_17-41-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-41-58.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-42-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-42-28.mp4 → NAS ✓ 203K
rsync System Audio (output)_2026-05-12_17-42-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-42-58.mp4 → NAS ✓ 207K
rsync System Audio (output)_2026-05-12_17-43-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-43-28.mp4 → NAS ✓ 214K
rsync System Audio (output)_2026-05-12_17-43-58.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-43-58.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-44-28.mp4 → NAS ✓ 6.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-44-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-44-58.mp4 → NAS ✓ 7.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-44-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-45-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-45-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-45-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-45-58.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-12_17-46-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-46-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-46-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-46-58.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-47-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-47-28.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-12_17-47-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-47-58.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-48-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-48-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-48-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-48-58.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-49-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-49-28.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-49-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-49-58.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-12_17-50-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-50-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-50-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-50-58.mp4 → NAS ✓ 204K
rsync System Audio (output)_2026-05-12_17-51-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-51-28.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-51-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-51-58.mp4 → NAS ✓ 204K
rsync System Audio (output)_2026-05-12_17-52-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-52-28.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-52-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-52-58.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-53-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-53-28.mp4 → NAS ✓ 213K
rsync System Audio (output)_2026-05-12_17-53-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-53-58.mp4 → NAS ✓ 208K
rsync System Audio (output)_2026-05-12_17-54-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-54-28.mp4 → NAS ✓ 219K
rsync System Audio (output)_2026-05-12_17-54-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-54-58.mp4 → NAS ✓ 211K
rsync System Audio (output)_2026-05-12_17-55-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-55-28.mp4 → NAS ✓ 205K
rsync System Audio (output)_2026-05-12_17-55-58.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-55-58.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-12_17-56-28.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-12_17-56-28.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_17-58-58.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_17-59-28.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-00-28.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-00-58.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-01-58.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_18-04-58.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_18-08-27.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-09-27.mp4 → NAS ✓ 208K
rsync MacBook Pro Microphone (input)_2026-05-12_18-09-57.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-10-27.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-10-57.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_18-11-27.mp4 → NAS ✓ 218K
rsync MacBook Pro Microphone (input)_2026-05-12_18-11-57.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-12-57.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-14-57.mp4 → NAS ✓ 228K
rsync MacBook Pro Microphone (input)_2026-05-12_18-15-27.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-15-57.mp4 → NAS ✓ 213K
rsync MacBook Pro Microphone (input)_2026-05-12_18-16-57.mp4 → NAS ✓ 221K
rsync MacBook Pro Microphone (input)_2026-05-12_18-19-27.mp4 → NAS ✓ 215K
rsync MacBook Pro Microphone (input)_2026-05-12_18-19-57.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-20-27.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-20-56.mp4 → NAS ✓ 211K
rsync MacBook Pro Microphone (input)_2026-05-12_18-21-26.mp4 → NAS ✓ 204K
rsync MacBook Pro Microphone (input)_2026-05-12_18-21-56.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-22-26.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_18-22-56.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-23-26.mp4 → NAS ✓ 212K
rsync MacBook Pro Microphone (input)_2026-05-12_18-23-56.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-24-26.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-24-56.mp4 → NAS ✓ 217K
rsync MacBook Pro Microphone (input)_2026-05-12_18-25-56.mp4 → NAS ✓ 214K
rsync MacBook Pro Microphone (input)_2026-05-12_18-26-26.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-26-56.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-28-26.mp4 → NAS ✓ 216K
rsync MacBook Pro Microphone (input)_2026-05-12_18-28-56.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-29-56.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-30-56.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-31-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-32-26.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-32-56.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_18-33-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-34-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-35-26.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-35-56.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-36-26.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-12_18-36-56.mp4 → NAS ✓ 205K
rsync MacBook Pro Microphone (input)_2026-05-12_18-37-26.mp4 → NAS ✓ 201K
rsync MacBook Pro Microphone (input)_2026-05-12_18-37-56.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-38-26.mp4 → NAS ✓ 199K
rsync MacBook Pro Microphone (input)_2026-05-12_18-38-56.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-39-26.mp4 → NAS ✓ 200K
rsync MacBook Pro Microphone (input)_2026-05-12_18-39-56.mp4 → NAS ✓ 195K
rsync MacBook Pro Microphone (input)_2026-05-12_18-40-26.mp4 → NAS ✓ 198K
rsync MacBook Pro Microphone (input)_2026-05-12_18-40-56.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-12_18-41-26.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-12_18-42-26.mp4 → NAS ✓ 209K
rsync MacBook Pro Microphone (input)_2026-05-12_18-43-26.mp4 → NAS ✓ 207K
rsync MacBook Pro Microphone (input)_2026-05-12_18-44-26.mp4 → NAS ✓ 210K
rsync MacBook Pro Microphone (input)_2026-05-12_18-45-26.mp4 → NAS ✓ 207K
audio files total: 1113 file(s), 145M
[+09m26s] ▶ Copying screenpipe logs for 2026-05-12
rsync logs → NAS ✓ 1 file(s), 288K
[2026-05-13 21:46:00] Archive DB size: 2.0G
[2026-05-13 21:46:00] Total time: 9m26s
[2026-05-13 21:46:00] Sync complete for 2026-05-12
[2026-05-13 21:46:00] ========================================
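The "[+09m26s] ▶ …" prefixes throughout the run above are elapsed-time stamps measured from script start. A minimal sketch of such a helper (the real script's implementation is unknown; `START_EPOCH` and `stamp` are assumed names):

```shell
# Hypothetical elapsed-stamp helper producing the "[+MMmSSs] ▶ step" lines.
START_EPOCH=$(date +%s)

stamp() {
  local now elapsed
  now=$(date +%s)
  elapsed=$(( now - START_EPOCH ))
  printf '[+%02dm%02ds] ▶ %s\n' $((elapsed / 60)) $((elapsed % 60)) "$1"
}
```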
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:28:19] ========================================
[2026-05-14 09:28:19] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:28:19] ========================================
[+00m00s] ▶ Preflight checks
Source DB: OK (5.6G)
[2026-05-14 09:28:19] ERROR: NAS not mounted at /Volumes/screenpipe
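The failed run above aborts in preflight because the NAS share is not mounted. A minimal sketch of such a mount check, assuming it is done by scanning `mount` output (the function name and exact check are assumptions; the path `/Volumes/screenpipe` is from the log):

```shell
# Hypothetical preflight mount check; fails fast before any sync work starts.
check_nas_mounted() {
  local mount_point="$1"
  if mount | grep -q " on ${mount_point} "; then
    echo "NAS mount: OK ${mount_point}"
    return 0
  else
    echo "ERROR: NAS not mounted at ${mount_point}" >&2
    return 1
  fi
}
```

Failing fast here is what makes the immediate re-run at the next prompt safe: nothing was partially written before the error.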
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:28:31] ========================================
[2026-05-14 09:28:31] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:28:31] ========================================
[+00m00s] ▶ Preflight checks
Source DB: OK (5.6G)
NAS mount: OK /Volumes/screenpipe
Archive DB: exists (2.0G)
Data dir: OK (263 files, 541M)
[+00m04s] ▶ Counting source rows for 2026-05-13
frames: 9586
elements: 1272090
ui_events: 9151
ocr_text: 2829
meetings: 0
audio_chunks: 1295
audio_transcriptions: 102
[+00m06s] ▶ Initialising tables, indexes, FTS
creating tables ✓ 0m00s
creating indexes ✓ 0m00s
creating FTS tables ✓ 0m00s
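The "creating FTS tables" step above only names the operation; assuming SQLite FTS5 with an external-content table (the table and column names below are assumptions, not taken from the actual schema), it could look like:

```shell
# Hypothetical FTS setup step for the archive DB (FTS5, external content).
create_fts() {
  local db="$1"
  sqlite3 "$db" <<'SQL'
CREATE VIRTUAL TABLE IF NOT EXISTS ocr_text_fts
  USING fts5(text, content='ocr_text', content_rowid='frame_id');
SQL
}
```

`IF NOT EXISTS` keeps the step idempotent, consistent with the script being re-runnable for the same date.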
[+00m06s] ▶ Syncing vision data for 2026-05-13
video_chunks ✓ 0m01s
frames (9586 rows) ✓ 2m16s
ocr_text (2829 rows) ✓ 1m22s
ui_events (9151 rows) ✓ 0m01s
elements (1272090 rows) ✓ 1m26s
meetings (0 rows) ⠋ Runtime error near line 2: database is locked (5)
Parse error near line 3: no such table: nas.meetings
Runtime error near line 5: no such database: nas
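The error cascade above is informative: the initial "database is locked (5)" makes the `ATTACH` of the archive DB fail, so every later statement referring to the `nas` schema then fails with "no such table/database: nas". A minimal sketch of one table-sync step that avoids the lock error by setting a busy timeout (the `nas` schema name is from the log's own errors; the function, table columns, and `timestamp` filter are assumptions):

```shell
# Hypothetical single-table sync step: attach archive DB, copy one day's rows.
sync_table() {
  local src_db="$1" archive_db="$2" table="$3" day="$4"
  sqlite3 "$src_db" <<SQL
.timeout 30000
ATTACH DATABASE '$archive_db' AS nas;
INSERT OR IGNORE INTO nas.$table
  SELECT * FROM main.$table WHERE date(timestamp) = '$day';
DETACH DATABASE nas;
SQL
}
```

`.timeout 30000` makes sqlite3 retry for up to 30 s when another writer (e.g. the live screenpipe recorder) holds the lock, instead of failing immediately with SQLITE_BUSY.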
lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ~/.screenpipe/scripts/screenpipe_sync.sh 2026-05-13
[2026-05-14 09:34:11] ========================================
[2026-05-14 09:34:11] Screenpipe sync starting for: 2026-05-13
[2026-05-14 09:34:11] ========================================
[+00m00s] ▶ Preflight checks
Source DB: OK (5.6G)
NAS mount: OK /Volumes/screenpipe
[2026-05-14 09:34:11] Date 2026-05-13 already has 9586 frames in archive — skipping DB sync
Data dir: OK (263 files, 541M)
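The third run above skips the DB sync because the date already has 9586 frames in the archive. A minimal sketch of that idempotency guard, assuming it is a simple row-count query against the archive (function name, table, and `timestamp` column are assumptions):

```shell
# Hypothetical "already synced?" guard: returns 0 (skip) if the archive
# already holds frames for the given date.
already_synced() {
  local archive_db="$1" day="$2" n
  n=$(sqlite3 "$archive_db" \
    "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$day';")
  if [ "${n:-0}" -gt 0 ]; then
    echo "Date $day already has $n frames in archive — skipping DB sync"
    return 0
  fi
  return 1
}
```

This is why the rerun after the locked-database failure is cheap: only the file copies (data folder, audio, logs) repeat, and rsync itself skips files already present on the NAS.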
[+00m00s] ▶ Copying data folder for 2026-05-13
rsync 2026-05-13/ → NAS ✓ 0m34s (263 files, 524M)
[+00m34s] ▶ Copying audio files for 2026-05-13
rsync System Audio (output)_2026-05-13_06-16-53.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-16-53.mp4 → NAS ✓ 188K
rsync System Audio (output)_2026-05-13_06-17-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-17-25.mp4 → NAS ✓ 190K
rsync System Audio (output)_2026-05-13_06-17-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-17-55.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-18-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-18-25.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-18-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-18-55.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-19-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-19-25.mp4 → NAS ✓ 197K
rsync System Audio (output)_2026-05-13_06-19-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-19-55.mp4 → NAS ✓ 194K
rsync System Audio (output)_2026-05-13_06-20-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-20-25.mp4 → NAS ✓ 195K
rsync MacBook Pro Microphone (input)_2026-05-13_06-20-55.mp4 → NAS ✓ 187K
rsync System Audio (output)_2026-05-13_06-20-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-21-25.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-13_06-21-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-21-55.mp4 → NAS ✓ 190K
rsync System Audio (output)_2026-05-13_06-21-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-22-25.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-13_06-22-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-22-55.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-22-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-23-25.mp4 → NAS ✓ 198K
rsync System Audio (output)_2026-05-13_06-23-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-23-55.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-13_06-23-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-24-25.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-13_06-24-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-24-55.mp4 → NAS ✓ 209K
rsync System Audio (output)_2026-05-13_06-24-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-25-25.mp4 → NAS ✓ 203K
rsync System Audio (output)_2026-05-13_06-25-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-25-55.mp4 → NAS ✓ 210K
rsync System Audio (output)_2026-05-13_06-25-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-26-25.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-13_06-26-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-26-55.mp4 → NAS ✓ 198K
rsync System Audio (output)_2026-05-13_06-26-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-27-25.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-13_06-27-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-27-55.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-13_06-27-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-28-25.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-13_06-28-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-28-55.mp4 → NAS ✓ 213K
rsync System Audio (output)_2026-05-13_06-28-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-29-25.mp4 → NAS ✓ 233K
rsync System Audio (output)_2026-05-13_06-29-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-29-55.mp4 → NAS ✓ 215K
rsync System Audio (output)_2026-05-13_06-29-55.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-30-24.mp4 → NAS ✓ 197K
rsync System Audio (output)_2026-05-13_06-30-25.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-30-54.mp4 → NAS ✓ 195K
rsync System Audio (output)_2026-05-13_06-30-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-31-24.mp4 → NAS ✓ 198K
rsync System Audio (output)_2026-05-13_06-31-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-31-54.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-31-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-32-24.mp4 → NAS ✓ 197K
rsync System Audio (output)_2026-05-13_06-32-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-32-54.mp4 → NAS ✓ 201K
rsync System Audio (output)_2026-05-13_06-32-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-33-24.mp4 → NAS ✓ 199K
rsync System Audio (output)_2026-05-13_06-33-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-33-54.mp4 → NAS ✓ 202K
rsync System Audio (output)_2026-05-13_06-33-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-34-24.mp4 → NAS ✓ 188K
rsync System Audio (output)_2026-05-13_06-34-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-34-54.mp4 → NAS ✓ 184K
rsync System Audio (output)_2026-05-13_06-34-54.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-35-24.mp4 → NAS ✓ 192K
rsync System Audio (output)_2026-05-13_06-35-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-35-54.mp4 → NAS ✓ 200K
rsync System Audio (output)_2026-05-13_06-35-54.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-13_06-36-24.mp4 → NAS ✓ 5.0K
rsync MacBook Pro Microphone (input)_2026-05-13_06-36-24.mp4 → NAS ✓ 189K
rsync MacBook Pro Microphone (input)_2026-05-13_06-36-54.mp4 → NAS ✓ 120K
rsync MacBook Pro Microphone (input)_2026-05-13_06-37-12.mp4 → NAS ✓ 16K
rsync System Audio (output)_2026-05-13_06-36-54.mp4 → NAS ✓ 24K
rsync soundcore AeroClip (input)_2026-05-13_06-37-12.mp4 → NAS ✓ 103K
rsync System Audio (output)_2026-05-13_06-37-23.mp4 → NAS ✓ 191K
rsync soundcore AeroClip (input)_2026-05-13_06-37-44.mp4 → NAS ✓ 64K
rsync System Audio (output)_2026-05-13_06-37-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-38-14.mp4 → NAS ✓ 86K
rsync System Audio (output)_2026-05-13_06-38-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-38-44.mp4 → NAS ✓ 23K
rsync System Audio (output)_2026-05-13_06-38-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-39-14.mp4 → NAS ✓ 86K
rsync System Audio (output)_2026-05-13_06-39-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-39-44.mp4 → NAS ✓ 82K
rsync System Audio (output)_2026-05-13_06-39-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-40-14.mp4 → NAS ✓ 68K
rsync System Audio (output)_2026-05-13_06-40-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-40-44.mp4 → NAS ✓ 68K
rsync System Audio (output)_2026-05-13_06-40-53.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-41-14.mp4 → NAS ✓ 171K
rsync System Audio (output)_2026-05-13_06-41-23.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-41-44.mp4 → NAS ✓ 67K
rsync System Audio (output)_2026-05-13_06-41-53.mp4 → NAS ✓ 16K
rsync soundcore AeroClip (input)_2026-05-13_06-42-14.mp4 → NAS ✓ 14K
rsync System Audio (output)_2026-05-13_06-42-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-42-44.mp4 → NAS ✓ 19K
rsync System Audio (output)_2026-05-13_06-42-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-43-14.mp4 → NAS ✓ 93K
rsync System Audio (output)_2026-05-13_06-43-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-43-44.mp4 → NAS ✓ 157K
rsync System Audio (output)_2026-05-13_06-43-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-44-14.mp4 → NAS ✓ 128K
rsync System Audio (output)_2026-05-13_06-44-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-44-44.mp4 → NAS ✓ 134K
rsync System Audio (output)_2026-05-13_06-44-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-45-14.mp4 → NAS ✓ 113K
rsync System Audio (output)_2026-05-13_06-45-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-45-44.mp4 → NAS ✓ 113K
rsync System Audio (output)_2026-05-13_06-45-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-46-14.mp4 → NAS ✓ 148K
rsync System Audio (output)_2026-05-13_06-46-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-46-44.mp4 → NAS ✓ 75K
rsync System Audio (output)_2026-05-13_06-46-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-47-14.mp4 → NAS ✓ 10K
rsync System Audio (output)_2026-05-13_06-47-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-47-44.mp4 → NAS ✓ 31K
rsync System Audio (output)_2026-05-13_06-47-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-48-14.mp4 → NAS ✓ 16K
rsync System Audio (output)_2026-05-13_06-48-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-48-44.mp4 → NAS ✓ 24K
rsync System Audio (output)_2026-05-13_06-48-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-49-14.mp4 → NAS ✓ 11K
rsync System Audio (output)_2026-05-13_06-49-22.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-13_06-49-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-49-51.mp4 → NAS ✓ 10K
rsync System Audio (output)_2026-05-13_06-50-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-50-23.mp4 → NAS ✓ 69K
rsync System Audio (output)_2026-05-13_06-50-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-50-53.mp4 → NAS ✓ 67K
rsync System Audio (output)_2026-05-13_06-51-22.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-51-23.mp4 → NAS ✓ 23K
rsync System Audio (output)_2026-05-13_06-51-52.mp4 → NAS ✓ 5.0K
rsync soundcore AeroClip (input)_2026-05-13_06-51-54.mp4 → NAS ✓ 63K
rsync MacBook Pro Microphone (input)_2026-05-13_06-52-42.mp4 → NAS ✓ 27K
rsync MacBook Pro Microphone (input)_2026-05-13_06-52-48.mp4 → NAS ✓ 15K
rsync System Audio (output)_2026-05-13_06-52-22.mp4 → NAS ✓ 5.0K
rsync System Audio (output)_2026-05-13_06-52-52.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-52-58.mp4 → NAS ✓ 183K
rsync System Audio (output)_2026-05-13_06-53-22.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-53-30.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-53-52.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-54-00.mp4 → NAS ✓ 191K
rsync System Audio (output)_2026-05-13_06-54-22.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-54-30.mp4 → NAS ✓ 187K
rsync System Audio (output)_2026-05-13_06-54-52.mp4 → NAS ✓ 5.0K
rsync LakyLak bose qc35 II (input)_2026-05-13_06-55-00.mp4 → NAS ✓ 188K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-00-33.mp4 → NAS ✓ 215K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-01-03.mp4 → NAS ✓ 212K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-01-33.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-02-03.mp4 → NAS ✓ 220K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-02-33.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-03-03.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-03-33.mp4 → NAS ✓ 199K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-04-03.mp4 → NAS ✓ 201K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-04-33.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-05-03.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-05-33.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-06-03.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-06-33.mp4 → NAS ✓ 194K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-07-03.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-07-33.mp4 → NAS ✓ 200K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-08-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-08-33.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-09-03.mp4 → NAS ✓ 216K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-09-33.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-10-03.mp4 → NAS ✓ 194K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-10-33.mp4 → NAS ✓ 198K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-11-03.mp4 → NAS ✓ 200K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-11-33.mp4 → NAS ✓ 237K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-12-03.mp4 → NAS ✓ 227K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-12-33.mp4 → NAS ✓ 225K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-13-03.mp4 → NAS ✓ 217K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-13-33.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-14-03.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-14-33.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-15-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-15-33.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-16-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-16-33.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-17-03.mp4 → NAS ✓ 199K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-17-33.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-18-03.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-18-33.mp4 → NAS ✓ 202K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-19-03.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-19-33.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-20-03.mp4 → NAS ✓ 200K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-20-33.mp4 → NAS ✓ 195K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-21-03.mp4 → NAS ✓ 196K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-21-33.mp4 → NAS ✓ 201K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-22-03.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-25-32.mp4 → NAS ✓ 220K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-27-32.mp4 → NAS ✓ 239K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-28-02.mp4 → NAS ✓ 212K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-28-32.mp4 → NAS ✓ 213K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-29-02.mp4 → NAS ✓ 227K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-29-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-30-02.mp4 → NAS ✓ 197K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-30-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-31-02.mp4 → NAS ✓ 213K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-31-32.mp4 → NAS ✓ 204K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-32-02.mp4 → NAS ✓ 203K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-32-32.mp4 → NAS ✓ 211K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-33-02.mp4 → NAS ✓ 206K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-33-32.mp4 → NAS ✓ 211K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-34-02.mp4 → NAS ✓ 208K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-34-32.mp4 → NAS ✓ 210K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-35-02.mp4 → NAS ✓ 208K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-35-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-36-02.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-36-32.mp4 → NAS ✓ 216K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-37-02.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-37-32.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-38-02.mp4 → NAS ✓ 206K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-38-32.mp4 → NAS ✓ 226K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-39-02.mp4 → NAS ✓ 206K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-39-32.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-40-02.mp4 → NAS ✓ 212K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-40-32.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-41-02.mp4 → NAS ✓ 214K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-41-32.mp4 → NAS ✓ 205K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-42-02.mp4 → NAS ✓ 201K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-42-32.mp4 → NAS ✓ 207K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-43-02.mp4 → NAS ✓ 210K
rsync LakyLak bose qc35 II (input)_2026-05-13_07-53-01.mp4 → NAS ✓ 223K
rsync MacBook Pro Microphone (input)_2026-05-13_07-53-37.mp4 → NAS ✓ 206K
rsync MacBook Pro Microphone (input)_2026-05-13_07-54-09.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-13_07-54-39.mp4 → NAS ✓ 203K
rsync MacBook Pro Microphone (input)_2026-05-13_07-55-09.mp4 → NAS ✓ 202K
rsync MacBook Pro Microphone (input)_2026-05-13_07-55-39.mp4 → NAS ✓ 202K
...
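The entries above follow one uniform pattern: each ~30-second ScreenPipe audio chunk (microphone input or system-audio output) is pushed to the NAS as it appears, and one line is logged per transfer. A minimal sketch of a sync loop that would emit log lines in this shape (the `~/.screenpipe/data` source path comes from the capture metadata; the NAS destination, the `sync_chunks` / `log_line` helper names, and the exact log format are assumptions for illustration):

```shell
# Sketch: sync new ScreenPipe audio chunks to the NAS, logging each transfer.
# SRC mirrors the local ScreenPipe data layout; DEST is a hypothetical NAS path.
SRC="$HOME/.screenpipe/data"
DEST="nas:/volume1/screenpipe-archive"

# Format one log entry: "rsync <filename> → NAS ✓ <human-readable size>"
log_line() {
    printf 'rsync %s → NAS ✓ %s\n' "$1" "$2"
}

sync_chunks() {
    # Copy each audio chunk with rsync, then log its name and on-disk size.
    for f in "$SRC"/*.mp4; do
        [ -e "$f" ] || continue
        rsync -a "$f" "$DEST"/ || continue
        size=$(du -h "$f" | cut -f1)
        log_line "$(basename "$f")" "$size"
    done
}
```

With `rsync -a`, a chunk already present on the NAS with the same size and mtime is skipped cheaply, so the loop can be rerun (e.g. from cron) without re-transferring completed files.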